[Python] Cannot read empty DataFrame Interchange object #37050

@stinodego

Describe the bug, including details regarding any error messages, version, and platform.

Creating an empty table, converting to the interchange format, then reading it back, gives an error:

import pyarrow as pa
import pyarrow.interchange

df = pa.table([[]], names=['col1'])
dfi = df.__dataframe__()
pa.interchange.from_dataframe(dfi)
# ValueError: Must pass schema, or at least one RecordBatch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/stijn/code/polars/py-polars/.venv/lib/python3.11/site-packages/pyarrow/interchange/from_dataframe.py", line 86, in from_dataframe
    return _from_dataframe(df.__dataframe__(allow_copy=allow_copy),
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stijn/code/polars/py-polars/.venv/lib/python3.11/site-packages/pyarrow/interchange/from_dataframe.py", line 112, in _from_dataframe
    return pa.Table.from_batches(batches)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/table.pxi", line 3972, in pyarrow.lib.Table.from_batches
ValueError: Must pass schema, or at least one RecordBatch

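I believe the cause is that dfi.num_chunks() returns 0 when it should return 1 (a single, empty chunk). With no chunks, _from_dataframe collects an empty list of batches, so pa.Table.from_batches(batches) has neither a schema nor a RecordBatch to infer one from.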

Component(s)

Python
