GH-37050: [Python][Interchange protocol] Add a workaround for empty dataframes by AlenkaF · Pull Request #38037 · apache/arrow

AlenkaF · 2023-10-05T11:19:31Z

Rationale for this change

The implementation of the DataFrame Interchange Protocol does not currently support consuming dataframes with zero chunks (empty dataframes).

What changes are included in this PR?

Add a workaround so that consuming an empty dataframe no longer raises an error.
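To illustrate the kind of failure and workaround described above, here is a minimal toy sketch in plain Python. It is not the pyarrow code from this PR: `ToyFrame`, `consume_strict`, and `consume_with_workaround` are hypothetical names invented for illustration, and the fallback shown (returning an empty table built from the column names) is an assumption about how such a workaround could look, not a description of the actual change.

```python
from typing import Dict, Iterator, List


class ToyFrame:
    """Hypothetical stand-in for an interchange object: column names plus chunks."""

    def __init__(self, column_names: List[str], chunks: List[Dict[str, list]]):
        self.column_names = column_names
        self._chunks = chunks

    def get_chunks(self) -> Iterator[Dict[str, list]]:
        # An empty dataframe yields nothing here, i.e. zero chunks.
        return iter(self._chunks)


def _concat(names: List[str], chunks: List[Dict[str, list]]) -> Dict[str, list]:
    # Concatenate each column's values across all chunks, in order.
    return {name: [v for chunk in chunks for v in chunk[name]] for name in names}


def consume_strict(frame: ToyFrame) -> Dict[str, list]:
    """Consumer without a workaround: requires at least one chunk."""
    chunks = list(frame.get_chunks())
    if not chunks:
        raise ValueError("cannot build a table from zero chunks")
    return _concat(frame.column_names, chunks)


def consume_with_workaround(frame: ToyFrame) -> Dict[str, list]:
    """Consumer with a workaround (assumed shape): zero chunks gives empty columns."""
    chunks = list(frame.get_chunks())
    if not chunks:
        return {name: [] for name in frame.column_names}
    return _concat(frame.column_names, chunks)


empty = ToyFrame(["col1"], [])
result = consume_with_workaround(empty)
```

In this sketch, `consume_strict(empty)` raises, while `consume_with_workaround(empty)` returns a table with the expected column and no rows. Non-empty input behaves the same in both consumers.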

Are these changes tested?

Yes, added test_empty_dataframe in python/pyarrow/tests/interchange/test_conversion.py.

Are there any user-facing changes?

No.

Closes: [Python] Cannot read empty DataFrame Interchange object #37050

jorisvandenbossche

Thanks!

conbench-apache-arrow · 2023-10-12T04:26:50Z

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 9afa848.

There were 21 benchmark results indicating a performance regression:

Commit Run on ursa-i9-9960x at 2023-10-11 07:26:32Z
- tpch (R) with engine=arrow, format=native, language=R, memory_map=False, query_id=TPCH-05, scale_factor=1
- tpch (R) with engine=arrow, format=parquet, language=R, memory_map=False, query_id=TPCH-04, scale_factor=1
and 19 more (see the report linked below)

The full Conbench report has more details.

…mpty dataframes (apache#38037)

### Rationale for this change

The implementation of the DataFrame Interchange Protocol does not currently support consumption of dataframes with 0 number of chunks (empty dataframes).

### What changes are included in this PR?

Add a workaround to not error in this case.

### Are these changes tested?

Yes, added `test_empty_dataframe` in `python/pyarrow/tests/interchange/test_conversion.py`.

### Are there any user-facing changes?

No.

* Closes: apache#37050

Authored-by: AlenkaF <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>

Add a workaround and a test

bc4fef5

github-actions bot added the Component: Python and awaiting review labels Oct 5, 2023

AlenkaF mentioned this pull request Oct 5, 2023

[Python] Cannot read empty DataFrame Interchange object #37050

Closed

AlenkaF added this to the 14.0.0 milestone Oct 5, 2023

AlenkaF requested a review from jorisvandenbossche October 5, 2023 13:42

jorisvandenbossche approved these changes Oct 10, 2023


jorisvandenbossche merged commit 9afa848 into apache:main Oct 10, 2023

jorisvandenbossche removed the awaiting review label Oct 10, 2023

github-actions bot added the awaiting merge label Oct 10, 2023

AlenkaF deleted the gh-37050-empty-object-workaround branch October 10, 2023 11:20

anjakefala added the Critical Fix label (bugfixes for security vulnerabilities, crashes, or invalid data) Nov 14, 2023
