Update PyArrow conversion and arrow/parquet tests for pyarrow 19.0 #60716

Merged: 10 commits, Jan 22, 2025

jorisvandenbossche
Member

With the upcoming pyarrow 19.0, pyarrow should start returning the future default string dtype itself when converting to pandas, which will require some (test) changes on our side.
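
For illustration, a minimal sketch of the conversion in question. It assumes pandas >= 2.1 (for the `future.infer_string` option) and pyarrow are installed; the helper name `converted_string_dtype` is hypothetical and not part of either library. It reports whichever dtype the installed versions produce for a string column rather than asserting a specific one.

```python
def converted_string_dtype():
    """Return the pandas dtype that Table.to_pandas() gives a string column.

    Hypothetical helper for illustration only; returns None when pandas
    or pyarrow is not installed.
    """
    try:
        import pandas as pd
        import pyarrow as pa
    except ImportError:
        return None

    # A small Arrow table with one string column, including a null.
    table = pa.table({"col": ["a", "b", None]})

    # Opt in to the future default string dtype for the duration of the call.
    with pd.option_context("future.infer_string", True):
        df = table.to_pandas()

    return str(df["col"].dtype)


if __name__ == "__main__":
    print(converted_string_dtype())
```

Running this under pyarrow 18 and then pyarrow 19 is one way to see whether the resulting dtype differs between the two versions.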

@jorisvandenbossche jorisvandenbossche added this to the 2.3 milestone Jan 13, 2025
@jorisvandenbossche jorisvandenbossche added Compat pandas objects compatability with Numpy or Python functions IO Parquet parquet, feather Arrow pyarrow functionality labels Jan 13, 2025
@WillAyd
Member

WillAyd commented Jan 13, 2025

Thanks for tackling this. Maybe we should also have a note in our string guide about preferring PyArrow > 19 to ensure better type preservation?

@mroeschke mroeschke marked this pull request as ready for review January 22, 2025 02:06
@mroeschke mroeschke self-requested a review as a code owner January 22, 2025 02:06
@mroeschke
Member

Merging to get the CI to green. Happy to have the doc note about PyArrow interop in a follow-up PR.

@mroeschke mroeschke merged commit 5efac82 into pandas-dev:main Jan 22, 2025
54 checks passed

lumberbot-app bot commented Jan 22, 2025

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict; please backport manually. Here are approximate instructions:

  1. Check out the backport branch and update it:
git checkout 2.3.x
git pull
  2. Cherry-pick the first parent branch of this PR on top of the older branch:
git cherry-pick -x -m1 5efac8250787414ec580f0472e2b563032ec7d53
  3. You will likely have some merge/cherry-pick conflicts here; fix them and commit:
git commit -am 'Backport PR #60716: Update PyArrow conversion and arrow/parquet tests for pyarrow 19.0'
  4. Push to a named branch:
git push YOURFORK 2.3.x:auto-backport-of-pr-60716-on-2.3.x
  5. Create a PR against branch 2.3.x. I would have named this PR:

"Backport PR #60716 on branch 2.3.x (Update PyArrow conversion and arrow/parquet tests for pyarrow 19.0)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@jorisvandenbossche jorisvandenbossche deleted the string-dtype-pyarrow-19 branch January 22, 2025 09:55
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this pull request Jan 22, 2025
…andas-dev#60716)

* Update PyArrow conversion and arrow/parquet tests for pyarrow 19.0

* update pypi index

* extra filterwarnings

* more test updates

* temp enable infer_string option

* Adapt test_get_handle_pyarrow_compat for pyarrow 19

* Use pa_version_under19p0 in test_get_handle_pyarrow_compat

* Adjust test_string_inference for using_infer_string

* Fix test_string_inference for feather

---------

Co-authored-by: Matthew Roeschke <[email protected]>
(cherry picked from commit 5efac82)
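
The `pa_version_under19p0` name in the commit list above suggests a boolean version gate. A rough sketch of how such a flag could be derived with only the standard library follows; `release_tuple` and `version_under` are hypothetical helpers, not pandas' actual implementation, and the parsing deliberately ignores pre-release suffixes.

```python
import re


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release segment of a version string.

    For example, '19.0.0.dev1' yields (19, 0, 0); the '.dev1' suffix is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def version_under(installed: str, threshold: str) -> bool:
    """Return True if the installed release is strictly below the threshold."""
    installed_parts = release_tuple(installed)
    threshold_parts = release_tuple(threshold)
    # Pad the shorter tuple with zeros so that '19.0' and '19.0.0' compare equal.
    width = max(len(installed_parts), len(threshold_parts))
    installed_parts += (0,) * (width - len(installed_parts))
    threshold_parts += (0,) * (width - len(threshold_parts))
    return installed_parts < threshold_parts


if __name__ == "__main__":
    # Illustrative version strings, not read from an installed pyarrow.
    for candidate in ("18.1.0", "19.0.0", "19.0.0.dev1"):
        print(candidate, version_under(candidate, "19.0"))
```

A test could then branch on such a flag to pick the expected dtype for each pyarrow version.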
jorisvandenbossche added a commit that referenced this pull request Jan 22, 2025
…r pyarrow 19.0 (#60716) (#60755)

Co-authored-by: Matthew Roeschke <[email protected]>
(cherry picked from commit 5efac82)

* fixup
* don't hardcode object dtype
* also enable CoW when enabling future.infer_string
@jorisvandenbossche
Member Author

@mroeschke thanks for finishing this!

Manual backport -> #60755

asharmalik19 pushed a commit to asharmalik19/pandas that referenced this pull request Jan 22, 2025