GH-36845: [C++][Python] Allow type promotion on pa.concat_tables#36846
jorisvandenbossche merged 93 commits into apache:main
jorisvandenbossche left a comment
Thanks a lot for the updates!
Added a few small comments (I can also push if needed to ensure we get this in).
It might be good to add a small test to ensure the promote keyword still works.
python/pyarrow/table.pxi
Outdated
  Concatenate pyarrow.Table objects.

- If promote==False, a zero-copy concatenation will be performed. The schemas
+ If promote_options=="none", a zero-copy concatenation will be performed. The schemas
Suggested change:
- If promote_options=="none", a zero-copy concatenation will be performed. The schemas
+ If promote_options="none", a zero-copy concatenation will be performed. The schemas
python/pyarrow/table.pxi
Outdated
  first table.

- If promote==True, any null type arrays will be casted to the type of other
+ If promote_options=="default", any null type arrays will be casted to the type of other
Suggested change:
- If promote_options=="default", any null type arrays will be casted to the type of other
+ If promote_options="default", any null type arrays will be casted to the type of other
python/pyarrow/table.pxi
Outdated
if "promote" in kwargs:
    warnings.warn(
        "promote has been superseded by mode='default'.", FutureWarning)
Suggested change:
- "promote has been superseded by mode='default'.", FutureWarning)
+ "promote has been superseded by mode='default'.", FutureWarning, stacklevel=2)
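To illustrate what the suggested `stacklevel=2` changes, here is a minimal standalone sketch using only the standard library. The function names (`warn_default`, `warn_at_caller`, `user_code`) and the warning text are hypothetical and not taken from pyarrow; the point is only that `stacklevel=2` attributes the warning to the calling line rather than to the line inside the library helper.

```python
import warnings


def warn_default():
    # stacklevel=1 (the default): the warning is attributed to this line.
    warnings.warn("promote is deprecated", FutureWarning)


def warn_at_caller():
    # stacklevel=2: the warning is attributed to the line that called us.
    warnings.warn("promote is deprecated", FutureWarning, stacklevel=2)


def user_code():
    # Stands in for user code that still passes the deprecated keyword.
    warn_default()
    warn_at_caller()


# Record the warnings instead of printing them, so we can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    user_code()

# Line numbers the two warnings were attributed to.
inside_helper, at_call_site = (w.lineno for w in caught)
```

With the default, the reported location is inside the helper, which tells the user nothing about where their own code needs updating; with `stacklevel=2` it is the call inside `user_code`.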
python/pyarrow/types.pxi
Outdated
Default; null and only null can be unified with another type.
Permissive; promotes types to the greater common denominator.
Suggested change:
- Default; null and only null can be unified with another type.
- Permissive; promotes types to the greater common denominator.
+ Default: null and only null can be unified with another type.
+ Permissive: types are promoted to the greater common denominator.
python/pyarrow/table.pxi
Outdated
warnings.warn(
    "promote has been superseded by mode='default'.", FutureWarning)
if kwargs['promote'] is True:
    promote_options = "permissive"
Suggested change:
- promote_options = "permissive"
+ promote_options = "default"
Isn't that needed to preserve the current default behaviour?
I thought we agreed on promoting when promote=True, but this also works for me.
Yeah, I mentioned that I would find promote=True a better default, but now that we have deprecated it in favor of promote_options, we don't need to change the current behaviour, as it will be removed anyway. And users can now replace it with promote_options="permissive" if they prefer that.
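The mapping agreed on in this thread can be sketched as a small standalone helper. This is only an illustration under assumptions: `resolve_promote_options` is a hypothetical name, the warning text is my own wording, and the real logic lives inline in `concat_tables` in `table.pxi`. It mirrors the discussed behaviour, where `promote=True` keeps its old meaning by mapping to `"default"` rather than `"permissive"`.

```python
import warnings


def resolve_promote_options(promote_options="none", **kwargs):
    """Hypothetical helper: map the deprecated promote flag to promote_options."""
    if "promote" in kwargs:
        warnings.warn(
            "promote has been superseded by promote_options='default'.",
            FutureWarning, stacklevel=2)
        # Preserve the pre-deprecation behaviour: promote=True only ever
        # unified null-typed columns, which is what "default" does.
        if kwargs["promote"] is True:
            promote_options = "default"
    return promote_options


# Record warnings so the deprecated calls below stay quiet in this sketch.
with warnings.catch_warnings(record=True) as deprecation_warnings:
    warnings.simplefilter("always")
    legacy_true = resolve_promote_options(promote=True)
    legacy_false = resolve_promote_options(promote=False)

# The new keyword passes through unchanged and emits no warning.
explicit = resolve_promote_options(promote_options="permissive")
```

Users who want the wider promotion rules opt in explicitly with `promote_options="permissive"`, so the deprecated keyword never changes meaning before it is removed.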
python/pyarrow/tests/test_table.py
Outdated
t2 = pa.Table.from_arrays(
    [pa.array([1.0, 2.0], type=pa.float32())], ["float_field"])

result = pa.concat_tables([t1, t2], promote=True)
Suggested change:
- result = pa.concat_tables([t1, t2], promote=True)
+ with pytest.warns(FutureWarning):
+     result = pa.concat_tables([t1, t2], promote=True)
This asserts that the warning is raised and, at the same time, ensures we don't unnecessarily see the warning in the pytest logs.
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 5f57219. There were 2 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 8 possible false positives for unstable benchmarks that are known to sometimes produce them.
FYI, those reported performance regressions were just flakes. The timings are still stable at the same level for later commits.
Revival of #12000
Rationale for this change
It would be great to be able to do promotions when concat'ing a table, such as:
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?