(fix): extension array indexers #9671
base: main
…ore/variable.py to use any-precision datetime/timedelta with automatic inference of resolution
…ocessing, raise now early
…t resolution, fix code and tests to allow this
for more information, see https://pre-commit.ci
… more carefully, for now using pd.Series to convert `OMm` type datetimes/timedeltas (will result in ns precision)
…rray` series creating an extension array when `.array` is accessed
Co-authored-by: Spencer Clark <[email protected]>
* Fix timedelta encoding overflow issue; always decode to ns resolution
* Implement time_unit for decode_cf_timedelta
* Reduce diff
Co-authored-by: Kai Mühlbauer <[email protected]>
@ilan-gold Can you rebase your changes on latest main? PR #9618 just got merged.
I think I did. The tests look good but I'll have to re-check tomorrow. It was just mypy failing before, but now I picked up an actual test failure.
Yes, looks clean. I was irritated by the number of commits, but this will be squashed anyway, or am I missing something? The error `ERROR xarray/tests/test_distributed.py::test_dask_distributed_zarr_integration_test[True-True] - Failed: 9 thread(s) were leaked from test` seems unrelated, and it also occurs on main now. Not sure how to debug this...
Great @kmuehlbauer - I want the maintainers to look at the mypy failures. I could in theory fix them, but I would basically be guessing at what their wishes are for the classes' return types.
) -> np.ndarray:
    if dtype is None:
        dtype = self.dtype
    if pd.api.types.is_extension_array_dtype(dtype):
Is this needed? Why would someone call np.array with an extension dtype, and then expect it to get translated to a numpy dtype?
This is for internal usage, otherwise I wouldn't have added it. I can delete the line and then see what happens, and then comment.
@dcherian This class is basically an internal adapter, so anything that asks for its data in numpy form will call this. Things like repr, subtraction, and calling .values on an xarray object are a few examples.
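To make the exchange above concrete, here is a minimal sketch of the branch under discussion. `as_numpy` is a hypothetical stand-in for the adapter's `__array__` method, not the code in this PR; it only shows why a default dtype taken from an extension array has to be handed to pandas rather than to numpy.

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["a", "b", "a"])

# The array's own dtype is a pandas extension dtype, which numpy cannot interpret.
assert pd.api.types.is_extension_array_dtype(cat.dtype)


def as_numpy(array, dtype=None):
    """Hypothetical stand-in for the adapter's __array__ logic (an assumption)."""
    if dtype is None:
        # Defaulting to the array's own dtype is how an extension dtype gets here
        # even though no caller asked for one explicitly.
        dtype = array.dtype
    if pd.api.types.is_extension_array_dtype(dtype):
        # Let pandas pick a numpy-compatible dtype for the conversion.
        return array.to_numpy()
    return np.asarray(array, dtype=dtype)


values = as_numpy(cat)
floats = as_numpy(np.array([1, 2]), dtype=np.dtype("float64"))
```

In this sketch the extension dtype reaches the check through the `dtype is None` default, which is the internal path described in the reply above (repr, arithmetic, `.values`).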
) -> np.ndarray:
    if dtype is None:
        dtype = self.dtype
    if pd.api.types.is_extension_array_dtype(dtype):
Same here. Why is this needed?
@@ -6875,7 +6875,7 @@ def groupby(
 [[nan, nan, nan],
 [ 3., 4., 5.]]])
 Coordinates:
- * x_bins (x_bins) object 16B (5, 15] (15, 25]
+ * x_bins (x_bins) interval[int64, right] 16B (5, 15] (15, 25]
This is amazing, it enables IntervalIndex indexing now.
cc @benbovy
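For readers unfamiliar with interval dtypes, the following sketch shows the pandas-level behaviour that an `interval[int64, right]` coordinate preserves and an `object` coordinate loses. It uses pandas only and assumes the same bin edges as the docstring output above; how xarray exposes this through its own indexing API is not shown here.

```python
import pandas as pd

# Right-closed bins (5, 15] and (15, 25], matching the x_bins coordinate above.
bins = pd.IntervalIndex.from_breaks([5, 15, 25])

# The index keeps a real interval dtype rather than falling back to object.
assert isinstance(bins.dtype, pd.IntervalDtype)
assert bins.closed == "right"

# Scalar lookups resolve to the position of the interval containing the value.
first = bins.get_loc(10)   # falls inside (5, 15]
edge = bins.get_loc(15)    # right edge is included in (5, 15]
second = bins.get_loc(20)  # falls inside (15, 25]
```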
@Illviljan or @headtr1ck can you take a look at the typing failure please
@@ -17,6 +17,7 @@
 )

 import numpy as np
+import pandas as pd
NamedArray is supposed to not have a dependency on pandas.
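One generic way to honour a no-pandas constraint is to rely on duck typing instead of a module-level import. The sketch below is only an illustration of that idea: `to_numpy_without_pandas` and `FakeExtensionArray` are hypothetical names, and nothing here is taken from the PR or from NamedArray itself.

```python
import numpy as np


def to_numpy_without_pandas(data):
    """Hypothetical helper: convert to numpy without importing pandas."""
    # pandas extension arrays expose ``to_numpy``; probing for the method
    # avoids having to import pandas just to inspect the dtype.
    to_numpy = getattr(data, "to_numpy", None)
    if callable(to_numpy):
        return to_numpy()
    return np.asarray(data)


class FakeExtensionArray:
    """Stand-in object used only to exercise the duck-typed branch."""

    def to_numpy(self):
        return np.array([1, 2, 3])


from_duck = to_numpy_without_pandas(FakeExtensionArray())
from_list = to_numpy_without_pandas([4, 5])
```

The trade-off is that any object with a `to_numpy` method takes the first branch, so this is looser than an explicit type check.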
if pd.api.types.is_extension_array_dtype(data_old.dtype):
    # One of PandasExtensionArray or PandasIndexingAdapter?
    ndata = data_old.array.to_numpy()
pd.api.types.is_extension_array_dtype(data_old.dtype) does not imply that data_old is an extension array. You should probably use some kind of isinstance check before using .array.
I haven't used extension arrays that much myself: why can't a simple np.asarray(data_old) be used?
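The distinction raised in this comment can be sketched with plain pandas objects. A `pd.Series` with a categorical dtype passes the dtype check without being an `ExtensionArray` itself; `to_ndarray` below is a hypothetical helper showing the suggested `isinstance` guard, not the change made in the PR.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray

series = pd.Series(["a", "b"], dtype="category")

# The dtype check passes, yet the container is a Series, not an ExtensionArray.
assert pd.api.types.is_extension_array_dtype(series.dtype)
assert not isinstance(series, ExtensionArray)
# The wrapped array, reached through ``.array``, is the actual extension array.
assert isinstance(series.array, ExtensionArray)


def to_ndarray(data):
    """Hypothetical helper: guard on the object's type, not on its dtype."""
    if isinstance(data, ExtensionArray):
        return data.to_numpy()
    # np.asarray also handles Series and plain sequences.
    return np.asarray(data)


from_array = to_ndarray(series.array)
from_series = to_ndarray(series)
```

For this categorical example both paths yield the same object-dtype array, which is consistent with the reviewer's question about whether `np.asarray` alone would suffice; whether that holds for every extension dtype is not established here.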
Identical to kmuehlbauer#1 - probably not very helpful in terms of changes since https://github.com/kmuehlbauer/xarray/tree/any-time-resolution-2 contains most of it....
whats-new.rst
api.rst