Labels: topic-arrays (related to flexible array support)
What is your issue?
During the SciPy sprints this year I had a quick look at which parts of `xarray` already work with `marray`, via `xarray`'s array API support. Below is what I have tried so far.
Preamble for the examples below:
import xarray as xr
import marray
import numpy as np
xr.set_options(display_expand_data=False)
rng = np.random.default_rng()
What works already:
- creating xarray objects containing masked arrays:
mnp = marray.masked_namespace(np)
data = rng.normal(size=(2000, 1000)) * 2 - 1
masked = mnp.asarray(data, mask=np.abs(data) < 0.5)
arr = xr.DataArray(
masked,
dims=("time", "x"),
coords={"time": xr.date_range("2025-07-07 08:00:00", freq="6h", periods=2000), "x": np.arange(1000)},
)
- aggregation (without nan-skipping; `marray` does that by default but doesn't implement `__array_function__` for `nan*`): `arr.mean(dim="x", skipna=False)`
- subsetting: `arr.sel(time="2025-07-09")`
- reindexing: `arr.reindex(x=np.arange(-5, 15), fill_value=mnp.asarray(0, mask=True))` (this wraps the data using `marray`, so won't work with `dask`. It does allow converting non-`marray` to `marray` data, though)
- where: `arr.where(xr.ufuncs.abs(arr) > 2.4, mnp.asarray(0, mask=True))`
- groupby aggregations: `arr.groupby("time.day").mean(skipna=False)`
- stack: `arr.stack(z=("time", "x"))`
- roll: `arr.roll({"time": 3})`
What does not work yet:
- sortby: `arr.sortby("time")` (fails with a "cannot pickle module object")
- pad / shift: `arr.pad({"x": 2}, constant_values=mnp.asarray(0, mask=True))` (`pad` is not part of the array API, yet)
- isnull: `arr.isnull()` (it's not quite clear whether the name refers to `nan` / `nat` / `None`, or to truly missing values)
- na-filling methods like `ffill` / `bfill` / `fillna` and `interpolate_na` (we might need a `masked` accessor for that if we don't want to special-case `marray`)
- string methods (only works on numpy arrays? But since the vlen string dtype in numpy 2 supports missing values it might not be needed?)
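To make the `isnull` ambiguity concrete, here is a plain-NumPy stand-in with a separate data array and boolean mask. It is not how xarray or marray implement anything, and the names `nan_null`, `masked_null` and `either_null` are hypothetical; it only shows that "null" can mean three different things once a mask exists:

```python
import numpy as np

# A value array that contains a NaN, plus an independent mask.
# True in the mask marks an element as truly missing.
data = np.array([1.0, np.nan, 3.0, 4.0])
mask = np.array([False, False, True, False])

nan_null = np.isnan(data)          # "null" in the NaN sense only
masked_null = mask                 # "null" in the masked (missing) sense only
either_null = nan_null | mask      # "null" if either applies

print(nan_null, masked_null, either_null)
```

Which of these `isnull` should return for masked data is exactly the open question.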
To make a lot of these a bit less of a mouthful we could make the default placeholder for missing values (`NA`) be aware of `marray` and represent a masked 0d marray, or add a `missing` global to `marray`'s namespace such that we could pass e.g. `mnp.missing` to all places where `fill_value` crops up.