Description
At SciPy, I had the realization that it would be relatively straightforward to make element-wise math between xarray objects lazy. This would let us support lazy coordinate arrays, a feature that has quite a few use-cases, e.g., for both geoscience and astronomy.
The trick would be to write a lazy array class that holds an element-wise vectorized function and passes indexers on to its arguments. I haven't thought too hard about this yet for vectorized indexing, but it could be quite efficient for outer indexing. I have some prototype code but no tests yet.
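To make the idea concrete, here is a minimal sketch of such a class. It is not the prototype mentioned above: the name `LazyElementwiseArray` and the helpers `_expand_key` and `_key_for_arg` are hypothetical, and it assumes NumPy-backed arguments and basic indexing with integers and slices only (no `Ellipsis`, no array indexers, so no vectorized indexing). Indexing returns another lazy object, and the function is only evaluated when the result is converted with `np.asarray`.

```python
import numpy as np


def _expand_key(key, ndim):
    """Normalize an indexer to a tuple of ints/slices with one entry per dimension."""
    if not isinstance(key, tuple):
        key = (key,)
    if len(key) > ndim:
        raise IndexError("too many indices")
    for k in key:
        if not isinstance(k, (int, np.integer, slice)):
            raise TypeError("this sketch only supports integers and slices")
    return key + (slice(None),) * (ndim - len(key))


def _key_for_arg(key, arg_shape, out_shape):
    """Translate an indexer for the broadcast result into one for a single argument."""
    offset = len(out_shape) - len(arg_shape)  # leading dims the argument lacks
    new_key = []
    for k, size, out_size in zip(key[offset:], arg_shape, out_shape[offset:]):
        if size == 1 and out_size != 1:
            # Broadcast dimension: the argument has a single element along it.
            new_key.append(slice(None) if isinstance(k, slice) else 0)
        else:
            new_key.append(k)
    return tuple(new_key)


class LazyElementwiseArray:
    """Hypothetical lazy array holding an element-wise function and its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = tuple(
            a if isinstance(a, LazyElementwiseArray) else np.asarray(a) for a in args
        )
        self.shape = np.broadcast_shapes(*(a.shape for a in self.args))

    def __getitem__(self, key):
        # Pass the indexer on to each argument instead of computing first.
        key = _expand_key(key, len(self.shape))
        indexed = [a[_key_for_arg(key, a.shape, self.shape)] for a in self.args]
        return LazyElementwiseArray(self.func, *indexed)

    def __array__(self, dtype=None, copy=None):
        # The function is evaluated only here, on the (already indexed) arguments.
        result = self.func(*(np.asarray(a) for a in self.args))
        return np.asarray(result, dtype=dtype)


x = np.arange(4.0)            # shape (4,)
y = np.arange(3.0)[:, None]   # shape (3, 1)
lazy = LazyElementwiseArray(np.add, x, y)
subset = lazy[1:, ::2]        # still lazy; np.add has not been called yet
values = np.asarray(subset)   # computes only the selected 2 x 2 block
```

Because basic indexing on NumPy arrays returns views, slicing the lazy object is cheap, and the element-wise function only ever sees the selected elements.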
The question is how to hook this into xarray operations. In particular, supposing that the inputs to a function do not hold dask arrays:
- Should we try to make every element-wise operation with vectorized functions (ufuncs) lazy by default? This might have negative performance implications and would be a little tricky to implement with xarray's current code, since we still implement binary operations like `+` with separate logic from `apply_ufunc`.
- Should we make every element-wise operation that explicitly uses `apply_ufunc()` lazy by default?
- Or should we only make element-wise operations lazy with `apply_ufunc()` if you use some special flag, e.g., `apply_ufunc(..., lazy=True)`?
I am leaning towards the last option for now but would welcome other opinions.