-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(typing): add np.ndarray
aliases
#1977
Conversation
Will close #1972
Noticed a few cases that use the original guard and combined with an `ndim == ...` Now we get that in one call with type narrowing
Tried to resolve a variance issue in `PandasLikeDataFrame.to_numpy`, but abandoned for now
- Sub-package has a huge amount of unrelated warnings - Tried to fix *only* those that are related to `np.ndarray`
narwhals/_arrow/dataframe.py
Outdated
import pandas as pd | ||
import polars as pl | ||
from pyarrow._stubs_typing import Indices # type: ignore # noqa: PGH003 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import is refering to https://github.com/zen-xu/pyarrow-stubs/blob/d97063876720e6a5edda7eb15f4efe07c31b8296/pyarrow-stubs/_stubs_typing.pyi#L27
That type is causing a lot of the headaches with pyarrow
:
Indices: TypeAlias = list[int] | NDArray[np.integer] | IntegerArray
Where we are using Sequence[int]
it causes a conflict, but AFAICT list[int]
is overly narrow on their end
@@ -207,28 +217,30 @@ def __getitem__( | |||
isinstance(item, tuple) | |||
and len(item) == 2 | |||
and is_sequence_but_not_str(item[1]) | |||
and not isinstance(item[0], str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggled to follow the logic here, but this seems like it should've been there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should already have been validated at the narwhals/dataframe.py
level - but sure, doesn't hurt to be more explicit about what we're dealing with when we get here π
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoGorelli I really think all these CompliantDataFrame.__getitem__
methods would benefit from some reusable code (per #1942)
def is_numpy_array_1d(arr: Any) -> TypeIs[_1DArray]: | ||
"""Check whether `arr` is a 1D NumPy Array without importing NumPy.""" | ||
return is_numpy_array(arr) and arr.ndim == 1 | ||
|
||
|
||
def is_numpy_array_2d(arr: Any) -> TypeIs[_2DArray]: | ||
"""Check whether `arr` is a 2D NumPy Array without importing NumPy.""" | ||
return is_numpy_array(arr) and arr.ndim == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suffixing the current guard name seemed reasonable, but it is a bit odd having:
array_1d
-> _1DArray
`pre-commit` inconsistent with local `mypy` https://results.pre-commit.ci/run/github/760058710/1739108468.gewbSXmfRu2T2Ml-tFh6BQ
That'll teach me for simply following what `mypy` asked me to do π« https://results.pre-commit.ci/run/github/760058710/1739106763.KUYdGvWdRC6nL6ED4bqzFQ https://results.pre-commit.ci/run/github/760058710/1739108788.r92OEb-RTUejKTFBisjIfg
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @dangotbanned ! this 1DArray / 2DArray is pretty nice, I think it would be useful in pandas-stubs too, i think i'll suggest it there
@@ -207,28 +217,30 @@ def __getitem__( | |||
isinstance(item, tuple) | |||
and len(item) == 2 | |||
and is_sequence_but_not_str(item[1]) | |||
and not isinstance(item[0], str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should already have been validated at the narwhals/dataframe.py
level - but sure, doesn't hurt to be more explicit about what we're dealing with when we get here π
@@ -819,7 +820,7 @@ def calculate_timestamp_date(s: pd.Series, time_unit: str) -> pd.Series: | |||
|
|||
def select_columns_by_name( | |||
df: T, | |||
column_names: Sequence[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
β€οΈ yes I needed this too
Thanks @MarcoGorelli! I can't take all the credit there, I'm only simplifying parts of
AFAICT, we only care about |
Closes #1972
What type of PR is this? (check all applicable)
Related issues
np.ndarray
alias(es)Β #1972Checklist
If you have comments or can explain your changes, please do so below
Resolves warnings that look like this:
On
main
, this accounts for 25 warnings: