Typing for multi-dimensional arrays #513

Open
@shoyer

I'd like to open a discussion about typing for multi-dimensional arrays in general, and more specifically for NumPy. We have already been discussing this over in the NumPy issue tracker (numpy/numpy#7370) and recently opened a new repository to start writing type stubs (https://github.com/numpy/numpy_stubs).

To help guide discussion, I wrote a document outlining ideas for array shape typing.

To summarize:

  • We would like to be able to type-check both data types (e.g., float64) and shapes (e.g., a 3x4 array) for multi-dimensional arrays.
  • There are many use cases where support for checks using dimension identity would be valuable, e.g., to indicate that a function transforms an array with shape (N, M) to shape (N,) for arbitrary integers N and M. These dimension variables look very similar to TypeVar, if TypeVar supported integers as types.
  • A notion of "zero or more additional dimensions" would also be quite valuable, and is a core part of the type for many NumPy operations (generalized ufuncs). This might be naturally written with Ellipsis, e.g., (..., N) for an array with a last dimension of length N and any number of preceding dimensions. There are particular rules (broadcasting) that should be enforced for matching multiple arguments with variable numbers of dimensions.
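
To make the last two points concrete, here is a rough runtime sketch of the matching a type checker would ideally do statically. It is only an illustration: `match_shape` is a hypothetical helper written for this example, not part of NumPy or `typing`, and it does not attempt to model broadcasting between multiple arguments.

```python
def match_shape(pattern, shape, bindings=None):
    """Hypothetical helper: match a concrete shape against a pattern.

    Pattern entries are ints (fixed sizes) or strings (dimension
    variables such as "N"). A leading Ellipsis allows zero or more
    extra leading dimensions. Returns the updated variable bindings.
    """
    bindings = dict(bindings or {})
    pattern, shape = tuple(pattern), tuple(shape)

    if pattern and pattern[0] is Ellipsis:
        fixed = pattern[1:]
        if len(shape) < len(fixed):
            raise ValueError(f"shape {shape} has too few dimensions")
        # Only the trailing dimensions are checked against the pattern.
        shape = shape[len(shape) - len(fixed):]
    else:
        fixed = pattern
        if len(shape) != len(fixed):
            raise ValueError(f"shape {shape} does not have {len(fixed)} dimensions")

    for dim, size in zip(fixed, shape):
        if isinstance(dim, int):
            if dim != size:
                raise ValueError(f"expected size {dim}, got {size}")
        elif bindings.setdefault(dim, size) != size:
            # The same variable must refer to the same size everywhere.
            raise ValueError(f"{dim} is bound to {bindings[dim]}, got {size}")
    return bindings


# A function mapping shape (N, M) to shape (N,):
env = match_shape(("N", "M"), (3, 4))     # binds N and M from the input
match_shape(("N",), (3,), env)            # output agrees with the bound N

# (..., N): any number of leading dimensions, last dimension of length N
match_shape((..., "N"), (2, 5, 3), env)
```

The shared `bindings` dictionary plays the role a TypeVar-like dimension variable would play in annotations: once N is bound by one argument, every later use has to agree with it.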

This will likely require some new typing features (as well as type-checker support). Notably:
