Adding the design doc covering the tensor types #316

Open
wants to merge 2 commits into main

Conversation

tannergooding (Member)

No description provided.

Comment on lines +64 to +74
// IEquatable

bool Equals(TSelf other);

// IEqualityOperators

static bool operator ==(TSelf left, TSelf right);
static bool operator ==(TSelf left, T right);

static bool operator !=(TSelf left, TSelf right);
static bool operator !=(TSelf left, T right);
Member

How is equality defined? Rank, strides, and all elements are equal?

Member Author

Yes. It would then fail for something like comparing a 4x1 vs a 1x4, since the first is four rows while the second is four columns. A user would need to reshape one of them to compare.

Member

> It would then fail for something like comparing a 4x1 vs a 1x4

Fail as in throwing an exception? Would that carry over to the IEquatable<T> implementation or the Equals(object) method? Convention for such methods is to return false for comparands whose type or shape doesn't match.

Member Author

Fail would mean returning false, as required by IEquatable and as is typical for equality implementations.
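
For illustration, here is a minimal sketch of those semantics over a hypothetical flat-backed tensor (NaiveTensor is made up and not part of the proposal): a rank/shape mismatch simply compares unequal rather than throwing, and only matching shapes go on to compare elements.

```csharp
public readonly struct NaiveTensor<T> : IEquatable<NaiveTensor<T>>
    where T : IEquatable<T>
{
    private readonly T[] _data;       // contiguous backing store
    private readonly nint[] _lengths; // length of each dimension

    public NaiveTensor(T[] data, nint[] lengths) => (_data, _lengths) = (data, lengths);

    public bool Equals(NaiveTensor<T> other)
    {
        // A different rank or different per-dimension lengths (e.g. 4x1 vs 1x4)
        // compares unequal; callers reshape first if they want an element-wise comparison.
        if (!_lengths.AsSpan().SequenceEqual(other._lengths))
            return false;

        return _data.AsSpan().SequenceEqual(other._data);
    }

    public override bool Equals(object? obj) => obj is NaiveTensor<T> other && Equals(other);
    public override int GetHashCode() => HashCode.Combine(_lengths.Length, _data.Length);
}
```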

Comment on lines +38 to +40
static TSelf Empty { get; }

bool IsEmpty { get; }
Member

How is empty defined? If I have a two-dimensional tensor where the length of both dimensions is 0, I assume that's empty? Do we need an empty singleton for different numbers of dimensions, or is whatever shape Empty returns fine?

Member Author

> How is empty defined?

All lengths are 0, just like for arrays.

> Do we need an empty singleton for different numbers of dimensions, or is whatever shape Empty returns fine?

For empty in particular, I don't think we need one per rank. Rather, it can be a special marker value that is treated as compatible with other sizes, much as a scalar is implicitly broadcast to every element for many of the operations.
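
A tiny sketch of that definition (the helper name is made up, not a proposed API): a tensor is empty when every length is 0.

```csharp
static class TensorEmptyExample
{
    // Hypothetical helper illustrating the definition above; not a proposed API.
    public static bool ComputeIsEmpty(ReadOnlySpan<nint> lengths)
    {
        foreach (nint length in lengths)
        {
            if (length != 0)
            {
                return false; // at least one dimension still has a nonzero length
            }
        }

        return true; // all lengths are 0 => empty, just like a zero-length array
    }
}
```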

static TSelf Empty { get; }

bool IsEmpty { get; }
bool IsPinned { get; }
Member

I assume the array-backed instances we create will not be pinned by default?

Member Author

Correct. We would treat it just like a normal new T[] unless the user explicitly opts into pinning.
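
As a sketch of what that could look like for the array-backed case (the helper name is made up; GC.AllocateArray is the existing runtime API for pinned-object-heap allocation):

```csharp
static class TensorAllocationExample
{
    // Illustrative only: how an array-backed Tensor<T> allocation might honor a pin request.
    public static T[] AllocateBackingStore<T>(nint flattenedLength, bool mustPin)
    {
        int length = checked((int)flattenedLength);

        return mustPin
            ? GC.AllocateArray<T>(length, pinned: true) // pinned object heap; the address never moves
            : new T[length];                            // default: a normal, movable array
    }
}
```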

Comment on lines +47 to +48
static implicit operator TensorSpan<T>(TSelf value);
static implicit operator TensorReadOnlySpan<T>(TSelf value);
Member

How do I use any of this with e.g. TensorPrimitives APIs that take ReadOnlySpan<T> and Span<T>? I don't see any conversions from Tensor{ReadOnly}Span<T> to {ReadOnly}Span<T>.

Member Author

Converting from TensorSpan to Span is "unsafe" due to the loss of rank/strides and the potential for the total length to overflow what a single Span<T> can represent.

Internally, we'd use the MemoryMarshal.CreateSpan APIs to handle it in appropriately sized chunks.
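
A rough sketch of that chunking approach, using only existing APIs (MemoryMarshal.CreateSpan, Unsafe.Add, and the current span-based TensorPrimitives.Abs); here `reference` and `totalLength` stand in for the tensor's contiguous backing data:

```csharp
using System;
using System.Numerics.Tensors;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

static class FlattenExample
{
    // Walks a contiguous buffer that may exceed int.MaxValue elements in
    // int-sized windows so span-based APIs can be applied to each window.
    public static void AbsInPlace(ref float reference, nint totalLength)
    {
        nint offset = 0;

        while (offset < totalLength)
        {
            nint remaining = totalLength - offset;
            int chunk = remaining > int.MaxValue ? int.MaxValue : (int)remaining;

            Span<float> window = MemoryMarshal.CreateSpan(ref Unsafe.Add(ref reference, offset), chunk);
            TensorPrimitives.Abs(window, window); // the destination may be the same as the source

            offset += chunk;
        }
    }
}
```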

Comment on lines +47 to +48
static implicit operator TensorSpan<T>(TSelf value);
static implicit operator TensorReadOnlySpan<T>(TSelf value);
Member

This implies that any ITensor implementation would need to always store its data contiguously? Or would an implementation dynamically change to use a contiguous implementation if it wasn't already when one of these methods was used?

Member Author

Yes, it is currently assumed to always be contiguous. A user who wants non-contiguous storage should use a Tensor<T>[] instead (functionally like a jagged array). This makes it clearer when storage is contiguous vs non-contiguous, better fits how some other libraries expose/handle the concept, and allows the separate allocations to have their lifetimes managed independently.
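
For illustration, the Tensor<T>[] alternative mentioned here (using the proposed Tensor<T>.Create factory from later in this doc; the shapes are arbitrary example values) would look something like:

```csharp
// Each entry has its own contiguous backing store and an independent lifetime,
// analogous to float[][] vs float[,].
Tensor<float>[] batch = new Tensor<float>[8];

for (int i = 0; i < batch.Length; i++)
{
    batch[i] = Tensor<float>.Create(mustPin: false, lengths: [128, 128]);
}
```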

ref T GetPinnableReference();
TSelf Slice(params ReadOnlySpan<Range<nint>> ranges);

void Clear();
Member

How is Clear defined? Is it the equivalent of Fill(default), or is it changing something about the rank / strides?

Member Author

Same as Array, so yes, equivalent to Fill(default) but without needing to explicitly check that the fill value is bitwise zero.
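
In other words, over the contiguous backing data the two behave like the existing span methods (a sketch, not the proposed surface area):

```csharp
static class TensorClearExample
{
    // Clear zeroes the memory, like Array.Clear / Span<T>.Clear does...
    public static void ClearData<T>(Span<T> flattenedData) => flattenedData.Clear();

    // ...while Fill writes an arbitrary value; Clear never needs to check whether
    // that value happens to be bitwise zero.
    public static void FillData<T>(Span<T> flattenedData, T value) => flattenedData.Fill(value);
}
```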

{
// Effectively mirror the TensorPrimitives surface area. Following the general pattern
// where we will return a new tensor and an overload that takes in the destination explicitly.
// * public static ITensor<T> Abs<T>(ITensor<T> x) where T : INumberBase<T>;
Member

What kind of ITensor<T> do these methods produce? While an implementation detail, presumably it's going to end up being a Tensor<T> (except in corner cases where we might be able to use a singleton)? Would there be any benefit to finding ways to create these the same as the inputs, e.g. if you passed a FooTensor<T> to Abs, you'd get back a new FooTensor?

Member Author

It might be desirable to allow creating the same kind of ITensor, and it is probably worth some more discussion.

However, we aren't actually allowing the user to provide the implementation for Abs here (to help minimize the number of interfaces, etc) and the inputs may not all be the same (consider MultiplyAdd which takes in 3 different tensors). So I think such support would likely require some new kind of ITensorAllocator (or better name) that allows the user to customize how temporaries are created. I think that gets more into the concept of how temporaries work beyond the barebones support and is what we'll want to discuss in our offline meeting.
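
Purely as an illustration of that idea, such an allocator hook might look something like the following (both the name ITensorAllocator and its shape are hypothetical, as noted above; ITensor<T> is the interface from this proposal):

```csharp
public interface ITensorAllocator
{
    // Called whenever an operation such as Abs or MultiplyAdd needs a fresh
    // destination tensor of a given shape.
    ITensor<T> Allocate<T>(ReadOnlySpan<nint> lengths);
}
```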

Comment on lines +155 to +156
// Without this support, we end up having to have `TensorNumber<T>`, `TensorBinaryInteger<T>`, etc
// as we otherwise cannot correctly expose the operators based on what `T` supports
Member

Is that really the fallback? I'd have expected the fallback to be named extension methods until extension operators are supported, at which point we would add operators that map to the named methods. It'd be unfortunate if we shipped types we knew were going to be immediate legacy.

Member Author

I didn't explain this well enough. The proposal is meant to go with what you say: expose explicit Add methods and let those eventually become extension operator +. If we don't do that, the alternative is to do what generic math did, which, while the correct choice for the fundamental numeric interfaces, is much less ideal for the general-purpose patterns built on top of that, like Tensor<T>.
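
So, roughly (a sketch of the direction described above, not a final API shape; the constraint is an assumption):

```csharp
public static partial class Tensor
{
    // Shipped now as a named method...
    public static Tensor<T> Add<T>(Tensor<T> x, Tensor<T> y)
        where T : IAdditionOperators<T, T, T>
        => throw new NotImplementedException(); // implementation elided

    // ...and later surfaced as an operator once the language gains extension
    // operators (hypothetical future syntax, not valid C# today):
    // public static Tensor<T> operator +(Tensor<T> x, Tensor<T> y) => Add(x, y);
}
```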


// APIs that would return `bool` like `GreaterThan` are split into 3. Following the general
// pattern already established for our SIMD vector types.
// * public static ITensor<T> GreaterThan(ITensor<T> x, ITensor<T> y);
Member

This returns ITensor<T> rather than ITensor<bool>? I understand why we do Vector<T> for the vector types, but that doesn't seem like the right answer here.

Member Author

Typo. This is meant to be ITensor<bool> here.

Notably, this is one of the places where what's best depends on what T is. If it's a primitive type and going to be vectorized, bool is a terrible option. If it's a user-defined type, then it's one of the better choices. Some people may even want a bitmask result instead.

I want us to discuss this a little bit and see if we also want to expose any special APIs for the case of the user wanting ITensor<T> or a BitMask returned, or if we want to hold off on that until there's a need. Functionally we'll end up implementing the former and then narrowing to byte before storage for the vectorized case regardless, so exposing it is just a question of naming the API we'll already be writing.
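
For reference, the "split into 3" shape following the SIMD vector pattern (GreaterThan / GreaterThanAll / GreaterThanAny), with the element-wise form corrected to bool as noted; this is a sketch in the doc's comment-signature style, and the constraint is an assumption:

```csharp
// * public static ITensor<bool> GreaterThan<T>(ITensor<T> x, ITensor<T> y) where T : IComparisonOperators<T, T, bool>;
// * public static bool GreaterThanAll<T>(ITensor<T> x, ITensor<T> y) where T : IComparisonOperators<T, T, bool>;
// * public static bool GreaterThanAny<T>(ITensor<T> x, ITensor<T> y) where T : IComparisonOperators<T, T, bool>;
```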

Comment on lines +258 to +262
public static Tensor<T> Create(bool mustPin, ReadOnlySpan<nint> lengths);
public static Tensor<T> Create(bool mustPin, ReadOnlySpan<nint> lengths, ReadOnlySpan<nint> strides);

public static Tensor<T> Create(T* address, ReadOnlySpan<nint> lengths);
public static Tensor<T> Create(T* address, ReadOnlySpan<nint> lengths, ReadOnlySpan<nint> strides);
Member

How would we expect existing tensor types to integrate? e.g. TorchSharp's Tensor<T>, would the idea be that it would implement ITensor<Tensor<T>, T>, or otherwise expose such an implementation?

Member Author

Yes. Those tensor types would choose one or more of:

  • Implement ITensor<TSelf, T>
  • Provide conversions to/from TensorSpan<T>
  • Provide access to the underlying T* so that users can unsafely create a Tensor<T> or TensorSpan<T> over it

What they get out of it varies between the options, but it gives flexibility for how integrated they want to be with whatever the BCL provides here. The minimal interface should make it very easy to opt in, even if only via explicit interface implementation.
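
As a hypothetical example of the third option, a library type ("FooTensor" here is invented) could expose its existing contiguous native buffer through the pointer-based Tensor<T>.Create factory quoted above:

```csharp
public sealed unsafe class FooTensor<T> where T : unmanaged
{
    private readonly T* _data;        // native allocation owned by the library
    private readonly nint[] _lengths; // shape of the data

    public FooTensor(T* data, nint[] lengths)
    {
        _data = data;
        _lengths = lengths;
    }

    // Users can then hand the result to any API in this proposal; the lifetime
    // of the native memory remains the library's responsibility.
    public Tensor<T> AsTensor() => Tensor<T>.Create(_data, _lengths);
}
```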

ref T this[params ReadOnlySpan<nint> indices] { get; }

static implicit operator TensorSpan<T>(TSelf value);
static implicit operator TensorReadOnlySpan<T>(TSelf value);
Member

Both ReadOnlyTensorSpan<T> and TensorReadOnlySpan<T> names are being used in this doc.


bool IsEmpty { get; }
bool IsPinned { get; }
int Rank { get; }
Member

In the context of linear algebra, the term "rank" denotes the number of linearly independent rows contained within the tensor (i.e. it is a number derived from the contents of the tensor rather than its shape). Do we prefer to follow .NET MD array terminology here, or use something comparable to what other tensor libraries are doing (e.g. ndim, TotalDimensions, etc.)?
