Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mapping between type variation anchors and definitions #268

Open
jvanstraten opened this issue Jul 27, 2022 · 1 comment
Open

Mapping between type variation anchors and definitions #268

jvanstraten opened this issue Jul 27, 2022 · 1 comment
Labels
help wanted No one is currently implementing but it seems like a good idea under consideration

Comments

@jvanstraten
Copy link
Contributor

jvanstraten commented Jul 27, 2022

Type variation anchor declarations identify a variation using only the extension it's defined in and its name:

message ExtensionTypeVariation {
// references the extension_uri_anchor defined for a specific extension URI.
uint32 extension_uri_reference = 1;
// A surrogate key used in the context of a single plan to reference a
// specific type variation
uint32 type_variation_anchor = 2;
// the name of the type in the defined extension YAML.
string name = 3;
}

(note that the comment also says it's a type name rather than a variation name, copypaste error I guess)

However, according to https://substrait.io/types/type_variations/:

The name used to reference this type. Should be unique within type variations for this parent type within a simple extension.

So I'm led to conclude that a type variation anchor does not uniquely identify a type variation definition. This is not necessarily broken, but it's extremely annoying to implement, because now I need to introduce some weird concept of "a set of potentially completely unrelated type variation definitions that happen to have the same name" to represent what the declaration refers to. Can we either:

  • change the protos to also require the corresponding type class when declaring a variation (which would also be annoying, because this abstraction does not currently exist in the protos);
  • do the same thing we do for disambiguating function implementations, i.e. require the variation name in the declaration to be of the form <name>:<class>; or
  • expand the name uniqueness constraint to a complete extension rather than scoped by type class

or is this like this for a good reason and I just have to deal with it?

@westonpace
Copy link
Member

Yes, this seems like something we should fix. However, I don't know that anyone is doing anything with type variations yet. So I don't see it as urgent.

I am partial to the third approach expand the name uniqueness constraint to a complete extension rather than scoped by type class. It seems, if we have a "dictionary" variation for int32 and then a "dictionary" variation for "string" it would be unpleasant if those were talking about two different kinds of variations.

I don't think there are going to be a huge set of type variations and it probably isn't too limiting to require the names be unique.

@westonpace westonpace added help wanted No one is currently implementing but it seems like a good idea under consideration labels Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted No one is currently implementing but it seems like a good idea under consideration
Projects
None yet
Development

No branches or pull requests

2 participants