feat: add partitioned namespace #5896
lance-io = { path = "../../rust/lance-io" }
lance-namespace = { path = "../../rust/lance-namespace" }
lance-namespace-impls = { path = "../../rust/lance-namespace-impls", features = ["rest", "rest-adapter"] }
lance-namespace-reqwest-client = { git = "https://github.com/wojiaodoubao/lance-namespace", branch = "rest-table-properties" }
Update this dependency once lance-format/lance-namespace#299 is merged.
23d594c to b61565e
b61565e to dc95f18
/// Request for creating multiple namespaces with a single merge insert.
#[derive(Debug, Clone)]
pub struct CreateMultiNamespacesRequest {
Why do we need this? I thought we only needed to create multiple tables in a namespace.
Oh, I see, never mind: we want to create the namespaces that represent partition values.
I think we should add these as part of the Lance Namespace operations: introduce BatchCreateNamespaces, BatchCreateTables, etc. Those operations can be useful even outside the context of partitioned namespace.
Then you don't need a dedicated extension just for the manifest namespace to make it work.
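To make the suggestion concrete, here is a minimal sketch of what a generic batch request could look like. Everything in it is an assumption for illustration: `NamespaceId`, `BatchCreateNamespacesRequest`, its builder, and `with_parents` are hypothetical names, not the lance-namespace API, and the `year=2024` style identifiers are only example values.

```rust
use std::collections::BTreeSet;

// Hypothetical sketch: none of these types are the actual lance-namespace API.

/// A namespace identifier as a path of name components.
pub type NamespaceId = Vec<String>;

/// Hypothetical batch request: create several namespaces in one call.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct BatchCreateNamespacesRequest {
    pub ids: Vec<NamespaceId>,
}

/// Hypothetical builder, mirroring the request/builder pairing in the diff.
#[derive(Debug, Clone, Default)]
pub struct BatchCreateNamespacesRequestBuilder {
    ids: Vec<NamespaceId>,
}

impl BatchCreateNamespacesRequestBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add one namespace identifier to the batch.
    pub fn namespace<I, S>(mut self, id: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        self.ids.push(id.into_iter().map(Into::into).collect());
        self
    }

    /// Build the request, rejecting an empty batch and empty identifiers.
    pub fn build(self) -> Result<BatchCreateNamespacesRequest, String> {
        if self.ids.is_empty() {
            return Err("batch must contain at least one namespace".to_string());
        }
        if self.ids.iter().any(|id| id.is_empty()) {
            return Err("namespace id must not be empty".to_string());
        }
        Ok(BatchCreateNamespacesRequest { ids: self.ids })
    }
}

/// Expand a batch so that every parent namespace appears before its
/// children, with duplicates removed. Nested namespaces share parents,
/// so a batch would otherwise try to create the same parent repeatedly.
pub fn with_parents(req: &BatchCreateNamespacesRequest) -> Vec<NamespaceId> {
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for id in &req.ids {
        for depth in 1..=id.len() {
            let prefix = id[..depth].to_vec();
            if seen.insert(prefix.clone()) {
                out.push(prefix);
            }
        }
    }
    out
}
```

A request shaped like this is not tied to the manifest namespace, so any implementation could accept it; the parent expansion is one possible way to keep a single batch idempotent with respect to shared parents.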
> I think we should add these as a part of the Lance Namespace operations, introduce BatchCreateNamespaces, BatchCreateTables, etc.
Agree, we can do this.
}

#[derive(Debug, Default, Clone)]
pub struct CreateMultiNamespacesRequestBuilder {
Just to create a separate thread: I am starting to wonder whether there is any real benefit in creating the sub-namespace structures. They seem to exist purely because it is nice to list namespaces this way, but they serve no practical purpose, since all the pruning is done directly against the table's partition column values in __manifest. Would it make more sense to not have those nested namespace structures at all?
Yes, having a table is sufficient for partition creation and pruning. The reason for retaining the namespace is that PartitionedNamespace is a type of DirectoryNamespace that follows the partition spec standard. If we only keep the table part, PartitionedNamespace can no longer be treated as a normal DirectoryNamespace.
I think from a consistency perspective, it would be better to retain it. Shall we retain it, or remove it for simplicity?
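For reference, the point that pruning only needs the table rows can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `ManifestEntry`, `entry`, and `prune` are hypothetical, the real `__manifest` schema is not shown in this thread, and actual pruning would presumably run as a filter over the manifest dataset rather than over an in-memory slice.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one table row in `__manifest`.
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestEntry {
    pub table_id: String,
    /// Partition column name mapped to this table's partition value.
    pub partition_values: BTreeMap<String, String>,
}

/// Convenience constructor for the sketch.
pub fn entry(table_id: &str, values: &[(&str, &str)]) -> ManifestEntry {
    ManifestEntry {
        table_id: table_id.to_string(),
        partition_values: values
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

/// Keep only the tables whose partition values satisfy every equality
/// predicate. No namespace hierarchy is consulted at any point.
pub fn prune<'a>(
    entries: &'a [ManifestEntry],
    predicates: &[(&str, &str)],
) -> Vec<&'a ManifestEntry> {
    entries
        .iter()
        .filter(|e| {
            predicates.iter().all(|(col, val)| {
                e.partition_values.get(*col).map(String::as_str) == Some(*val)
            })
        })
        .collect()
}
```

Under these assumptions the nested namespaces contribute nothing to pruning, so the decision to retain them rests only on the consistency argument about DirectoryNamespace.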
This is a sub-task of the partitioned namespace