[poc] evolving catalog item IDs #34212
Draft
+144
−109
This is a poc implementing an idea for resolving the current `CatalogItemId`/`GlobalId` mess. Roughly, the mess is twofold:

- `GlobalId` instead of `CatalogItemId`) for backwards compat reasons. What's more, the `CatalogItemId` type is supposed to be a SQL-level concept, but it leaks into the storage and compute crates as well.

What this poc tries is to remove `CatalogItemId` again. Here we simply make it an alias of `GlobalId`, to avoid a huge amount of typing, which is why all of the implementation complexity remains (though it could be removed by a huge amount of typing). Instead of treating item IDs as a namespace separate from the `GlobalId` namespace, we declare the ID of a catalog item to be the `GlobalId` of its most recent underlying collection. A necessary assumption for this is that each catalog item only has a single underlying collection (with some handwaving for MVs that have two collections that share a `GlobalId`) at any point in time.

Fixing the item ID to the latest `GlobalId` means that the item ID can change over time. This is a major departure from how things worked previously. The thinking is that nothing relies on item IDs being stable, but the point of this poc is to figure out if this is actually true.

If it is true, the approach has the benefit that it brings us mostly back into a world where we only have to worry about a single ID for each object. We don't quite get there for two reasons:

- `versions` list with its previous `GlobalId`s, and the catalog provides mechanisms to resolve `GlobalId`s for older versions through these lists. (This exists already but can be cleaned up.)
- `mz_object_versions` system relations, that's mostly equivalent to the current `mz_object_global_ids` but models the versioning explicitly and is presumably easier to explain to users.
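To make the proposed ID scheme concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `GlobalId` is reduced to a plain integer wrapper, and `Item`, `Catalog`, `create`, `evolve`, `resolve`, and `global_id_for_version` are hypothetical names chosen for illustration. It only models the three ideas from the description: `CatalogItemId` as an alias of `GlobalId`, the item ID being the `GlobalId` of the most recent collection, and a per-item version list used to resolve older `GlobalId`s.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real `GlobalId` type (assumption: a plain integer).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct GlobalId(pub u64);

/// As in the poc, the item ID is just an alias of `GlobalId`.
pub type CatalogItemId = GlobalId;

/// Hypothetical catalog entry with one underlying collection per version.
#[derive(Debug)]
pub struct Item {
    pub name: String,
    /// `GlobalId`s of all versions, oldest first. Never empty.
    versions: Vec<GlobalId>,
}

impl Item {
    /// The item ID is the `GlobalId` of the most recent collection.
    pub fn id(&self) -> CatalogItemId {
        *self.versions.last().expect("an item has at least one version")
    }

    /// Resolves the `GlobalId` of an older (or the current) version by index.
    pub fn global_id_for_version(&self, version: usize) -> Option<GlobalId> {
        self.versions.get(version).copied()
    }
}

/// Hypothetical catalog that can find an item by any of its `GlobalId`s.
#[derive(Default)]
pub struct Catalog {
    items: Vec<Item>,
    /// Maps every known `GlobalId`, current or historical, to its item's index.
    by_global_id: BTreeMap<GlobalId, usize>,
}

impl Catalog {
    /// Registers a new item; its initial item ID is `id`.
    /// Assumes `id` is not already in use.
    pub fn create(&mut self, name: &str, id: GlobalId) -> CatalogItemId {
        let idx = self.items.len();
        self.items.push(Item { name: name.to_string(), versions: vec![id] });
        self.by_global_id.insert(id, idx);
        id
    }

    /// Adds a new version to the item whose current ID is `current`.
    /// The item ID changes to `new_id`. Returns `None` if `current` is
    /// unknown or stale, or if `new_id` is already in use.
    pub fn evolve(&mut self, current: CatalogItemId, new_id: GlobalId) -> Option<CatalogItemId> {
        if self.by_global_id.contains_key(&new_id) {
            return None;
        }
        let idx = *self.by_global_id.get(&current)?;
        let item = &mut self.items[idx];
        if item.id() != current {
            return None; // `current` refers to an older version of the item.
        }
        item.versions.push(new_id);
        self.by_global_id.insert(new_id, idx);
        Some(new_id)
    }

    /// Finds the item that owns `id`, whether `id` is current or historical.
    pub fn resolve(&self, id: GlobalId) -> Option<&Item> {
        self.by_global_id.get(&id).map(|&idx| &self.items[idx])
    }
}
```

In this sketch, creating an item with `GlobalId(1)` and then evolving it to `GlobalId(7)` changes the item ID to `GlobalId(7)`, while `resolve(GlobalId(1))` still finds the same item through its version list. Anything that cached the old item ID has to go through such a lookup, which is exactly the stability question the poc is meant to probe.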