feat(idgen): Start Implementation of NoSQL with the ID Generation Framework #2131
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
# Unique ID generation framework and monotonic clock | ||
|
||
Provides a framework and implementations for unique ID generation, including a monotonically increasing timestamp/clock | ||
source. | ||
|
||
Provides a | ||
[Snowflake-IDs](https://medium.com/@jitenderkmr/demystifying-snowflake-ids-a-unique-identifier-in-distributed-computing-72796a827c9d) | ||
implementation. | ||
|
||
Consuming production code should primarily leverage the `IdGenerator` and `MonotonicClock` interfaces. | ||
|
||
## Snowflake ID source | ||
|
||
The Snowflake ID source is configurable for each backend instance, but cannot be modified for an existing backend | ||
instance to prevent ID conflicts. | ||
|
||
The epoch of these timestamps is 2025-03-01-00:00:00.0 GMT. Timestamps occupy 41 bits at | ||
millisecond precision, which lasts for about 69 years. Node-IDs are 10 bits, which allows 1024 concurrently active | ||
"JVMs running Polaris". 12 bits are used by the sequence number, which then allows each node to generate 4096 IDs per | ||
millisecond. One bit is reserved for future use. | ||
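
The bit budget above can be illustrated with a small packing sketch. This is not the Polaris implementation: the class name `SnowflakeLayoutSketch`, its method names, and in particular the field order (timestamp in the high bits, then node ID, then sequence) are assumptions made for illustration only.

```java
/** Hypothetical sketch of a 41/10/12 bit layout; the field order is an assumption. */
final class SnowflakeLayoutSketch {
  static final int TIMESTAMP_BITS = 41;
  static final int NODE_ID_BITS = 10;
  static final int SEQUENCE_BITS = 12;

  static final long TIMESTAMP_MASK = (1L << TIMESTAMP_BITS) - 1;
  static final long NODE_MASK = (1L << NODE_ID_BITS) - 1;
  static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

  /** Packs the parts as [reserved | timestamp | node | sequence]; timestamp is millis since the ID epoch. */
  static long pack(long timestamp, long node, long sequence) {
    if ((timestamp & ~TIMESTAMP_MASK) != 0
        || (node & ~NODE_MASK) != 0
        || (sequence & ~SEQUENCE_MASK) != 0) {
      throw new IllegalArgumentException("value does not fit into its bit field");
    }
    return (timestamp << (NODE_ID_BITS + SEQUENCE_BITS)) | (node << SEQUENCE_BITS) | sequence;
  }

  static long timestampOf(long id) {
    return (id >>> (NODE_ID_BITS + SEQUENCE_BITS)) & TIMESTAMP_MASK;
  }

  static long nodeOf(long id) {
    return (id >>> SEQUENCE_BITS) & NODE_MASK;
  }

  static long sequenceOf(long id) {
    return id & SEQUENCE_MASK;
  }

  /** Approximate number of years covered by a 41-bit millisecond counter. */
  static double timestampRangeYears() {
    return (double) (1L << TIMESTAMP_BITS) / (1000.0 * 60 * 60 * 24 * 365.25);
  }
}
```

Under this assumed layout the three fields use 63 bits, so the one reserved bit is the top bit of the `long` and every packed ID is non-negative.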
|
||
Node IDs are leased by every "JVM running Polaris" for a period of time. The ID generator implementation guarantees | ||
that no IDs will be generated for a timestamp that exceeds the "lease time". Leases can be extended. The implementation | ||
leverages atomic database operations (CAS) for the lease implementation. | ||
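
A minimal sketch of the leasing idea, assuming an in-memory `ConcurrentMap` as a stand-in for the database and its compare-and-set operations. The class `NodeLeaseSketch`, the `Lease` type, and all method names are hypothetical and do not come from this PR; the actual lease implementation is part of later work.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical lease store; a ConcurrentMap stands in for atomic database operations. */
final class NodeLeaseSketch {
  /** Immutable lease record: the owning JVM and the exclusive expiry time in epoch millis. */
  static final class Lease {
    final String owner;
    final long expiresAtMillis;

    Lease(String owner, long expiresAtMillis) {
      this.owner = owner;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentMap<Integer, Lease> store = new ConcurrentHashMap<>();

  /** Tries to acquire a free or expired lease, or to extend an own lease, via CAS. */
  boolean acquireOrExtend(int nodeId, String owner, long nowMillis, long durationMillis) {
    Lease next = new Lease(owner, nowMillis + durationMillis);
    Lease current = store.get(nodeId);
    if (current == null) {
      // CAS "absent -> next": fails if another JVM created the lease concurrently.
      return store.putIfAbsent(nodeId, next) == null;
    }
    boolean expired = current.expiresAtMillis <= nowMillis;
    if (!expired && !current.owner.equals(owner)) {
      return false; // actively leased by another JVM
    }
    // CAS "current -> next": fails if the record changed since it was read.
    return store.replace(nodeId, current, next);
  }

  /** An ID may only be generated for a timestamp strictly inside the caller's own lease. */
  boolean mayGenerate(int nodeId, String owner, long timestampMillis) {
    Lease lease = store.get(nodeId);
    return lease != null && lease.owner.equals(owner) && timestampMillis < lease.expiresAtMillis;
  }
}
```

A caller whose CAS fails would re-read the record and retry or pick another node ID; that retry policy is left out of the sketch.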
|
||
ID generators must not use timestamps outside the lease period, nor may they reuse an older timestamp. This | ||
requirement is satisfied using a monotonic clock implementation. | ||
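
One way to obtain a clock that never goes backwards is to remember the largest value handed out so far. The sketch below assumes that approach; `MonotonicMillisSketch` is a hypothetical name and is not the `MonotonicClock` implementation from this PR. It is only non-decreasing: it repeats the last value while the wall clock is behind and does not wait for the next tick.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical non-decreasing millisecond clock on top of an arbitrary wall-clock source. */
final class MonotonicMillisSketch {
  private final LongSupplier wallClock;
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  MonotonicMillisSketch(LongSupplier wallClock) {
    this.wallClock = wallClock;
  }

  /** Returns the wall-clock value, or the largest value returned so far if the wall clock went backwards. */
  long currentTimeMillis() {
    long now = wallClock.getAsLong();
    // Atomically keep the maximum of the previous and the current reading.
    return last.accumulateAndGet(now, Math::max);
  }
}
```

In production such a clock would typically be constructed with `System::currentTimeMillis` as the source; injecting the source keeps the behavior testable.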
|
||
## Code structure | ||
|
||
The code is structured into multiple modules. Consuming code should almost always pull in only the API module. | ||
|
||
* `polaris-idgen-api` provides the necessary Java interfaces and immutable types. | ||
* `polaris-idgen-impl` provides the storage-agnostic implementation. | ||
* `polaris-idgen-mocks` provides mocks for testing. | ||
* `polaris-idgen-spi` provides the necessary interfaces to construct ID generators. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
plugins { | ||
id("org.kordamp.gradle.jandex") | ||
id("polaris-server") | ||
} | ||
|
||
description = "Polaris ID generation API" | ||
|
||
dependencies { | ||
compileOnly(libs.jakarta.annotation.api) | ||
compileOnly(libs.jakarta.validation.api) | ||
compileOnly(libs.jakarta.inject.api) | ||
compileOnly(libs.jakarta.enterprise.cdi.api) | ||
|
||
compileOnly(libs.smallrye.config.core) | ||
compileOnly(platform(libs.quarkus.bom)) | ||
compileOnly("io.quarkus:quarkus-core") | ||
|
||
compileOnly(project(":polaris-immutables")) | ||
annotationProcessor(project(":polaris-immutables", configuration = "processor")) | ||
|
||
implementation(platform(libs.jackson.bom)) | ||
implementation("com.fasterxml.jackson.core:jackson-databind") | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
package org.apache.polaris.ids.api; | ||
|
||
/** The primary interface for generating a contention-free ID. */ | ||
public interface IdGenerator { | ||
/** Generate a new, unique ID. */ | ||
long generateId(); | ||
|
||
/** Generate the system ID for a node, solely used for node management. */ | ||
long systemIdForNode(int nodeId); | ||
|
||
default String describeId(long id) { | ||
return Long.toString(id); | ||
} | ||
|
||
IdGenerator NONE = | ||
new IdGenerator() { | ||
@Override | ||
public long generateId() { | ||
throw new UnsupportedOperationException("NONE IdGenerator cannot generate IDs."); | ||
} | ||
|
||
@Override | ||
public long systemIdForNode(int nodeId) { | ||
throw new UnsupportedOperationException("NONE IdGenerator cannot generate IDs."); | ||
} | ||
}; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
package org.apache.polaris.ids.api; | ||
|
||
import com.fasterxml.jackson.databind.annotation.JsonDeserialize; | ||
import com.fasterxml.jackson.databind.annotation.JsonSerialize; | ||
import io.smallrye.config.WithDefault; | ||
import java.util.Map; | ||
import org.apache.polaris.immutables.PolarisImmutable; | ||
import org.immutables.value.Value; | ||
|
||
@PolarisImmutable | ||
@JsonSerialize(as = ImmutableIdGeneratorSpec.class) | ||
@JsonDeserialize(as = ImmutableIdGeneratorSpec.class) | ||
public interface IdGeneratorSpec { | ||
@WithDefault("snowflake") | ||
String type(); | ||
|
||
Map<String, String> params(); | ||
|
||
@PolarisImmutable | ||
interface BuildableIdGeneratorSpec extends IdGeneratorSpec { | ||
Review comment: This class looks odd? Afaict we could use |
||
static ImmutableBuildableIdGeneratorSpec.Builder builder() { | ||
return ImmutableBuildableIdGeneratorSpec.builder(); | ||
} | ||
|
||
@Override | ||
Map<String, String> params(); | ||
|
||
@Override | ||
@Value.Default | ||
default String type() { | ||
return "snowflake"; | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
package org.apache.polaris.ids.api; | ||
|
||
import java.time.Instant; | ||
|
||
/** | ||
 * A clock that provides the current time as milliseconds, microseconds, or an instant since | ||
 * 1970-01-01-00:00:00.000. The returned timestamp values increase monotonically. | ||
* | ||
* <p>The functions provide nanosecond/microsecond/millisecond precision, but not necessarily the | ||
* same resolution (how frequently the value changes) - no guarantees are made. | ||
* | ||
 * <p>Implementations <em>may</em> adjust to wall clocks advancing faster than the real time. If and | ||
* how exactly depends on the implementation, as long as none of the time values available via this | ||
* interface "goes backwards". | ||
* | ||
* <p>Implementer notes: {@link System#nanoTime() System.nanoTime()} does not guarantee that the | ||
* values will be monotonically increasing when invocations happen from different | ||
* CPUs/cores/threads. | ||
* | ||
* <p>A default implementation of {@link MonotonicClock} can be injected as an application scoped | ||
* bean in CDI. | ||
*/ | ||
public interface MonotonicClock extends AutoCloseable { | ||
/** | ||
* Current timestamp as microseconds since epoch, can be used as a monotonically increasing wall | ||
* clock. | ||
*/ | ||
long currentTimeMicros(); | ||
|
||
/** | ||
* Current timestamp as milliseconds since epoch, can be used as a monotonically increasing wall | ||
* clock. | ||
*/ | ||
long currentTimeMillis(); | ||
|
||
/** | ||
* Current instant with nanosecond precision, can be used as a monotonically increasing wall | ||
* clock. | ||
*/ | ||
Instant currentInstant(); | ||
|
||
/** Monotonically increasing timestamp with nanosecond precision, not related to wall clock. */ | ||
long nanoTime(); | ||
|
||
void sleepMillis(long millis); | ||
|
||
@Override | ||
void close(); | ||
dimas-b marked this conversation as resolved.
|
||
|
||
void waitUntilTimeMillisAdvanced(); | ||
Review comment: Could be good to add javadocs here (and above), it's not immediately clear what this method is supposed to do (spin-wait until the clock ticks?). Also neither this method nor |
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
package org.apache.polaris.ids.api; | ||
|
||
import jakarta.annotation.Nonnull; | ||
import java.time.Instant; | ||
import java.util.UUID; | ||
|
||
public interface SnowflakeIdGenerator extends IdGenerator { | ||
/** Offset of the snowflake ID generator since the 1970-01-01T00:00:00Z epoch instant. */ | ||
Instant ID_EPOCH = Instant.parse("2025-03-01T00:00:00Z"); | ||
|
||
/** | ||
* Offset of the snowflake ID generator in milliseconds since the 1970-01-01T00:00:00Z epoch | ||
* instant. | ||
*/ | ||
long ID_EPOCH_MILLIS = ID_EPOCH.toEpochMilli(); | ||
|
||
int DEFAULT_NODE_ID_BITS = 10; | ||
int DEFAULT_TIMESTAMP_BITS = 41; | ||
int DEFAULT_SEQUENCE_BITS = 12; | ||
|
||
long constructId(long timestamp, long sequence, long node); | ||
|
||
long timestampFromId(long id); | ||
Review comment: I suppose this is understood as "since the Snowflake Epoch (2025-03-01)". Could be good to add javadocs to clarify. |
||
|
||
long timestampUtcFromId(long id); | ||
Review comment: "Timestamp UTC" does not make sense to me 🤔 |
||
|
||
long sequenceFromId(long id); | ||
|
||
long nodeFromId(long id); | ||
|
||
UUID idToTimeUuid(long id); | ||
|
||
String idToString(long id); | ||
|
||
long timeUuidToId(@Nonnull UUID uuid); | ||
|
||
int timestampBits(); | ||
|
||
int sequenceBits(); | ||
|
||
int nodeIdBits(); | ||
} |
Review comment: Is this ID expected to be the same for all `IdGenerator` implementations? What if a future impl. is not node-based? Should this be pushed down to the Snowflake ID code?
Review comment: This function isn't specific to the particular implementation.
See the upcoming node-id-lease stuff: there is one implementation with one constant configuration per setup.
Review comment: Actually, the more that I read about the node-id lease stuff, I don't know whether this should be in this module. Here's my thinking:
What do y'all think?
Review comment: +1 to Adam's point 1.
... however, I have a bigger concern. Suppose we run with Snowflake IDs for a while and then change to another ID generator. Assume `generateId()` outputs do not clash. Still, do we expect `systemIdForNode(X)` to return the same value for all generator implementations and for all possible values of `X`?