Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose option for setting ID function #202

Merged
merged 3 commits into from
Oct 30, 2024
Merged

Expose option for setting ID function #202

merged 3 commits into from
Oct 30, 2024

Conversation

tt
Copy link
Contributor

@tt tt commented Sep 22, 2024

AggregateState::with_id allows setting a custom ID for an aggregate.

I wanted to use a different UUID version for event IDs but those are generated right before persisting an event without a way to override:

let id: Uuid = Uuid::new_v4();

Specifically, I would prefer to use UUID version 7. Considering events are inserted chronologically, using a sequential version could improve performance. Anyway, this is about the capability and not whether any version is better.

I'm new to Rust so please let me know if there's a better way to enable this or if, say, a rename will make it more idiomatic.

@tt tt requested a review from a team as a code owner September 22, 2024 13:19
@cottinisimone
Copy link
Contributor

cottinisimone commented Sep 26, 2024

Hello and thank you for this PR. I was wondering if it makes sense and helps the code readability to create something like an ID struct with a new function that generates an Uuid (wrapped in ID). Enabling/disabling a feature (eg uuid_v4, uuid_v7) it generates the target uuid.

Eg:

pub struct Id(Uuid);

impl Id {
    #[cfg(feature = "uuid_v4")]
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }

    #[cfg(feature = "uuid_v7")]
    pub fn new() -> Self {
        Self(Uuid::new_v7())
    }
}

/cc @angelo-rendina-prima wdyt?

Copy link
Member

@MaeIsBad MaeIsBad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes, but I think that using a feature like @cottinisimone is suggesting is best(unless you want to have a different id generator for different PgStores)

@@ -26,6 +27,7 @@ where
event_buses: Vec<Box<dyn EventBus<A> + Send>>,
run_migrations: bool,
_schema: PhantomData<Schema>,
id_func: fn() -> Uuid,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rust you should generally use a generic over the Fn trait, instead of a concrete fn type

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I would even suggest using a single function id generation trait

@angelo-rendina-prima
Copy link
Contributor

I like the idea of allowing ad-hoc event id generation.
I'd even argue that for most "standard" use cases, v7 could be the default (since it is chronologically sortable). The only drawback would be increased contention/spurious fails for aggregates emitting many events in short bursts.
Hence this is what I reckon:

  • we should allow per-aggregate customisation (each store should have some kind of id-generation callback)
  • we should leave v4 as the "safe" default, and users downstream can decide to squeeze extra performance by using v7 when appropriate (at their own risk and responsibility)

Given this rationale, I don't agree with @cottinisimone's proposal of standardising a unique ID type: arguably, one should use V7 as eventID (when necessary/applicable) and V4 as aggregateID.

As per @MaeIsBad's suggestion, I don't think it's unidiomatic to use a fn pointer. I haven't tried myself, but I believe using a Fn() -> Uuid trait object might be trickier to work with (needs Boxing or the like).
It might even make sense to attach some kind of generic to the PgStore (or associated type to EventStore) responsible for generating the eventID, but again I'm scared of what that could imply in terms of object safety and ergonomics.

This PR introduces a minimal, sensible and non-disruptive improvement. My only objection to the proposed change is the naming of the field, it should be very clear what it's used for: something like store_event_id_generator (just winging it, don't take it as an official proposal).

@tt
Copy link
Contributor Author

tt commented Sep 29, 2024

I wasn't exactly sure what to change so I experimented with a slightly different approach. 😅

For my use case, I'm happy that all stores use UUIDv7 but I want aggregate IDs to still use UUIDv4 exactly as @angelo-rendina-prima mentions.

I originally used fn() -> Uuid to not assume that anyone else wanted UUIDv7 specifically. The change I made is to introduce an enum which makes this slightly easier (and maybe addresses both pieces of feedback?):

  • EventIdUuidFormat::V4 will generate random UUIDs as today.
  • EventIdUuidFormat::V7 will generate time-ordered UUIDs (but does require a bump to the UUID crate and the v7 feature enabled).
  • EventIdUuidFormat::Custom allows passing a custom function for other use cases.

The benefit of EventIdUuidFormat::V7 is to provide a suggested alternative to UUIDv4. If you use this, it's easy and you don't need a direct dependency on the UUID crate.

The appeal of EventIdUuidFormat::Custom is that it can address someone having a preference for UUIDv6, ULID or something else.

Both options solve my need but I only need one of them and can drop the other one. Also still happy to revise this as needed or revert to the previous approach. Let me know!

Copy link
Contributor

@angelo-rendina-prima angelo-rendina-prima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the latest version presented here, but given that we don't actually need a general uuid generator but simply V4 | V7 I am suggesting to only allow those two formats: minimal simple change that fits your needs.

fn into(self) -> fn() -> Uuid {
match self {
EventIdUuidFormat::V4 => Uuid::new_v4,
EventIdUuidFormat::V7 => Uuid::now_v7,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. Maybe worth turning this into a UuidFormat = V4 | V7 and allow the builder to customise both EventId and AggregateId independently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how that would work gives the builder does not initialize aggregates?

I could move UuidFormat so it's defined in src/state.rs and then define AggregateState::with_id_format. I don't need that myself, though, and I think you generally would want a random ID for an aggregate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the aggregate id is not controlled by the Store. This is fine!

match self {
EventIdUuidFormat::V4 => Uuid::new_v4,
EventIdUuidFormat::V7 => Uuid::now_v7,
EventIdUuidFormat::Custom(func) => func,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with allowing for a generic generator function with no parameters is that it is going to hook into the global scope for RNG (or even worse, generate a constant value) - i.e. I feel like it would be a dangerous footgun.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like something you would probably catch easily in testing but I'll remove it since I only need the V7 option.

@@ -58,6 +58,7 @@ where
pub(super) transactional_event_handlers:
Vec<Box<dyn TransactionalEventHandler<A, PgStoreError, PgConnection> + Send>>,
pub(super) event_buses: Vec<Box<dyn EventBus<A> + Send>>,
pub(super) event_id_func: fn() -> Uuid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we just allow for V4 | V7 this doesn't even need to be a complex type like a function (pointer/object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change this to store the enum instead (which would also work even with a custom option). It was a leftover from the first approach and then I didn't change it thinking it might be faster to match once and not per event.

@tt
Copy link
Contributor Author

tt commented Oct 12, 2024

@angelo-rendina-prima, I pushed an update and hope that addresses your feedback.

Should a custom option ever be desired, it ended up being a pretty clean patch even with the recent changes:

diff --git a/src/store/postgres/builder.rs b/src/store/postgres/builder.rs
index 7744fc1..9e784d0 100644
--- a/src/store/postgres/builder.rs
+++ b/src/store/postgres/builder.rs
@@ -3,6 +3,7 @@ use std::sync::Arc;

 use sqlx::{PgConnection, Pool, Postgres};
 use tokio::sync::RwLock;
+use uuid::Uuid;

 use crate::bus::EventBus;
 use crate::handler::{EventHandler, TransactionalEventHandler};
@@ -18,9 +19,11 @@ use super::{PgStore, Schema};
 ///
 /// - `V4`: Uses the random UUID version 4 as defined by RFC 9562 section 5.4.
 /// - `V7`: Uses the time-ordered UUID version 7 as defined by RFC 9562 section 5.7.
+/// - `Custom`: Accepts a function used to generate UUIDs.
 pub enum UuidFormat {
     V4,
     V7,
+    Custom(fn() -> Uuid),
 }

 /// Struct used to build a brand new [`PgStore`].
diff --git a/src/store/postgres/event_store.rs b/src/store/postgres/event_store.rs
index 061df3e..865054b 100644
--- a/src/store/postgres/event_store.rs
+++ b/src/store/postgres/event_store.rs
@@ -100,6 +100,7 @@ where
         let id: Uuid = match self.inner.event_id_format {
             UuidFormat::V4 => Uuid::new_v4(),
             UuidFormat::V7 => Uuid::now_v7(),
+            UuidFormat::Custom(func) => func(),
         };

         #[cfg(feature = "upcasting")]

Copy link
Contributor

@angelo-rendina-prima angelo-rendina-prima left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!
Just waiting for @cottinisimone or @MaeIsBad if they have any objection or comments

@angelo-rendina-prima angelo-rendina-prima dismissed their stale review October 12, 2024 14:35

Change was applied, removing blocker

Copy link
Member

@MaeIsBad MaeIsBad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine

@cottinisimone cottinisimone merged commit ce4fdf8 into primait:master Oct 30, 2024
3 checks passed
@tt tt deleted the expose-option-for-setting-id-function branch October 30, 2024 16:04
@tt
Copy link
Contributor Author

tt commented Nov 22, 2024

@cottinisimone, would you consider releasing this as version 0.17.2? 🙏

@cottinisimone
Copy link
Contributor

@cottinisimone, would you consider releasing this as version 0.17.2? 🙏

Ehi @tt! I think @MaeIsBad or @cpiemontese might be able to help, as I'm no longer part of this organization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants