Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes and enhancements to Cosmos KV host implementation #3032

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kate-goldenring
Copy link
Contributor

@kate-goldenring kate-goldenring commented Feb 28, 2025

This contains some enhancements and a couple important bug fixes

Bug fixes:

  • deletes (on instances without store_id configured) were not working due to using the incorrect store ID
  • increment was failing due to the value never being initially set. Now this sets the value if DNE -- this is the same as Spin's other KV hosts and it is impossible to set a number counter through the set api which requires a string value
  • gets were failing on pairs that were set with increment (because the value type was a number instead of string). Deserialized into an agnostic Keys structure to help with this.

Enhancements:

  • Support for writing and querying data in non-primary regions by enabling allow_tentative_writes. This is needed for databases configured with global replicas if not using the primary replica Removing this until we enable configuring region in runtime config.

@itowlson itowlson requested a review from devigned March 2, 2025 21:20
Copy link
Collaborator

@itowlson itowlson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comments mostly reflect my ignorance rather than actual issues. Hopefully @devigned can provide a more informed review!

document_client.delete_document().await.map_err(log_error)?;
let document_client = self
.client
.document_client(key, &self.store_id.clone().unwrap_or(key.to_string()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanity check... store_id seems to be a partition key (or PK column???) but key seems to be a row key - is it correct to mix them? The docs imply the default store id is /id rather than the current row key, but I may be misunderstanding how the store ID gets applied.

EDIT: oh is this aligning with Pair's implementation of CosmosEntity::partition_key? that feels like we should make the relationship clearer then. Maybe improve docs on self.store_id too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the store_id inline doc to explain that /id is the row key:

    /// An optional store id to use as a partition key for all operations.
    ///
    /// If the store id not set, the store will use `/id` (the row key) as the
    /// partition key. For example, if `store.set("my_key", "my_value")` is
    /// called, the partition key will be `my_key` if the store id is set to
    /// `None`. If the store id is set to `Some("myappid/default"), the
    /// partition key will be `myappid/default`.
    store_id: Option<String>,
}

.client
.document_client(key.clone(), &self.store_id)
.document_client(&key, &self.store_id.clone().unwrap_or(key.to_string()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question about whether key is the right fallback here EDIT: and same edit!

.allow_tentative_writes(TentativeWritesAllowance::Allow)
.await
{
if e.as_http_error()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the nesting of similar-looking code hard to follow here. It would be great to pull out some chunks and/or helpers here but I appreciate it may not be easy given retries etc!

@@ -274,8 +329,9 @@ impl CompareAndSwap {

#[async_trait]
impl Cas for CompareAndSwap {
/// `current` will fetch the current value for the key and store the etag for the record. The
/// etag will be used to perform and optimistic concurrency update using the `if-match` header.
/// `current` will fetch the current value for the key and store the etag
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this and other doc comments needed reformatting, or is it a new lint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually lint at 80 chars so must have updated this by accident. looks like we use 100.

Copy link
Collaborator

@rylev rylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I think it's by and large ready to merge, but it would be nice to see the nits addressed.

Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the improvements. I did mention concern about tentative writes in atomic operations. In the scenario where a swap occurs at roughly the same time on the same key, you could imagine 2 processes running and fetching the same etag version of the document, then the first committing the updated document to the hub while the second is tentatively committing the document to a spoke instance after the first. The second document committed seems like it would win with the default conflict resolution protocol of last in wins.

I'm not sure if this is the right answer, but we may want to consider sending atomic reads and writes to the hub db instance and avoid multi-region writes.

@@ -327,15 +383,18 @@ impl Cas for CompareAndSwap {
// attempt to replace the document if the etag matches
doc_client
.replace_document(pair)
.allow_tentative_writes(TentativeWritesAllowance::Allow)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned that by allowing tentative writes, this becomes non-atomic via last in wins default conflict resolution. It seems like this operation should be targeted at the hub rather than one of the spoke db instances. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. One option would be to remove allow_tentative_writes from all atomic operations. Right now, through runtime config, there is no way to configure a target endpoint/region so the hub endpoint is always generated anyways. This brings up the point that we should probably hold off on allow_tentative_writes until we update runtime config to support configuring non-primary regions.

@kate-goldenring kate-goldenring force-pushed the cosmos-kv-enhancements branch 2 times, most recently from a355588 to 1856f30 Compare March 3, 2025 18:22
@kate-goldenring
Copy link
Contributor Author

Updated the PR to only contain the set, delete, and get fixes. Removed non-primary region support since there is no way to configure it in runtime config

@kate-goldenring kate-goldenring force-pushed the cosmos-kv-enhancements branch from 1856f30 to 4e3bca9 Compare March 3, 2025 18:25
@@ -138,14 +138,18 @@ struct AzureCosmosStore {
client: CollectionClient,
/// An optional store id to use as a partition key for all operations.
///
/// If the store id not set, the store will use `/id` as the partition key.
/// If the store ID is not set, the store will use `/id` (the row key) as
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Kate, this is so much clearer!

@kate-goldenring kate-goldenring force-pushed the cosmos-kv-enhancements branch from 4e3bca9 to 1a03f74 Compare March 3, 2025 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants