-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes and enhancements to Cosmos KV host implementation #3032
base: main
Are you sure you want to change the base?
Fixes and enhancements to Cosmos KV host implementation #3032
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my comments mostly reflect my ignorance rather than actual issues. Hopefully @devigned can provide a more informed review!
document_client.delete_document().await.map_err(log_error)?; | ||
let document_client = self | ||
.client | ||
.document_client(key, &self.store_id.clone().unwrap_or(key.to_string())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check... store_id
seems to be a partition key (or PK column???) but key
seems to be a row key - is it correct to mix them? The docs imply the default store id is /id
rather than the current row key, but I may be misunderstanding how the store ID gets applied.
EDIT: oh is this aligning with Pair
's implementation of CosmosEntity::partition_key
? that feels like we should make the relationship clearer then. Maybe improve docs on self.store_id
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the store_id
inline doc to explain that /id
is the row key:
/// An optional store id to use as a partition key for all operations.
///
/// If the store id not set, the store will use `/id` (the row key) as the
/// partition key. For example, if `store.set("my_key", "my_value")` is
/// called, the partition key will be `my_key` if the store id is set to
/// `None`. If the store id is set to `Some("myappid/default"), the
/// partition key will be `myappid/default`.
store_id: Option<String>,
}
.client | ||
.document_client(key.clone(), &self.store_id) | ||
.document_client(&key, &self.store_id.clone().unwrap_or(key.to_string())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question about whether key
is the right fallback here EDIT: and same edit!
.allow_tentative_writes(TentativeWritesAllowance::Allow) | ||
.await | ||
{ | ||
if e.as_http_error() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the nesting of similar-looking code hard to follow here. It would be great to pull out some chunks and/or helpers here but I appreciate it may not be easy given retries etc!
crates/key-value-azure/src/store.rs
Outdated
@@ -274,8 +329,9 @@ impl CompareAndSwap { | |||
|
|||
#[async_trait] | |||
impl Cas for CompareAndSwap { | |||
/// `current` will fetch the current value for the key and store the etag for the record. The | |||
/// etag will be used to perform and optimistic concurrency update using the `if-match` header. | |||
/// `current` will fetch the current value for the key and store the etag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this and other doc comments needed reformatting, or is it a new lint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually lint at 80 chars so must have updated this by accident. looks like we use 100.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I think it's by and large ready to merge, but it would be nice to see the nits addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the improvements. I did mention concern about tentative writes in atomic operations. In the scenario where a swap occurs at roughly the same time on the same key, you could imagine 2 processes running and fetching the same etag version of the document, then the first committing the updated document to the hub while the second is tentatively committing the document to a spoke instance after the first. The second document committed seems like it would win with the default conflict resolution protocol of last in wins.
I'm not sure if this is the right answer, but we may want to consider sending atomic reads and writes to the hub db instance and avoid multi-region writes.
crates/key-value-azure/src/store.rs
Outdated
@@ -327,15 +383,18 @@ impl Cas for CompareAndSwap { | |||
// attempt to replace the document if the etag matches | |||
doc_client | |||
.replace_document(pair) | |||
.allow_tentative_writes(TentativeWritesAllowance::Allow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little concerned that by allowing tentative writes, this becomes non-atomic via last in wins default conflict resolution. It seems like this operation should be targeted at the hub rather than one of the spoke db instances. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. One option would be to remove allow_tentative_writes
from all atomic operations. Right now, through runtime config, there is no way to configure a target endpoint/region so the hub endpoint is always generated anyways. This brings up the point that we should probably hold off on allow_tentative_writes
until we update runtime config to support configuring non-primary regions.
a355588
to
1856f30
Compare
Updated the PR to only contain the set, delete, and get fixes. Removed non-primary region support since there is no way to configure it in runtime config |
1856f30
to
4e3bca9
Compare
@@ -138,14 +138,18 @@ struct AzureCosmosStore { | |||
client: CollectionClient, | |||
/// An optional store id to use as a partition key for all operations. | |||
/// | |||
/// If the store id not set, the store will use `/id` as the partition key. | |||
/// If the store ID is not set, the store will use `/id` (the row key) as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Kate, this is so much clearer!
…cr ops Signed-off-by: Kate Goldenring <[email protected]>
4e3bca9
to
1a03f74
Compare
This contains some enhancements and a couple important bug fixes
Bug fixes:
store_id
configured) were not working due to using the incorrect store IDset
api which requires a string valueKeys
structure to help with this.Enhancements:
Support for writing and querying data in non-primary regions by enablingRemoving this until we enable configuring region in runtime config.allow_tentative_writes
. This is needed for databases configured with global replicas if not using the primary replica