
Conversation

@zhuxinjie-nz
Contributor

Summary

Implement a SequentialStorageBackend to store key-value data in buckets for efficient data management and maintenance.

Motivation

  • Group KV data into buckets to avoid creating too many small files.
  • Add metadata support in file storage for KV data to enable LRU/LFU/creation-time-based eviction later.

Behavior & Compatibility

  • The SequentialStorageBackend currently supports only low-level bucket I/O and is not fully integrated into the pipeline. No impact on Mooncake in its current state.
  • The follow-up implementation approach is inconsistent with the current StorageBackend's design pattern. Key inconsistencies include:
    1. On write: create buckets by synchronizing the offloading queue with the master via a heartbeat mechanism, enabling better scalability for independently deployed clients and supporting high-availability client scenarios.
    2. On read: enable distributed read capability for bucket data via global metadata registration, coordinated through the Transfer Engine.

@gemini-code-assist
Contributor

Summary of Changes

Hello @zhuxinjie-nz, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational SequentialStorageBackend to the Mooncake store, enabling key-value data to be organized into buckets. This change is motivated by the need to optimize data management by grouping related data, thereby reducing file overhead and preparing the system for future advanced eviction strategies. While the new backend provides low-level I/O capabilities, it is not yet integrated into the main Mooncake pipeline, ensuring no immediate impact on existing operations.

Highlights

  • New SequentialStorageBackend: Introduced a new SequentialStorageBackend class designed to store key-value data in logical 'buckets' for more efficient management and to reduce the number of small files.
  • Metadata Structures: Added SequentialObjectMetadata, SequentialBucketMetadata, and SequentialOffloadMetadata to support detailed tracking of object and bucket properties, paving the way for advanced eviction policies like LRU/LFU.
  • Bucket Operations: Implemented core functionalities for the new backend, including BatchOffload, BatchQuery, BatchLoad, GetBucketKeys, BucketScan, and GetStoreMetadata.
  • Error Handling: Expanded the ErrorCode enum with new values specific to bucket operations, such as BUCKET_NOT_FOUND, BUCKET_ALREADY_EXISTS, and KEYS_ULTRA_BUCKET_LIMIT.
  • Unit Tests: Added comprehensive unit tests in storage_backend_test.cpp to validate the functionality of the SequentialStorageBackend, covering data offloading, querying, loading, and bucket scanning.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a SequentialStorageBackend for managing key-value data in buckets, which is a solid approach to handle numerous small files efficiently. The implementation includes functionalities for batch operations like offloading, querying, and loading, and is accompanied by new tests, which is commendable.

However, the review has uncovered several critical thread-safety issues. Specifically, the bucket ID generation mechanism is not thread-safe, and several methods that modify shared data structures do so without adequate locking, creating potential race conditions and data corruption risks. Additionally, there are opportunities to improve error handling by properly checking the results of functions returning tl::expected instead of unsafely calling .value().

I have provided detailed comments on these points with suggestions for remediation. Addressing these concerns will significantly improve the robustness and reliability of the new storage backend.

LOG(ERROR) << "Storage backend already initialized";
return tl::unexpected(ErrorCode::INTERNAL_ERROR);
}
std::shared_lock lock(mutex_);
Contributor

critical

This function modifies shared class members (object_bucket_map_, buckets_, total_size_) under a std::shared_lock. A shared_lock is intended for concurrent read access only. Using it during write operations leads to a race condition and undefined behavior. You must use a std::unique_lock to ensure exclusive access and maintain thread safety.

        std::unique_lock lock(mutex_);

if (!write_result) {
LOG(ERROR) << "vector_write failed for: " << bucket_id
<< ", error: " << write_result.error();
buckets_.erase(bucket_id);
Contributor

critical

The buckets_ map is being modified here with erase() without holding a lock on mutex_. Since buckets_ is a shared resource accessed by multiple threads, this operation is not thread-safe and can lead to race conditions or data corruption. You should acquire a std::unique_lock before modifying the map.

        {
            std::unique_lock lock(mutex_);
            buckets_.erase(bucket_id);
        }

Comment on lines 692 to 702
tl::expected<int64_t, ErrorCode> SequentialStorageBackend::CreateBucketId() {
auto cur_time_stamp = time_gen();
if(cur_time_stamp <= m_i64LastTimeStamp){
m_i64SequenceID = (m_i64SequenceID + 1) & SEQUENCE_MASK;
} else{
m_i64SequenceID = 0;
}
m_i64LastTimeStamp = cur_time_stamp;
return (cur_time_stamp << TIMESTAMP_SHIFT)
| (m_i64SequenceID << SEQUENCE_ID_SHIFT);
}
Contributor

critical

This function for creating bucket IDs is not thread-safe. It reads and modifies the shared member variables m_i64LastTimeStamp and m_i64SequenceID without any synchronization. If CreateBucketId is called concurrently from multiple threads (e.g., via BatchOffload), it can lead to race conditions and result in duplicate bucket IDs. You must protect this critical section with a mutex.

tl::expected<int64_t, ErrorCode> SequentialStorageBackend::CreateBucketId() {
    std::unique_lock lock(mutex_);
    auto cur_time_stamp = time_gen();
    if(cur_time_stamp <= m_i64LastTimeStamp){
        m_i64SequenceID = (m_i64SequenceID + 1) & SEQUENCE_MASK;
    } else{
        m_i64SequenceID = 0;
    }
    m_i64LastTimeStamp = cur_time_stamp;
    return (cur_time_stamp  << TIMESTAMP_SHIFT)
        | (m_i64SequenceID << SEQUENCE_ID_SHIFT);
}

Comment on lines 115 to 118
char *buf = new char[object_it->second.size+1];
buf[object_it->second.size] = '\0';
memcpy(buf, object_it->second.ptr, object_it->second.size);
auto data = std::string(buf);
Contributor

critical

This code allocates memory for buf using new char[] but never deallocates it, causing a memory leak in the test. Additionally, it's inefficient as it involves multiple data copies. A safer and more efficient approach is to construct the std::string directly from the buffer pointer and size. This avoids both the memory leak and the unnecessary memcpy.

Suggested change
char *buf = new char[object_it->second.size+1];
buf[object_it->second.size] = '\0';
memcpy(buf, object_it->second.ptr, object_it->second.size);
auto data = std::string(buf);
std::string data(static_cast<char*>(object_it->second.ptr), object_it->second.size);

LOG(ERROR) << "batch object is empty";
return tl::make_unexpected(ErrorCode::INVALID_KEY);
}
auto bucket_id = CreateBucketId().value();
Contributor

high

Calling .value() on a tl::expected without first checking if it contains a value is unsafe. If CreateBucketId() were to return an error, this would throw a tl::bad_expected_access exception and crash the program. It's crucial to handle the error case properly. This pattern of unchecked .value() calls appears multiple times in this file and should be addressed throughout.

    auto bucket_id_result = CreateBucketId();
    if (!bucket_id_result) {
        return tl::make_unexpected(bucket_id_result.error());
    }
    auto bucket_id = bucket_id_result.value();

Comment on lines 223 to 224
tl::expected<void, ErrorCode> BatchQuery(const std::vector<std::string>& keys,
std::unordered_map<std::string, SequentialObjectMetadata> &batche_object_metadata);
Contributor

medium

There's a typo in the parameter name batche_object_metadata. It should likely be batch_object_metadata for consistency with standard terminology.

                                             std::unordered_map<std::string, SequentialObjectMetadata> &batch_object_metadata);

Comment on lines 308 to 309
int64_t m_i64SequenceID = 0;
int64_t m_i64LastTimeStamp = 0;
Contributor

medium

These member variables use Hungarian notation (m_i64...), which is generally discouraged in modern C++. It would be more consistent with the rest of the class to use a trailing underscore for private members (e.g., sequence_id_, last_timestamp_).

    int64_t sequence_id_ = 0;
    int64_t last_timestamp_ = 0;

}
std::unique_lock lock(mutex_);
total_size_ += bucket->data_size + bucket->meta_size;
for (auto key:bucket->keys) {
Contributor

medium

This loop copies each key string from bucket->keys. To improve performance by avoiding unnecessary allocations and copies, you should iterate using a const reference.

    for (const auto& key:bucket->keys) {

auto file = std::move(open_file_result.value());
for (const auto& key : keys) {
size_t offset;
auto slice = batched_slices[key];
Contributor

medium

Using the [] operator on batched_slices is risky because it will insert a default-constructed Slice if the key doesn't exist, which could hide bugs. It's safer to use find() and check for the key's existence, or use at() which would throw an exception if the key is missing, making debugging easier.

        auto slice_it = batched_slices.find(key);
        if (slice_it == batched_slices.end()) {
            LOG(ERROR) << "Slice for key " << key << " not found in batched_slices";
            return tl::make_unexpected(ErrorCode::INVALID_KEY);
        }
        auto& slice = slice_it->second;

Comment on lines 682 to 685
tl::expected<std::string, ErrorCode> SequentialStorageBackend::GetBucketDataPath(int64_t bucket_id) {
std::string sep = storage_path_.empty() || storage_path_.back() == '/' ? "" : "/";
return storage_path_ + sep + std::to_string(bucket_id);
}
Contributor

medium

This function is declared to return tl::expected<std::string, ErrorCode>, but it never actually returns an error. The implementation can be simplified by changing the return type to std::string. This also applies to GetBucketMetadataPath.

std::string SequentialStorageBackend::GetBucketDataPath(int64_t bucket_id) {
    std::string sep = storage_path_.empty() || storage_path_.back() == '/' ? "" : "/";
    return storage_path_ + sep + std::to_string(bucket_id);
}

@ykwd
Collaborator

ykwd commented Oct 27, 2025

Thanks for the great work!

While reviewing the PR, I found several parts of the implementation that I couldn’t fully understand in terms of the underlying motivation and design reasoning.

Would it be possible to provide a high-level design description, along with an explanation of the bucket data structure and how it fits into the overall storage architecture? That would really help make the review more effective and ensure we’re aligned on the intended design direction.

@ykwd ykwd self-requested a review October 28, 2025 02:08
@xiaguan
Collaborator

xiaguan commented Oct 28, 2025

Maybe you could resolve Gemini's Critical and High Priority comments. For the lock, you could try GUARDED_BY (you can search for it—we already use it), and use clang++ to compile; you'll get proper thread-safety analysis.

@zhuxinjie-nz
Contributor Author

Maybe you could resolve Gemini's Critical and High Priority comments. For the lock, you could try GUARDED_BY (you can search for it—we already use it), and use clang++ to compile; you'll get proper thread-safety analysis.

Thank you, I'll fix this issue.

@ykwd ykwd left a comment

Thanks for this work! I have left some comments.

* - total_size_: cumulative data size of all stored objects
*/
mutable std::shared_mutex mutex_;
std::string storage_path_;
Collaborator

We can use the "GUARDED_BY" to ensure these objects are accessed correctly.

FileMode mode) const;
};

class SequentialStorageBackend {
Collaborator

"Sequential" in this context is a little bit misleading. Shall we consider using another name? e.g., BucketStorageBackend

auto meta_path = GetBucketMetadataPath(id).value();
auto open_file_result = OpenFile(meta_path, FileMode::Read);
if (!open_file_result) {
LOG(INFO) << "Failed to open file for reading: " << meta_path;
Collaborator

This should be a LOG(ERROR) message.

return tl::make_unexpected(ErrorCode::FILE_OPEN_FAIL);
}
auto file = std::move(open_file_result.value());
LOG(INFO) << "Writing bucket with path: " << bucket_data_path;
Collaborator

Logging every successful operation would produce far too much output and disturb users. Consider using VLOG(1) for debugging. The same applies to other places.

auto write_bucket_result = WriteBucket(bucket_id, bucket, iovs);
if (!write_bucket_result) {
LOG(ERROR) << "Failed to write bucket with id: " << bucket_id;
buckets_.erase(bucket_id);
Collaborator

Shouldn’t we add the bucket information to buckets_ after all files have been written?

  1. Because once we add it to buckets_, read requests can already access it even though the data hasn’t been fully written yet.
  2. If the write operation fails, then this bucket shouldn’t be added to buckets_ at all.

Contributor Author

  • Since the bucket ID is used as the file name in the storage backend, reserving a slot in buckets_ upfront prevents multiple threads from concurrently writing to the same bucket file under concurrent access.
  • A read operation can only proceed after the key has been inserted into object_bucket_map_.

Contributor Author

Holding the lock from before writing until write completion would result in excessively long lock duration.

LOG(INFO) << "Writing bucket with path: " << bucket_data_path;

auto write_result = file->vector_write(iovs.data(), iovs.size(), 0);
if (!write_result) {
Collaborator

If write fails, shall we try to delete this file? It seems like this file will never be used. This also applies to the metadata file.

Contributor Author

If write fails, shall we try to delete this file? It seems like this file will never be used. This also applies to the metadata file.

If FILE_WRITE_FAIL occurs, the PosixFile destructor will delete the file.

Collaborator

I see. That makes sense.

@stmatengss
Collaborator

Use clang-format to pass format checking. @zhuxinjie-nz

@zhuxinjie-nz
Contributor Author

Use clang-format to pass format checking. @zhuxinjie-nz

A file was missing, now fixed

@xiaguan
Collaborator

xiaguan commented Oct 30, 2025

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a BucketStorageBackend for more efficient key-value data management by grouping data into buckets. The implementation is well-structured, including a new SharedMutex wrapper, a BucketIdGenerator, and comprehensive unit tests. My review focuses on improving robustness and maintainability by addressing a critical issue with serialization, improving error handling consistency, removing unsafe coding patterns, and fixing a test that invokes undefined behavior.

YLT_REFL(BucketObjectMetadata, offset, key_size, data_size);

struct BucketMetadata {
mutable std::shared_mutex statistics_mutex;
Contributor

critical

The BucketMetadata struct contains a std::shared_mutex statistics_mutex. std::shared_mutex is not serializable, but this struct is marked for serialization with YLT_REFL. This will likely cause compilation errors or undefined behavior at runtime when struct_pb::to_pb is called on a BucketMetadata instance. Since statistics_mutex does not appear to be used, it should be removed to avoid this issue. If a mutex is needed for BucketMetadata instances, it should be managed separately from the serializable struct.

Contributor Author

fix @xiaguan

Comment on lines 387 to 389
} else {
LOG(ERROR) << "Key " << key << " does not exist";
}
Contributor

high

In BatchQuery, when a key is not found, an error is logged, but the function continues and returns a success status. This is inconsistent with BatchLoad, which returns ErrorCode::INVALID_KEY in a similar scenario. This behavior can be misleading for callers. Consider returning an error, such as ErrorCode::OBJECT_NOT_FOUND, if any key is not found.

        } else {
            LOG(ERROR) << "Key " << key << " does not exist";
            return tl::make_unexpected(ErrorCode::OBJECT_NOT_FOUND);
        }

Contributor Author

fix @xiaguan

Comment on lines 466 to 468
GetBucketDataPath(bucket_id).value();
auto bucket_meta_path =
GetBucketMetadataPath(bucket_id).value();
Contributor

high

The code frequently uses .value() on tl::expected objects (e.g., GetBucketDataPath(bucket_id).value()) without first checking if they contain a value. This is a fragile pattern that can lead to crashes if the function is ever modified to return an error. It's safer to check for an error before accessing the value. This pattern appears in multiple places within this file (Init, WriteBucket, StoreBucketMetadata, etc.).

Contributor Author

fix @xiaguan

Comment on lines 757 to 758
auto bucket_data_path = GetBucketDataPath(bucket_id);
return *bucket_data_path + ".meta";
Contributor

high

Unsafe dereferencing of tl::expected. The GetBucketDataPath function returns a tl::expected, which could contain an error. Dereferencing it with * without checking for an error first is unsafe and will lead to a crash if an error is present. You should check for the error before dereferencing.

    auto bucket_data_path = GetBucketDataPath(bucket_id);
    if (!bucket_data_path) {
        return tl::make_unexpected(bucket_data_path.error());
    }
    return bucket_data_path.value() + ".meta";

Contributor Author

fix @xiaguan

BucketStorageBackend::BuildBucket(
const std::unordered_map<std::string, std::vector<Slice>>& batch_object,
std::vector<iovec>& iovs) {
SharedMutexLocker lock(&mutex_);
Contributor

medium

The BuildBucket method acquires an exclusive lock on mutex_, but it doesn't access any shared member variables of the BucketStorageBackend class. This lock is unnecessary and could create a performance bottleneck by serializing bucket-building operations. This method could be a static helper function or a free function that doesn't take a lock.

Contributor Author

fix @xiaguan

Comment on lines 710 to 721
auto bucket = buckets_.find(bucket_id);
if (bucket == buckets_.end()) {
LOG(ERROR) << "Failed to open file for reading: "
<< storage_filepath;
return tl::make_unexpected(ErrorCode::FILE_OPEN_FAIL);
}
auto object_metadata = buckets_[bucket_id]->object_metadata.find(key);
if (object_metadata == buckets_[bucket_id]->object_metadata.end()) {
LOG(ERROR) << "Failed to open file for reading: "
<< storage_filepath;
return tl::make_unexpected(ErrorCode::FILE_OPEN_FAIL);
}
Contributor

medium

There are a couple of issues in this section:

  1. Misleading Errors: When a bucket or object metadata is not found, the code logs a generic "Failed to open file for reading" message and returns ErrorCode::FILE_OPEN_FAIL. The message and error code should be more specific, like "Bucket not found" (BUCKET_NOT_FOUND).
  2. Redundant Lookup: buckets_.find(bucket_id) is called, and then buckets_[bucket_id] is used, which performs a second lookup. It's more efficient to use the iterator from the find call.

Here is a suggested change that addresses both points.

        auto bucket_it = buckets_.find(bucket_id);
        if (bucket_it == buckets_.end()) {
            LOG(ERROR) << "Bucket not found with id: " << bucket_id;
            return tl::make_unexpected(ErrorCode::BUCKET_NOT_FOUND);
        }
        auto object_metadata = bucket_it->second->object_metadata.find(key);
        if (object_metadata == bucket_it->second->object_metadata.end()) {
            LOG(ERROR) << "Object metadata not found for key '" << key
                       << "' in bucket " << bucket_id;
            return tl::make_unexpected(ErrorCode::OBJECT_NOT_FOUND);
        }

Comment on lines +127 to +130
EXPECT_TRUE(
mtx.try_lock_shared()); // Multiple shared locks should be allowed
// Note: This test does not attempt recursive locking (UB), just checks
// concurrent shared access.
Contributor

medium

This test attempts to acquire a shared lock on mtx while it's already held by a SharedMutexLocker in the same thread. Calling try_lock_shared on a std::shared_mutex that is already owned by the calling thread (in any mode) results in undefined behavior. This test should be removed or rewritten to test concurrency from a different thread, similar to how SharedAccessIsConcurrent is implemented.


struct StorageObjectMetadata {
int64_t bucket_id;
size_t offset;
Collaborator

Use uint64_t or int64_t since size_t varies in size across different machines.

I think int64_t is better?

Contributor Author

ok

@xiaguan
Collaborator

xiaguan commented Oct 30, 2025

So here are my several points of confusion:

  1. We need a doc (or something similar) that shows the on-disk storage format for each "Bucket" (as you call it in your design) when we persist key/value pairs to disk. For example: the first 8 bytes are the magic number, bytes 8–16 are the checksum, etc. We use 4 bytes for key length and 8 bytes for value length, and so on.

  2. We need a benchmark for this—similar to LevelDB's db_bench —showing this SSD KV engine's write throughput and read throughput under Mooncake Store's use case. For instance, with value = 1 MB, how fast are batch_put and batch_load (at batch sizes of 8, 16, 32)? Can it hit the SSD’s max throughput?

I think we could also refer to Cachelib's Navy Block Cache documentation for more best practices on efficiently storing large key/value pairs on SSD: https://cachelib.org/docs/Cache_Library_Architecture_Guide/navy_overview#block-cache

@ykwd
Collaborator

ykwd commented Oct 30, 2025

The code looks good to me. We can merge this PR for now and leave the further optimization to subsequent work.

@ykwd ykwd merged commit 2cf86bf into kvcache-ai:main Oct 31, 2025
11 checks passed