Skip to content

Conversation

@XucSh
Copy link
Owner

@XucSh XucSh commented Dec 17, 2025

Description

Type of Change

  • Types
    • Bug fix
    • New feature
      • Transfer Engine
      • Mooncake Store
      • Mooncake EP
      • Integration
      • P2P Store
      • Python Wheel
    • Breaking change
    • CI/CD
    • Documentation update
    • Other

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

Summary by CodeRabbit

Release Notes

New Features

  • Introduced tiered memory caching system supporting multiple storage tiers (DRAM, NVMe) with configurable priorities and tags.
  • Added automatic data copying between memory tiers with intelligent fallback mechanisms when direct paths are unavailable.
  • Implemented multi-tier replication with comprehensive metadata management for efficient data allocation, storage, and retrieval across cache tiers.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

Walkthrough

Introduces a tiered storage abstraction layer for mooncake consisting of cache tier interfaces, a registry for memory-type copy operations, a data copier with direct and fallback paths, and a backend managing multi-tier allocation, replication, and metadata across diverse storage media.

Changes

Cohort / File(s) Summary
Tier Abstraction & Metadata
mooncake-store/include/tiered_cache/cache_tier.h, mooncake-store/include/tiered_cache/tiered_backend.h
Defines MemoryType enum (DRAM, NVME, UNKNOWN), DataSource struct, abstract CacheTier interface with lifecycle (Init, Allocate, Free) and introspection methods; introduces tiered storage data structures (TieredLocation, TierView, AllocationEntry, MetadataEntry) and TieredBackend orchestrator for multi-tier allocation, replication, metadata indexing, and tier management.
Copy Function Registry
mooncake-store/include/tiered_cache/copier_registry.h, mooncake-store/src/tiered_cache/copier_registry.cpp
Implements singleton CopierRegistry for registering memory-type and direct-path copy operations; provides CopierRegistrar helper for static-initialization-time registration of copiers.
Data Copier with Fallback
mooncake-store/include/tiered_cache/data_copier.h, mooncake-store/src/tiered_cache/data_copier.cpp
Implements DataCopierBuilder (populates copy matrix from global registry, validates DRAM path availability) and DataCopier (executes direct or two-step DRAM-based fallback copy between memory types).
Backend Implementation
mooncake-store/src/tiered_cache/tiered_backend.cpp
Implements TieredBackend lifecycle, allocation with tier selection, write/copy/commit operations with per-key metadata and replica tracking, concurrent Get/Delete with multi-layer locking, and tier introspection.
Build Configuration
mooncake-store/src/CMakeLists.txt
Adds three new source files (copier_registry.cpp, data_copier.cpp, tiered_backend.cpp) to MOONCAKE_STORE_SOURCES.

Sequence Diagrams

sequenceDiagram
    participant Client
    participant TieredBackend
    participant CacheTier
    participant DataCopier
    participant MemoryPool

    Client->>TieredBackend: Allocate(size, preferred_tier)
    activate TieredBackend
    TieredBackend->>TieredBackend: GetSortedTiers()
    TieredBackend->>CacheTier: Allocate(size)
    activate CacheTier
    CacheTier->>MemoryPool: allocate
    MemoryPool-->>CacheTier: ptr
    CacheTier-->>TieredBackend: AllocationHandle(tier_id, DataSource)
    deactivate CacheTier
    TieredBackend-->>Client: AllocationHandle
    deactivate TieredBackend

    Client->>TieredBackend: Write(source_data, handle)
    activate TieredBackend
    TieredBackend->>DataCopier: Copy(source, dest_from_handle)
    activate DataCopier
    DataCopier->>DataCopier: FindCopier(src_type, dest_type)
    alt Direct Path Exists
        DataCopier->>DataCopier: Execute direct copy
    else No Direct Path
        note over DataCopier: Both non-DRAM?
        DataCopier->>TieredBackend: Allocate temp DRAM buffer
        DataCopier->>DataCopier: Copy src → DRAM
        DataCopier->>DataCopier: Copy DRAM → dest
    end
    DataCopier-->>TieredBackend: success/failure
    deactivate DataCopier
    TieredBackend-->>Client: bool
    deactivate TieredBackend

    Client->>TieredBackend: Commit(key, handle)
    activate TieredBackend
    TieredBackend->>TieredBackend: Update metadata_index<br/>(per-key replicas)
    TieredBackend-->>Client: bool
    deactivate TieredBackend
Loading
sequenceDiagram
    participant Client
    participant TieredBackend
    participant MetadataIndex
    participant CacheTier1
    participant CacheTier2

    Client->>TieredBackend: Get(key, optional_tier_id)
    activate TieredBackend
    TieredBackend->>MetadataIndex: Lookup replicas for key
    activate MetadataIndex
    MetadataIndex-->>TieredBackend: replica list sorted by priority
    deactivate MetadataIndex
    
    alt tier_id specified
        TieredBackend->>CacheTier1: Retrieve from specific tier
    else best available
        TieredBackend->>CacheTier1: Retrieve from highest-priority tier
    end
    activate CacheTier1
    CacheTier1-->>TieredBackend: AllocationHandle
    deactivate CacheTier1
    TieredBackend-->>Client: AllocationHandle
    deactivate TieredBackend

    Client->>TieredBackend: Delete(key, optional_tier_id)
    activate TieredBackend
    TieredBackend->>MetadataIndex: Remove replica from key
    activate MetadataIndex
    MetadataIndex->>CacheTier2: Free allocation
    activate CacheTier2
    CacheTier2-->>MetadataIndex: success
    deactivate CacheTier2
    MetadataIndex-->>TieredBackend: bool
    deactivate MetadataIndex
    TieredBackend-->>Client: bool
    deactivate TieredBackend
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • DataCopier fallback logic (data_copier.cpp): Validate DRAM allocation during fallback, temporary buffer lifecycle, error handling paths, and copy matrix building logic with validation.
  • Concurrency and locking (tiered_backend.cpp): Review per-key shared_mutex usage, lock upgrade/downgrade patterns, metadata entry lifecycle with replicas vector, and safety of concurrent Get/Delete/Commit operations.
  • Allocation and tier selection (tiered_backend.cpp): Verify GetSortedTiers priority ordering, preferred tier fallback logic, and tier assignment to AllocationEntry.
  • Resource management: Validate AllocationEntry destructor triggering FreeInternal, shared_ptr lifecycle, and cleanup of empty metadata entries.
  • Cross-tier interactions: Ensure DataCopier integration with TieredBackend, CacheTier abstraction compliance, and TieredLocation consistency across operations.

Poem

🐰 ✨ A cache with tiers so grand and wide,

DRAM and NVMe side by side,

Copiers leap with fallback grace,

Data dancing place to place,

Metadata locked, replicas thrive—

Mooncake's tiered store comes alive! 🌙

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Title check ⚠️ Warning The title mentions 'tired backend' but the changeset introduces a 'tiered' backend with comprehensive tiered cache abstractions, making the title contain a likely typo that obscures the main intent. Correct the title to '[Store] feat: introduce tiered backend' to accurately reflect the main change of adding tiered cache infrastructure.
Description check ⚠️ Warning The PR description is entirely empty except for template placeholders; no actual description, change rationale, testing details, or checklist items are provided. Fill in the description with a clear summary of changes, select appropriate type of change (New feature - Mooncake Store), describe testing performed, and check relevant checklist items.
Docstring Coverage ⚠️ Warning Docstring coverage is 17.95% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pr1212

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (9)
mooncake-store/src/CMakeLists.txt (1)

15-20: Pre-existing issue: Duplicate source file entries.

etcd_helper.cpp and ha_helper.cpp appear twice in MOONCAKE_STORE_SOURCES (lines 15-16 and 19-20). While CMake typically handles this gracefully, it's worth cleaning up to avoid confusion.

mooncake-store/include/tiered_cache/cache_tier.h (1)

6-6: Remove unused include.

<optional> is included but not used in this header.

-#include <optional>
mooncake-store/include/tiered_cache/copier_registry.h (1)

12-13: Remove redundant forward declaration.

DataCopierBuilder is forward-declared here, but data_copier.h is already included on line 4, which defines DataCopierBuilder. The forward declaration is unnecessary.

 namespace mooncake {
 
-// Forward declaration from data_copier.h to avoid circular dependency
-class DataCopierBuilder;
-
 // Holds the registration information for a memory type.
mooncake-store/src/tiered_cache/data_copier.cpp (1)

3-3: Remove unused include.

<fstream> is included but not used in this file.

-#include <fstream>
mooncake-store/include/tiered_cache/data_copier.h (2)

1-9: Minor include cleanup opportunity.

<stdexcept> and <vector> don't appear to be directly used in this header. Consider removing unused includes to reduce compilation overhead and keep dependencies minimal.


60-88: Consider adding move semantics.

While copy operations are deleted (appropriate for this class), move operations are implicitly deleted due to the deleted copy operations. If DataCopier needs to be moved (e.g., stored in containers or returned from functions), consider explicitly defaulting the move constructor and move assignment operator.

     ~DataCopier() = default;
     DataCopier(const DataCopier&) = delete;
     DataCopier& operator=(const DataCopier&) = delete;
+    DataCopier(DataCopier&&) = default;
+    DataCopier& operator=(DataCopier&&) = default;
mooncake-store/src/tiered_cache/tiered_backend.cpp (2)

172-180: Consider sorted insertion instead of sort-after-insert.

When a new replica is added, the entire vector is sorted. For small replica counts this is fine, but consider using std::lower_bound for O(log n) insertion point lookup followed by insert, which maintains sorted order without full re-sort.


241-248: Variable it shadows outer scope variable.

The inner loop variable it on line 242 shadows the outer it from line 235. While this compiles correctly, it's confusing and error-prone. Consider renaming one of them.

                 std::unique_lock<std::shared_mutex> entry_write_lock(
                     entry->mutex);
                 auto tier_it = entry->replicas.end();
-                for (auto it = entry->replicas.begin();
-                     it != entry->replicas.end(); ++it) {
-                    if (it->first == *tier_id) {
-                        tier_it = it;
+                for (auto replica_it = entry->replicas.begin();
+                     replica_it != entry->replicas.end(); ++replica_it) {
+                    if (replica_it->first == *tier_id) {
+                        tier_it = replica_it;
                         break;
                     }
                 }
mooncake-store/include/tiered_cache/tiered_backend.h (1)

24-27: Remove redundant struct keyword.

The struct keyword before DataSource is unnecessary in C++ and is a C-style declaration.

 struct TieredLocation {
     uint64_t tier_id;
-    struct DataSource data;
+    DataSource data;
 };
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31c45e4 and 930d91f.

📒 Files selected for processing (8)
  • mooncake-store/include/tiered_cache/cache_tier.h (1 hunks)
  • mooncake-store/include/tiered_cache/copier_registry.h (1 hunks)
  • mooncake-store/include/tiered_cache/data_copier.h (1 hunks)
  • mooncake-store/include/tiered_cache/tiered_backend.h (1 hunks)
  • mooncake-store/src/CMakeLists.txt (1 hunks)
  • mooncake-store/src/tiered_cache/copier_registry.cpp (1 hunks)
  • mooncake-store/src/tiered_cache/data_copier.cpp (1 hunks)
  • mooncake-store/src/tiered_cache/tiered_backend.cpp (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
mooncake-store/src/tiered_cache/data_copier.cpp (2)
mooncake-store/include/tiered_cache/data_copier.h (2)
  • DataCopierBuilder (25-53)
  • DataCopier (60-88)
mooncake-store/src/tiered_cache/copier_registry.cpp (2)
  • GetInstance (7-10)
  • GetInstance (7-7)
mooncake-store/src/tiered_cache/tiered_backend.cpp (1)
mooncake-store/include/tiered_cache/tiered_backend.h (1)
  • TieredBackend (80-193)
mooncake-store/include/tiered_cache/copier_registry.h (3)
mooncake-store/include/tiered_cache/cache_tier.h (1)
  • mooncake (10-88)
mooncake-store/include/tiered_cache/data_copier.h (2)
  • mooncake (11-90)
  • DataCopierBuilder (25-53)
mooncake-store/src/tiered_cache/copier_registry.cpp (11)
  • GetInstance (7-10)
  • GetInstance (7-7)
  • RegisterMemoryType (12-16)
  • RegisterMemoryType (12-13)
  • RegisterDirectPath (18-21)
  • RegisterDirectPath (18-19)
  • GetMemoryTypeRegistrations (23-26)
  • GetMemoryTypeRegistrations (24-24)
  • GetDirectPathRegistrations (28-31)
  • GetDirectPathRegistrations (29-29)
  • CopierRegistrar (33-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: build (3.12)
  • GitHub Check: build-musa
  • GitHub Check: build-flags (3.10)
  • GitHub Check: build (3.10)
  • GitHub Check: build-flags (3.12)
  • GitHub Check: Build Docker Image
🔇 Additional comments (19)
mooncake-store/src/CMakeLists.txt (1)

29-31: LGTM!

The new tiered cache source files are correctly added to the build.

mooncake-store/include/tiered_cache/cache_tier.h (1)

50-86: LGTM!

The CacheTier abstract interface is well-designed with clear separation of concerns for allocation, deallocation, and tier metadata accessors. The use of pure virtual methods ensures concrete implementations must provide all required functionality.

mooncake-store/include/tiered_cache/copier_registry.h (2)

35-70: Consider thread-safety for concurrent registration.

The singleton instance initialization is thread-safe (Meyers singleton), but RegisterMemoryType and RegisterDirectPath mutate internal vectors without synchronization. If static CopierRegistrar objects are initialized concurrently across threads (e.g., in dynamically loaded libraries), this could cause data races.

If all registrations occur during single-threaded static initialization (the typical use case), this is fine. Otherwise, consider adding a mutex or documenting the single-threaded initialization requirement.


80-84: LGTM!

The CopierRegistrar helper provides a clean pattern for self-registration of memory types during static initialization.

mooncake-store/src/tiered_cache/data_copier.cpp (3)

9-20: LGTM!

The constructor correctly populates the copy matrix from the global registry, establishing both to/from DRAM paths and direct paths.


29-51: LGTM!

The validation in Build() ensures all registered non-DRAM types have their required DRAM fallback paths before constructing the immutable DataCopier. Using std::unique_ptr with private constructor via new is an appropriate pattern here.


63-114: LGTM!

The Copy implementation correctly prioritizes direct paths and falls back to a two-step DRAM-based copy when needed. Error handling is thorough with appropriate logging at each failure point. The temporary buffer uses std::nothrow and checks for allocation failure.

mooncake-store/src/tiered_cache/copier_registry.cpp (4)

7-10: LGTM!

The Meyers singleton pattern provides thread-safe initialization in C++11 and later.


12-31: LGTM!

Registration and accessor methods are implemented correctly. Using std::move for the copy functions avoids unnecessary copies.


33-39: LGTM!

The CopierRegistrar constructor correctly delegates to the singleton's RegisterMemoryType, enabling static self-registration.


2-2: The #include "tiered_cache/data_copier.h" include is necessary. CopyFunction is defined in data_copier.h and is used directly in this file in multiple function signatures: RegisterMemoryType() (line 12), RegisterDirectPath() (line 18), and the CopierRegistrar constructor (line 33).

Likely an incorrect or invalid review comment.

mooncake-store/include/tiered_cache/data_copier.h (1)

25-53: LGTM!

The builder pattern is well-implemented with proper documentation. The AddDirectPath method returns a reference for chaining, and Build() correctly returns a unique_ptr for ownership transfer.

mooncake-store/src/tiered_cache/tiered_backend.cpp (5)

63-72: Potential std::out_of_range if tiers_ and tier_info_ are out of sync.

GetSortedTiers iterates over tiers_ keys but accesses tier_info_.at(). If a tier ID exists in tiers_ but not in tier_info_, this will throw. Consider iterating over tier_info_ instead (since that's what's actually populated), or add defensive checks.

 std::vector<uint64_t> TieredBackend::GetSortedTiers() const {
     std::vector<uint64_t> ids;
-    for (const auto& [id, _] : tiers_) ids.push_back(id);
+    for (const auto& [id, _] : tier_info_) ids.push_back(id);
 
     // Sort by priority descending (higher priority first)
     std::sort(ids.begin(), ids.end(), [this](uint64_t a, uint64_t b) {

74-102: LGTM with assumption.

The allocation logic correctly tries the preferred tier first, then falls back to priority-sorted tiers. This assumes tiers_ is immutable after Init() completes, which is a reasonable design for a tiered cache system.


123-129: LGTM!

The Write function correctly validates the handle and tier before delegating to DataCopier. The null check prevents crashes on invalid handles.


186-216: LGTM!

The Get function correctly uses read locks for concurrency. Returning a shared_ptr copy while holding the lock ensures the reference count is incremented atomically before the lock is released.


315-337: LGTM!

The CopyData function correctly leverages RAII for cleanup on failure paths. If Write or Commit fails, the uncommitted dest_handle will be destroyed, triggering FreeInternal to release the allocated space.

mooncake-store/include/tiered_cache/tiered_backend.h (2)

80-193: LGTM!

The TieredBackend class is well-designed with clear separation between client-centric operations, composite operations, and introspection APIs. The documentation is thorough and the use of shared_mutex for concurrent access is appropriate.


165-169: Design note: vector of pairs is appropriate here.

Using std::vector<std::pair<uint64_t, AllocationHandle>> instead of a map allows maintaining priority-sorted order, which aligns with the Get() fallback behavior returning the highest-priority replica.

Comment on lines +12 to +14
struct DataSource;
enum class MemoryType;
class TieredBackend;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Remove redundant forward declarations.

DataSource and MemoryType are forward-declared but then fully defined in the same header (lines 20 and 37). These forward declarations are unnecessary. Additionally, forward-declaring an enum class before its definition in the same translation unit is non-standard.

 namespace mooncake {

-struct DataSource;
-enum class MemoryType;
 class TieredBackend;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
struct DataSource;
enum class MemoryType;
class TieredBackend;
class TieredBackend;
🤖 Prompt for AI Agents
In mooncake-store/include/tiered_cache/cache_tier.h around lines 12-14, remove
the redundant forward declarations of DataSource and MemoryType (and the forward
declaration of TieredBackend if it duplicates a later definition) because
DataSource and MemoryType are fully defined later in the same header (around
lines 20 and 37) and forward-declaring an enum class in the same translation
unit is non-standard; delete those forward-declaration lines and ensure any
references still compile (no other headers rely on them here) so the file uses
the later full definitions.

Comment on lines +51 to +59
struct AllocationEntry {
TieredBackend* backend;
TieredLocation loc;

AllocationEntry(TieredBackend* b, TieredLocation l) : backend(b), loc(l) {}

// Destructor: Automatically releases the resource if valid
~AllocationEntry();
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

AllocationEntry should be non-copyable to prevent double-free.

AllocationEntry manages a resource that's freed in its destructor. Without deleted copy operations, an implicit copy could lead to double-free when both copies are destroyed.

 struct AllocationEntry {
     TieredBackend* backend;
     TieredLocation loc;

     AllocationEntry(TieredBackend* b, TieredLocation l) : backend(b), loc(l) {}
+    AllocationEntry(const AllocationEntry&) = delete;
+    AllocationEntry& operator=(const AllocationEntry&) = delete;

     // Destructor: Automatically releases the resource if valid
     ~AllocationEntry();
 };
🤖 Prompt for AI Agents
In mooncake-store/include/tiered_cache/tiered_backend.h around lines 51 to 59,
AllocationEntry currently allows implicit copies which can cause double-free;
make the class non-copyable by explicitly deleting the copy constructor and copy
assignment operator (AllocationEntry(const AllocationEntry&) = delete;
AllocationEntry& operator=(const AllocationEntry&) = delete) and if ownership
transfer is required, add move constructor and move assignment (or default them)
to allow safe moves. Ensure the rest of the code uses moves or references
instead of copying.

Comment on lines +26 to +29
} catch (const std::logic_error& e) {
LOG(FATAL) << "Failed to build DataCopier: " << e.what();
return false;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

LOG(FATAL) makes the return false unreachable.

LOG(FATAL) in glog terminates the program, so line 28 is dead code. Either downgrade to LOG(ERROR) if you want to return gracefully, or remove the return statement.

     } catch (const std::logic_error& e) {
-        LOG(FATAL) << "Failed to build DataCopier: " << e.what();
+        LOG(ERROR) << "Failed to build DataCopier: " << e.what();
         return false;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} catch (const std::logic_error& e) {
LOG(FATAL) << "Failed to build DataCopier: " << e.what();
return false;
}
} catch (const std::logic_error& e) {
LOG(ERROR) << "Failed to build DataCopier: " << e.what();
return false;
}
🤖 Prompt for AI Agents
In mooncake-store/src/tiered_cache/tiered_backend.cpp around lines 26 to 29, the
catch uses LOG(FATAL) which terminates the process and makes the subsequent
"return false;" unreachable; change the log level to LOG(ERROR) (or LOG(WARNING)
as appropriate) so the function can return false gracefully, and keep the return
statement; alternatively, if the intention is to abort the process, remove the
return statement and keep LOG(FATAL) — choose one consistent behavior and update
the code accordingly.

Comment on lines +58 to +59
LOG(INFO) << "TieredBackend initialized successfully with " << tiers_.size()
<< " tiers.";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Log message is misleading; tiers_ is never populated.

The log reports tiers_.size() but the tiers_ map is not populated anywhere (only tier_info_ is). This will always log "0 tiers" and GetSortedTiers() will return an empty vector, causing allocations to fail. Consider logging tier_info_.size() until the tier factory is implemented, or add a TODO comment clarifying this.

-    LOG(INFO) << "TieredBackend initialized successfully with " << tiers_.size()
+    LOG(INFO) << "TieredBackend initialized successfully with " << tier_info_.size()
               << " tiers.";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
LOG(INFO) << "TieredBackend initialized successfully with " << tiers_.size()
<< " tiers.";
LOG(INFO) << "TieredBackend initialized successfully with " << tier_info_.size()
<< " tiers.";
🤖 Prompt for AI Agents
mooncake-store/src/tiered_cache/tiered_backend.cpp lines 58-59: the INFO log
prints tiers_.size() but tiers_ is never populated (only tier_info_ is), so
change the log to report tier_info_.size() instead (or add a clear TODO noting
the tier factory is unimplemented and that tiers_ will be populated later); also
update/GetSortedTiers usage to rely on tier_info_ for now to avoid returning an
empty vector that breaks allocations — ensure the comment/TODO references the
factory implementation and expected future change.

return (it != tiers_.end()) ? it->second.get() : nullptr;
}

const DataCopier& TieredBackend::GetDataCopier() const { return *data_copier_; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential null dereference if Init() was not called or failed.

GetDataCopier() dereferences data_copier_ without a null check. If Init() was not called or threw during DataCopierBuilder::Build(), this will cause undefined behavior.

Consider adding an assertion or returning a pointer:

-const DataCopier& TieredBackend::GetDataCopier() const { return *data_copier_; }
+const DataCopier& TieredBackend::GetDataCopier() const {
+    CHECK(data_copier_) << "TieredBackend not initialized";
+    return *data_copier_;
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const DataCopier& TieredBackend::GetDataCopier() const { return *data_copier_; }
const DataCopier& TieredBackend::GetDataCopier() const {
CHECK(data_copier_) << "TieredBackend not initialized";
return *data_copier_;
}
🤖 Prompt for AI Agents
In mooncake-store/src/tiered_cache/tiered_backend.cpp around line 356,
GetDataCopier() dereferences data_copier_ without verifying it was initialized;
add a defensive check to avoid undefined behavior by either (preferred) keeping
the current signature but asserting/validating and throwing a clear exception
when data_copier_ is null (e.g., DCHECK/ASSERT or throw std::runtime_error with
a descriptive message) or (alternative) change the API to return a pointer
(DataCopier*) or optional so callers handle absence; implement one of these
fixes and update callers/tests if you change the return type.

@YiXR
Copy link

YiXR commented Dec 17, 2025

@coderabbitai generate docstrings

@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

✅ Actions performed

Initiated docstring generation; will generate only if new commits exist.

@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #5

coderabbitai bot added a commit that referenced this pull request Dec 17, 2025
Docstrings generation was requested by @YiXR.

* #4 (comment)

The following files were modified:

* `mooncake-store/include/tiered_cache/cache_tier.h`
* `mooncake-store/include/tiered_cache/copier_registry.h`
* `mooncake-store/include/tiered_cache/data_copier.h`
* `mooncake-store/include/tiered_cache/tiered_backend.h`
* `mooncake-store/src/tiered_cache/copier_registry.cpp`
* `mooncake-store/src/tiered_cache/data_copier.cpp`
* `mooncake-store/src/tiered_cache/tiered_backend.cpp`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants