fix: optimize AgentIds and ServiceIds storage by using chunked BytesN…#262
fix: optimize AgentIds and ServiceIds storage by using chunked BytesN…#262AbuJulaybeeb wants to merge 2 commits into
Conversation
…<32> and u64 pages
📝 WalkthroughWalkthroughBoth contracts move from monolithic ID vectors to page-indexed storage. Agent IDs now use ChangesAgents contract: paged BytesN storage
Services contract: paged service ID and category storage
Sequence Diagram(s)sequenceDiagram
participant register_service
participant Storage
participant list_services_page
register_service->>Storage: append service_id to ServiceIdsPage(page_index)
register_service->>Storage: increment CategoryCounter(category)
register_service->>Storage: append service_id to ServiceIdsByCategoryPage(category, cat_page_index)
list_services_page->>Storage: read ServiceIdsPage or ServiceIdsByCategoryPage pages
list_services_page->>Storage: load ServiceEntry records
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@contract/agents/src/lib.rs`:
- Around line 231-245: The agent ID pagination logic in the agent registration
path only renews TTL for the newly written page, which leaves earlier pages like
page 0 to expire and break list traversal. Update the registration flow around
the AgentIdsBytesPage storage in the agent ID append logic so existing pages are
also kept alive, either by renewing all relevant ID pages when appending or by
adding a TTL-refresh step in the read/maintenance path used by list_agents
before it relies on AgentCount.
- Around line 583-592: In the pagination logic inside the listing function in
lib.rs, `page_size == 0` can make `end` equal `start` and cause `end - 1` to
underflow when computing the page range. Add an early guard before the
`start_page`/`end_page` calculation to return an empty result or otherwise
reject zero page sizes, and keep the existing `start`/`end` pagination flow
intact for valid values.
- Around line 129-154: The address conversion helpers in address_to_bytes and
bytes_to_address are losing the original Address variant, so contract-style
addresses can’t round-trip correctly. Update the serialization path to either
explicitly reject non-account addresses in address_to_bytes or preserve enough
variant information to reconstruct the same Address in bytes_to_address. Make
sure the AgentEntry lookup flow in list_agents* still uses a stable key that
matches what was originally stored.
In `@contract/src/lib.rs`:
- Around line 216-253: The pagination logic in the service listing function is
slicing by raw stored ID positions while filtering out inactive entries
afterward, so pages can under-fill and skip active services; fix this in the
listing path by making pagination operate on active-only results or by
maintaining an active-only counter/index instead of using
Counter/CategoryCounter in the positional window. Update the retrieval loop in
the service list function so it continues collecting active ServiceEntry values
until page_size is reached, and ensure deactivate_service/index handling stays
consistent with the chosen pagination model.
- Line 229: The `start_page` calculation in `contract/src/lib.rs` has a
redundant type cast because `start` is already `u32`; update the paging logic in
this block to remove the no-op `as u32` while keeping the `end_page` cast intact
since `end` is `usize`. This should be a simple cleanup around the `start_page`
and `end_page` computations in the same function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 9901874c-8239-44dd-a58c-1331e65aa609
📒 Files selected for processing (29)
contract/agents/src/lib.rscontract/agents/test_snapshots/test/test_admin_deactivate_agent_requires_admin.1.jsoncontract/agents/test_snapshots/test/test_admin_deactivate_agent_succeeds.1.jsoncontract/agents/test_snapshots/test/test_consecutive_days_reset_logic.1.jsoncontract/agents/test_snapshots/test/test_daily_reset_boundary_exact_threshold.1.jsoncontract/agents/test_snapshots/test/test_daily_reset_boundary_one_before_and_after.1.jsoncontract/agents/test_snapshots/test/test_deactivate_agent_still_works_for_owner.1.jsoncontract/agents/test_snapshots/test/test_flag_agent_fails_for_non_admin.1.jsoncontract/agents/test_snapshots/test/test_flag_agent_owner_cannot_flag.1.jsoncontract/agents/test_snapshots/test/test_flag_agent_requires_auth.1.jsoncontract/agents/test_snapshots/test/test_flag_agent_succeeds_with_admin.1.jsoncontract/agents/test_snapshots/test/test_spending_allowance_respects_reset.1.jsoncontract/agents/test_snapshots/test/test_update_policy_handles_reset_correctly.1.jsoncontract/src/lib.rscontract/test_snapshots/test/test_cooldown_is_per_agent_and_per_service.1.jsoncontract/test_snapshots/test/test_get_reputation_bounds.1.jsoncontract/test_snapshots/test/test_list_services_category_filter.1.jsoncontract/test_snapshots/test/test_list_services_category_filter_with_reputation.1.jsoncontract/test_snapshots/test/test_list_services_inactive_filtered.1.jsoncontract/test_snapshots/test/test_list_services_nonexistent_category.1.jsoncontract/test_snapshots/test/test_list_services_reputation_sorting.1.jsoncontract/test_snapshots/test/test_list_services_single_entry.1.jsoncontract/test_snapshots/test/test_list_services_tied_reputation.1.jsoncontract/test_snapshots/test/test_update_reputation_clamped_at_max.1.jsoncontract/test_snapshots/test/test_update_reputation_clamped_at_min.1.jsoncontract/test_snapshots/test/test_update_reputation_cooldown_blocks_rapid_repeat_votes.1.jsoncontract/test_snapshots/test/test_update_reputation_positive_and_negative.1.jsoncontract/test_snapshots/test/test_update_reputation_requires_caller_auth.1.jsoncontract/test_snapshots/test/test_update_reputation_requires_registered_agent.1.json
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
contract/src/lib.rs (1)
214-268: 🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy liftReputation ordering is only applied within the current page slice.
A higher-reputation service on a later page can still appear after lower-reputation entries on earlier pages, so this endpoint does not return a global ranking. If the intent is “top services first,” pagination needs to happen after sorting the full result set.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@contract/src/lib.rs` around lines 214 - 268, The service listing logic in the pagination loop only sorts the already-collected page slice, so the returned order is not a true global reputation ranking. Update the retrieval flow in the function that builds the `services` vector so it gathers all eligible active `ServiceEntry` items first, sorts them by `reputation` descending, and then applies `skip_active`/`page_size` slicing afterward. Keep the change localized around the `max_page` scan, `active_count` handling, and the final insertion-sort block.
🧹 Nitpick comments (1)
contract/src/lib.rs (1)
227-250: 🚀 Performance & Scalability | 🔵 TrivialPagination cost grows with total service count and page depth.
To serve page N, the loop iterates pages
0..=max_pageand performs aService(id)storage read for every ID up to the skip point, discarding skipped/inactive entries. Since deactivated IDs are intentionally retained in the chunked indices (Line 356), the number of storage reads perlist_services_pagecall scales with the total (active + inactive) services scanned, not withpage_size. Under Soroban's per-invocation resource/read budget, deep pages or a large inactive ratio could push this call toward the read limit and start failing.Consider maintaining an active-only index (or compacting indices on deactivation) so listing reads scale with returned results rather than total history.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@contract/src/lib.rs` around lines 227 - 250, The pagination in the service listing logic is doing a full scan over stored page IDs and individual Service reads, so `list_services_page` scales with total history instead of the requested page size. Update the listing path in `contract/src/lib.rs` (the loop over pages and `ServiceEntry` lookups) to use an active-only index or compact the stored indices when entries are deactivated, so skipped/inactive services do not require per-item storage reads during pagination.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@contract/agents/src/lib.rs`:
- Around line 240-244: The TTL renewal in the agent ID paging flow is unbounded
because the registration path iterates through every historical page in the
AgentIdsBytesPage loop. Update this logic in the registration/ID-page
maintenance code to bound work per call, for example by extending only the
current page and a rotating maintenance cursor or by refreshing pages that were
actually accessed through paginated reads. Keep the fix localized to the TTL
extension path so registration cost stays constant as page_index grows.
In `@contract/src/lib.rs`:
- Line 239: The `services.len() as u32 == page_size` check in the pagination
logic is using a redundant cast; update the affected comparison(s) in the
`services` collection handling to compare `services.len()` directly against
`page_size`. Keep the change localized to the length check in the relevant
loop/condition and remove the unnecessary `as u32` conversion wherever it
appears.
---
Outside diff comments:
In `@contract/src/lib.rs`:
- Around line 214-268: The service listing logic in the pagination loop only
sorts the already-collected page slice, so the returned order is not a true
global reputation ranking. Update the retrieval flow in the function that builds
the `services` vector so it gathers all eligible active `ServiceEntry` items
first, sorts them by `reputation` descending, and then applies
`skip_active`/`page_size` slicing afterward. Keep the change localized around
the `max_page` scan, `active_count` handling, and the final insertion-sort
block.
---
Nitpick comments:
In `@contract/src/lib.rs`:
- Around line 227-250: The pagination in the service listing logic is doing a
full scan over stored page IDs and individual Service reads, so
`list_services_page` scales with total history instead of the requested page
size. Update the listing path in `contract/src/lib.rs` (the loop over pages and
`ServiceEntry` lookups) to use an active-only index or compact the stored
indices when entries are deactivated, so skipped/inactive services do not
require per-item storage reads during pagination.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ebbf5489-8873-49e0-9abd-49542c072dcc
📒 Files selected for processing (2)
contract/agents/src/lib.rscontract/src/lib.rs
| // Extend TTL for all pages to keep old ID pages alive | ||
| for p in 0..=page_index { | ||
| let p_key = DataKey::AgentIdsBytesPage(p); | ||
| env.storage().persistent().extend_ttl(&p_key, MAX_TTL, MAX_TTL); | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift
Bound the ID-page TTL renewal work.
This fixes the old-page expiry case, but every registration now refreshes all historical pages. As agent count grows, registration cost grows unbounded and can hit Soroban budget limits. Prefer a bounded strategy, such as refreshing the current page plus a rotating maintenance cursor, or refreshing only pages touched by paginated reads.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@contract/agents/src/lib.rs` around lines 240 - 244, The TTL renewal in the
agent ID paging flow is unbounded because the registration path iterates through
every historical page in the AgentIdsBytesPage loop. Update this logic in the
registration/ID-page maintenance code to bound work per call, for example by
extending only the current page and a rotating maintenance cursor or by
refreshing pages that were actually accessed through paginated reads. Keep the
fix localized to the TTL extension path so registration cost stays constant as
page_index grows.
| if entry.active { | ||
| if active_count >= skip_active { | ||
| services.push_back(entry); | ||
| if services.len() as u32 == page_size { |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🌐 Web query:
soroban-sdk 22 Vec len return type u32
💡 Result:
In the soroban-sdk (including version 22), the len method of the Vec struct returns a u32 [1][2]. This method returns the number of items currently stored in the vector [1][2].
Citations:
- 1: https://docs.rs/soroban-sdk/latest/soroban_sdk/struct.Vec.html
- 2: https://github.com/stellar/rs-soroban-sdk/blob/main/soroban-sdk/src/vec.rs
🏁 Script executed:
sed -n '220,260p' contract/src/lib.rs | cat -nRepository: Stellar-Ecosystem/lodestar
Length of output: 2093
Remove the redundant as u32 casts. Vec::len() already returns u32, so both services.len() as u32 == page_size checks can be simplified to services.len() == page_size.
🧰 Tools
🪛 Clippy (1.96.0)
[warning] 239-239: casting to the same type is unnecessary (u32 -> u32)
(warning)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@contract/src/lib.rs` at line 239, The `services.len() as u32 == page_size`
check in the pagination logic is using a redundant cast; update the affected
comparison(s) in the `services` collection handling to compare `services.len()`
directly against `page_size`. Keep the change localized to the length check in
the relevant loop/condition and remove the unnecessary `as u32` conversion
wherever it appears.
Source: Linters/SAST tools
What was done
Replaced Vec
with Vec<BytesN<32>> for the DataKey::AgentIdsBytesPage(u32) storage keys in the lodestar-agents contract, storing only the raw 32-byte Ed25519 public keys.Handled proper back-and-forth serialization in Rust between BytesN<32> and SDK Address via XDR slicing in the bytes_to_address and address_to_bytes helper methods.
Fixed tests and legacy keys in lodestar-registry ensuring DataKey::ServiceIdsPage correctly stores chunked 8-byte u64 entries.
Validated the backend API (listAgents) which inherently pulls the pre-reconstructed AgentEntry struct values from the contract endpoints without requiring modifications to JS decoding layers.
Why it was done
Each Address value inside an ScVal::Vec incurs an XDR overhead of 48-56 bytes per entry, severely capping the max agents threshold before hitting the 64KB persistent storage limit (issue #85). Transitioning to raw 32-byte keys saves approximately 40% of storage capacity. Chunking into 500-entry pages distributes the load securely, comfortably keeping payload sizes near 16KB per page.
How it was verified
cargo test in both contract/ and contract/agents/ runs fully green (all compilation errors resolved, legacy setup_service test helpers fixed).
Backend integration points confirm that no parsing errors persist.
closes #247
Summary by CodeRabbit
New Features
Bug Fixes
Tests