feat: add multi-tenancy isolation (#244) by supreme2580 · Pull Request #255 · Stellar-Ecosystem/lodestar

supreme2580 · 2026-06-28T18:41:45Z

Add per-IP rate limiting at global (1000/15min), read (60/min), and write (10/min) levels with express-rate-limit, returning 429 RATE_LIMIT_EXCEEDED
Add per-registrant quotas tracking registrations_by_ip with MAX_SERVICES_PER_IP=10 cap, returning 403 QUOTA_EXCEEDED
Add queue depth protection to simulateAndSubmit rejecting writes at MAX_QUEUE_DEPTH=20 with 503 OVERLOADED
Add per-agent consecutive entry cap (5 max) to activity feed to prevent monopolization
Add GET /api/admin/quota-status endpoint (admin-auth required) for operator visibility
Fix pre-existing bugs: duplicate function declarations, missing server.listen, duplicate mock variable

closes: #244

Summary by CodeRabbit

New Features
- Added stronger traffic controls with separate global and read rate limits, plus updated limit error codes.
- Introduced an admin-only quota status endpoint with queue and registration snapshot details.
- Added per-IP registration quota enforcement and activity-feed per-agent entry capping.
Bug Fixes
- Block submissions when the submit queue is overloaded, and return consistent HTTP 503 for queue-overload errors.
- Improved rate-limit middleware consistency (including clearer “rate limit exceeded” responses).

- Add per-IP rate limiting at global (1000/15min), read (60/min), and write (10/min) levels with express-rate-limit, returning 429 RATE_LIMIT_EXCEEDED - Add per-registrant quotas tracking registrations_by_ip with MAX_SERVICES_PER_IP=10 cap, returning 403 QUOTA_EXCEEDED - Add queue depth protection to simulateAndSubmit rejecting writes at MAX_QUEUE_DEPTH=20 with 503 OVERLOADED - Add per-agent consecutive entry cap (5 max) to activity feed to prevent monopolization - Add GET /api/admin/quota-status endpoint (admin-auth required) for operator visibility - Fix pre-existing bugs: duplicate function declarations, missing server.listen, duplicate mock variable

coderabbitai · 2026-06-28T18:42:01Z

📝 Walkthrough

Walkthrough

Adds per-IP rate limiting, per-IP registration quotas, submit-queue depth rejection, a per-agent activity feed cap, and an admin quota-status endpoint. Config and tests are extended to cover the new limits and response codes.

Changes

Multi-tenancy Isolation Controls

Layer / File(s)	Summary
Config fields and defaults `backend/src/config.js`, `backend/src/config.test.js`	`config.rateLimit` gains `global` and `read` per-IP settings, and top-level `maxServicesPerIp`, `maxQueueDepth`, and `activityFeedMaxPerAgent` are added with env-backed positive-integer defaults and tests.
Limiter factory and middleware wiring `backend/src/middleware/rateLimiter.js`, `backend/src/middleware/rateLimiter.test.js`, `backend/src/index.js`	A shared limiter factory backs `writeRateLimiter`, `globalRateLimiter`, and `readRateLimiter`; the error code becomes `RATE_LIMIT_EXCEEDED`, and the app mounts the new global/read limiters.
Queue depth guard and quota status `backend/src/lib/contract.js`, `backend/src/lib/ContractError.js`, `backend/src/index.js`, `backend/src/routes/registry.js`	`simulateAndSubmit` checks submit-queue depth and throws `QUEUE_OVERLOADED`; that code maps to 503, and the admin quota-status endpoint returns registration and queue metrics.
Per-IP registration quota `backend/src/routes/registry.js`, `backend/src/routes/registry.test.js`	Registry prepare/submit flow now tracks registrations per IP, returns 403 `QUOTA_EXCEEDED` when the cap is reached, and resets/snapshots the in-memory counters in tests.
Per-agent activity feed cap `backend/src/lib/activityFeed.js`, `backend/test/activity.test.js`	`recordActivity` trims the leading same-agent block before inserting when the configured per-agent cap is reached, with tests for seeded and persisted feeds.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Stellar-Ecosystem/lodestar#115: Shares the same express-rate-limit middleware and config surface expanded in this PR.
Stellar-Ecosystem/lodestar#123: Touches the same submit-queue path in backend/src/lib/contract.js that now enforces queue depth.
Stellar-Ecosystem/lodestar#96: Modifies backend/src/lib/ContractError.js, which now maps QUEUE_OVERLOADED to HTTP 503.

Suggested reviewers

ritik4ever

🐇 Hop-hop, the limits bloom,
One IP can't fill every room.
The queue stays light, the feed stays fair,
And quota stats are now right there.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR covers rate limits, quotas, queue depth, and feed capping, but the admin quota-status endpoint does not expose rate-limit state as requested.	Extend /api/admin/quota-status to include current per-IP rate-limit state, or document why that acceptance criterion is intentionally omitted.
Docstring Coverage	⚠️ Warning	Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is concise and clearly reflects the main change: multi-tenancy isolation controls.
Out of Scope Changes check	✅ Passed	All changes align with the isolation, quota, queue, feed, and test updates described in the linked issue.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (1)

backend/src/middleware/rateLimiter.test.js (1)
72-97: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert the 429 path for the new limiters too.

These cases only prove the first five requests are allowed. They do not verify that globalRateLimiter() and readRateLimiter() actually enforce their configured thresholds, so a broken implementation that never blocks would still pass. Use a small mocked max in each test and assert the next request returns 429 with RATE_LIMIT_EXCEEDED.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/middleware/rateLimiter.test.js` around lines 72 - 97, The current
`globalRateLimiter()` and `readRateLimiter()` tests only verify allowed requests
and do not cover the blocking path, so update these specs in
`rateLimiter.test.js` to use a small mocked limit and assert the request after
the threshold returns `429` with `RATE_LIMIT_EXCEEDED`. Keep the existing
`describe('globalRateLimiter')` and `describe('readRateLimiter')` structure, but
change the assertions so the limiter behavior is validated both before and after
the configured max is reached.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/config.js`:
- Around line 147-148: The registration quota is still being read directly from
the environment in the registry route, which bypasses the validated config
value. Update the quota check in registry handling (the logic that currently
uses Number(process.env.MAX_SERVICES_PER_IP) || MAX_SERVICES_PER_IP) to use
config.maxServicesPerIp as the single source of truth, and make sure the quota
enforcement path in registry.js relies on that parsed field everywhere.

In `@backend/src/index.js`:
- Around line 59-60: The current app.use("/api", readRateLimiter()) placement in
index.js applies the read limiter to all /api traffic, including POST/write
endpoints and admin routes. Update the middleware registration so
readRateLimiter is only attached to actual read routes, and keep write/admin
routes on their intended limiter paths; use the existing route setup in app.use
to scope it narrowly without changing behavior for prepare-register,
submit-signed-tx, reputation, or the admin status endpoint.
- Around line 71-75: The admin response is using the pending-queue metric for
both queue fields, so `queueDepth` never reflects the actual depth. Update the
response in the endpoint handler that builds the `res.json` payload to use
`getSubmitQueueDepth()` for `queueDepth` while leaving `queuePending` on
`getSubmitQueuePending()`, and verify the existing import in `index.js` is wired
to the correct symbol.

In `@backend/src/lib/activityFeed.js`:
- Around line 42-43: The activity feed cap is being re-read from process.env
instead of using the already validated config, which can bypass normalization
for invalid or zero values. Update the activity feed logic in activityFeed to
consume the normalized ACTIVITY_FEED_MAX_PER_AGENT value from the config module
(the same source used by config.js) rather than calling
Number(process.env.ACTIVITY_FEED_MAX_PER_AGENT || 5). Keep the fallback and
validation centralized so the max-per-agent behavior stays consistent across the
app.
- Around line 58-60: The trimming logic in activityFeed’s run-handling block
only removes a single entry at maxPerAgent - 1, so oversized persisted runs can
stay above the cap after prepending. Update the consecutive-run handling in
activityFeed.js to detect when a run for the same agent exceeds maxPerAgent and
remove the entire existing run segment before adding the new entry, rather than
deleting just one item. Use the same feed-splice logic around the
consecutive/maxPerAgent check to locate the fix.

In `@backend/src/lib/contract.js`:
- Around line 240-247: The new QUEUE_OVERLOADED ContractError from
simulateAndSubmit is still being collapsed to 400 in the write handlers, so
update backend/src/routes/registry.js to recognize that error code and return
503 instead. Adjust the shared ContractError-to-status mapping used by the
prepare-register, submit-signed-tx, reputation, and deactivation write paths so
QUEUE_OVERLOADED is routed to Service Unavailable while other unknown
ContractError codes keep their existing fallback behavior.

In `@backend/src/routes/registry.js`:
- Around line 392-393: The quota is being incremented too early in the registry
flow, which counts prepared transactions instead of completed on-chain
registrations. Move the increment out of the unsigned registration path in the
registry route and apply it only after a successful signed
submission/confirmation in the code paths around prepare/submit handling, using
the existing incrementIpRegistration helper and the registry transaction
handlers as the reference points.

In `@backend/test/activity.test.js`:
- Around line 75-98: Add a regression test for the preexisting oversized
top-block case in activity.test.js: the current coverage in the
recordActivity/getActivityFeed flow only verifies growth from a clean feed, not
persisted data that already starts with 6+ consecutive entries from one agent.
Seed the feed with an already-oversized GAURU block, then perform one additional
recordActivity write and assert getActivityFeed collapses that run back to the
configured cap using ACTIVITY_MAX_ENTRIES and the same consecutive-entry checks.

---

Nitpick comments:
In `@backend/src/middleware/rateLimiter.test.js`:
- Around line 72-97: The current `globalRateLimiter()` and `readRateLimiter()`
tests only verify allowed requests and do not cover the blocking path, so update
these specs in `rateLimiter.test.js` to use a small mocked limit and assert the
request after the threshold returns `429` with `RATE_LIMIT_EXCEEDED`. Keep the
existing `describe('globalRateLimiter')` and `describe('readRateLimiter')`
structure, but change the assertions so the limiter behavior is validated both
before and after the configured max is reached.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 10b5ea79-a170-4f85-a354-9bc6521dbff4

📥 Commits

Reviewing files that changed from the base of the PR and between 67cb9f2 and 6131030.

📒 Files selected for processing (10)

backend/src/config.js
backend/src/config.test.js
backend/src/index.js
backend/src/lib/activityFeed.js
backend/src/lib/contract.js
backend/src/middleware/rateLimiter.js
backend/src/middleware/rateLimiter.test.js
backend/src/routes/registry.js
backend/src/routes/registry.test.js
backend/test/activity.test.js

coderabbitai · 2026-06-28T18:48:46Z

+  const agent = entry.agent || 'unknown';
+  const maxPerAgent = Number(process.env.ACTIVITY_FEED_MAX_PER_AGENT || 5);


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use the validated config value here instead of re-reading process.env.

backend/src/config.js:150-155 already normalizes ACTIVITY_FEED_MAX_PER_AGENT, but this code reparses the raw env var. If the env is 0 or non-numeric, the cap silently becomes wrong here (splice(-1, 1) / disabled cap) while the rest of the app still assumes the validated config value.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/src/lib/activityFeed.js` around lines 42 - 43, The activity feed cap is being re-read from process.env instead of using the already validated config, which can bypass normalization for invalid or zero values. Update the activity feed logic in activityFeed to consume the normalized ACTIVITY_FEED_MAX_PER_AGENT value from the config module (the same source used by config.js) rather than calling Number(process.env.ACTIVITY_FEED_MAX_PER_AGENT || 5). Keep the fallback and validation centralized so the max-per-agent behavior stays consistent across the app.

- Use config.maxServicesPerIp (validated) instead of raw process.env in registry.js - Scope read rate limiter to GET requests on /api/services and /api/agents only - Fix admin endpoint: queueDepth uses getSubmitQueueDepth, queuePending uses getSubmitQueuePending - Use validated module-level MAX_PER_AGENT constant in activityFeed.js - Trim entire oversized consecutive block, not just one entry - Map QUEUE_OVERLOADED to 503 in handleContractError and all route handlers - Move registration increment from prepare-register to submit-signed-tx - Add regression test for already-oversized feed top block

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/routes/registry.js`:
- Around line 443-455: The submission flow in registry handling consumes the
prepared record too early, which breaks retries after retryable ContractError
failures. In the route that calls validatePreparedRegistrySubmission() and then
submitSignedRegistryTx(), defer deleting/consuming the prepared submission until
after the tx has been successfully accepted, or explicitly restore it when
submitSignedRegistryTx() throws QUEUE_OVERLOADED or TRANSACTION_TIMEOUT. Keep
the retryable error handling in the same catch path and ensure the submitToken
remains usable for a subsequent attempt.
- Around line 345-348: The quota check in the prepare/submit registration flow
is bypassable because `prepare-register` and `submit-signed-tx` use the live
`req.ip` separately instead of binding the originating IP to the prepared
request. Update the registry flow so `prepare-register` stores the source IP
with the prepared submission, and have `submit-signed-tx` use that stored IP
(not the current request IP) when charging the quota. Also make the reservation
or re-check atomic in the `getIpRegistrationCount`/quota enforcement path so
concurrent prepares cannot exceed `config.maxServicesPerIp`, and ensure the
`logger.warn` quota-exceeded path still references the bound IP.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3b67a30c-7fd2-4845-91d0-4ddde4c9ebac

📥 Commits

Reviewing files that changed from the base of the PR and between 6131030 and c4da15c.

📒 Files selected for processing (6)

backend/src/index.js
backend/src/lib/ContractError.js
backend/src/lib/activityFeed.js
backend/src/routes/registry.js
backend/src/routes/registry.test.js
backend/test/activity.test.js

🚧 Files skipped from review as they are similar to previous changes (3)

backend/src/lib/activityFeed.js
backend/src/routes/registry.test.js
backend/src/index.js

coderabbitai · 2026-06-28T18:59:35Z

+    const ip = req.ip;
+    const currentCount = getIpRegistrationCount(ip);
+    if (currentCount >= config.maxServicesPerIp) {
+      logger.warn({ ip, count: currentCount, max: config.maxServicesPerIp }, 'Registration quota exceeded');


🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Quota enforcement is still bypassable across the prepare/submit flow.

The cap is checked in prepare-register, but it is only charged later in submit-signed-tx using the current req.ip. That means two prepared registrations can both pass while the count is 9 and later increment to 11, and a submit from a different IP charges the wrong tenant. Bind the originating IP to the prepared submission and reserve or re-check the slot atomically when the signed transaction is accepted.

Also applies to: 443-449

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/src/routes/registry.js` around lines 345 - 348, The quota check in the prepare/submit registration flow is bypassable because `prepare-register` and `submit-signed-tx` use the live `req.ip` separately instead of binding the originating IP to the prepared request. Update the registry flow so `prepare-register` stores the source IP with the prepared submission, and have `submit-signed-tx` use that stored IP (not the current request IP) when charging the quota. Also make the reservation or re-check atomic in the `getIpRegistrationCount`/quota enforcement path so concurrent prepares cannot exceed `config.maxServicesPerIp`, and ensure the `logger.warn` quota-exceeded path still references the bound IP.

coderabbitai · 2026-06-28T18:59:35Z

+    const prepared = validatePreparedRegistrySubmission(submitToken, signedXdr);

    const result = await submitSignedRegistryTx(signedXdr);
+    // Only count registrations (not deactivations) toward per-IP quota.
+    if (prepared.action === "register") {
+      incrementIpRegistration(req.ip);
+    }
    logger.info({ hash: result.hash, id: result.id }, "Submitted wallet-signed registry tx");
    res.json({ success: true, ...result });
  } catch (err) {
    if (err instanceof ContractError) {
-      const status = err.code === "TRANSACTION_TIMEOUT" ? 504 : 400;
+      const status = err.code === "QUEUE_OVERLOADED" ? 503 : err.code === "TRANSACTION_TIMEOUT" ? 504 : 400;
      return res.status(status).json({ error: err.message, code: err.code });


🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

Don't consume the submit token before retryable failures.

validatePreparedRegistrySubmission() deletes the prepared submission before submitSignedRegistryTx() runs. If submission then fails with QUEUE_OVERLOADED or TRANSACTION_TIMEOUT, the client gets a retryable 503/504 but the same submitToken is already gone, so the retry path is broken. Consume the prepared submission only after the tx is accepted for submission, or restore it on retryable failures.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/src/routes/registry.js` around lines 443 - 455, The submission flow in registry handling consumes the prepared record too early, which breaks retries after retryable ContractError failures. In the route that calls validatePreparedRegistrySubmission() and then submitSignedRegistryTx(), defer deleting/consuming the prepared submission until after the tx has been successfully accepted, or explicitly restore it when submitSignedRegistryTx() throws QUEUE_OVERLOADED or TRANSACTION_TIMEOUT. Keep the retryable error handling in the same catch path and ensure the submitToken remains usable for a subsequent attempt.

ritik4ever · 2026-06-29T05:05:59Z

@supreme2580 please fix ci

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

		const agent = entry.agent \|\| 'unknown';
		const maxPerAgent = Number(process.env.ACTIVITY_FEED_MAX_PER_AGENT \|\| 5);

Uh oh!

Conversation

supreme2580 commented Jun 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

ritik4ever commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

supreme2580 commented Jun 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading