feat(ingestion): async background processing with progressive UI #7

Merged: jeremylongshore merged 3 commits into main from fix/ingestion-button-reliability on Feb 14, 2026
jeremylongshore (Collaborator) commented Feb 14, 2026

Summary

  • POST /trigger/ingestion now returns 202 in <1s instead of blocking 2-5 minutes inline. Background asyncio.create_task() runs the full pipeline and updates Firestore progressively through phases (loading_sources → fetching_feeds → storing_articles → upserting_authors → done).
  • GET /trigger/ingestion/{run_id} endpoint for polling run status from Firestore, with evaluate_run_success() scoring terminal runs.
  • Idempotency guard prevents concurrent runs (returns 409), auto-cleans stale runs older than 10 minutes.
  • Dashboard IngestionButton rewritten to poll every 3s with progressive phase labels, 5-minute stuck detection, and CustomEvent("ingestion-complete") for SystemActivityCard auto-refresh.
  • logging.py upgraded from Phase 4 stubs to real Firestore ingestion_runs writes with set(merge=True).
  • Cloud Run deploy gets --no-cpu-throttling and --cpu-boost flags to support background task execution.
  • Composite Firestore index added for [status ASC, startedAt DESC] active run query.
  • 27 new tests (20 unit + 7 integration) covering models, success evaluation edge cases, 202/409/404 responses.
  • Debug runbook added at 000-docs/ingestion-run-button-debug.md with failure taxonomy, diagnostic queries, and SLO targets.
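
The accept-then-run flow and the 409 guard described in the bullets above can be sketched with plain asyncio. This is a minimal illustration, not the PR's code: the in-memory `RUNS` dict stands in for the Firestore `ingestion_runs` collection, `trigger_ingestion` and `run_pipeline` are hypothetical names, and the `running`/`completed` status strings are assumptions.

```python
import asyncio
import uuid

# In-memory stand-in for the Firestore ingestion_runs collection (assumption).
RUNS: dict[str, dict] = {}
# Keep strong references so background tasks are not garbage-collected mid-run.
_TASKS: set[asyncio.Task] = set()

PHASES = ["loading_sources", "fetching_feeds", "storing_articles", "upserting_authors"]


async def run_pipeline(run_id: str) -> None:
    """Hypothetical pipeline: walk the phases, recording each one as it starts."""
    for phase in PHASES:
        RUNS[run_id].update(status="running", phase=phase)
        await asyncio.sleep(0)  # placeholder for the real work of this phase
    RUNS[run_id].update(status="completed", phase="done")


async def trigger_ingestion() -> tuple[int, dict]:
    """Return (http_status, body) immediately; the pipeline runs in the background."""
    active = [rid for rid, run in RUNS.items() if run["status"] in ("accepted", "running")]
    if active:
        return 409, {"active_run_id": active[0]}
    run_id = uuid.uuid4().hex
    RUNS[run_id] = {"status": "accepted", "phase": "accepted"}
    task = asyncio.create_task(run_pipeline(run_id))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return 202, {"run_id": run_id, "poll_url": f"/trigger/ingestion/{run_id}"}


async def demo() -> list:
    first, body = await trigger_ingestion()
    second, _ = await trigger_ingestion()  # first run is still active, so rejected
    await asyncio.gather(*_TASKS)
    return [first, second, RUNS[body["run_id"]]["phase"]]
```

Running `asyncio.run(demo())` exercises both the accepted and the conflicting request. In a real Cloud Run service the background task only keeps making progress after the response is sent if CPU is not throttled, which is what the `--no-cpu-throttling` flag in the bullets addresses.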

Test plan

  • pytest tests/mcp/ tests/api/ -v --no-cov — 78 tests pass (27 new + 51 existing)
  • black --check on modified Python files — clean
  • tsc --noEmit on dashboard — compiles cleanly
  • Deploy MCP to Cloud Run with --no-cpu-throttling
  • curl -X POST .../trigger/ingestion returns 202 in <1s with run_id
  • curl .../trigger/ingestion/{run_id} returns progressive status
  • Firestore ingestion_runs collection has docs with status transitions
  • Dashboard: click Run Ingestion → see phase text → completion toast
  • SystemActivityCard auto-refreshes after completion
  • Click Run Ingestion while running → "already in progress" warning
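
The two curl checks above can also be scripted. A minimal polling sketch under stated assumptions: `fetch_status` is a hypothetical callable that performs `GET /trigger/ingestion/{run_id}` and returns the decoded JSON, and the `status` field and its terminal values are assumed because the PR text does not list them. The 3-second interval and 5-minute stuck threshold mirror the dashboard behavior described in the summary.

```python
import time
from typing import Callable

# Assumed terminal values; the PR text does not list the exact status strings.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_run(
    fetch_status: Callable[[str], dict],
    run_id: str,
    interval_s: float = 3.0,
    stuck_after_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the run is terminal, or raise TimeoutError after stuck_after_s."""
    deadline = clock() + stuck_after_s
    while True:
        run = fetch_status(run_id)
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if clock() >= deadline:
            raise TimeoutError(
                f"run {run_id} still in phase {run.get('phase')!r} after {stuck_after_s}s"
            )
        # The next request is only made after the previous one has returned.
        sleep(interval_s)
```

`sleep` and `clock` are injectable so the loop can be tested without waiting or touching the network.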

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Asynchronous, multi‑phase ingestion with background execution, status endpoint, idempotency (409 when a run is active), and improved success evaluation.
  • UI
    • Phased run button with spinner, phase labels, inline run stats, polling, timeout/stuck-run detection, toasts, and auto-refresh of activity cards on completion.
  • Documentation
    • Added a detailed ingestion run debugging guide with failure taxonomy and diagnostic commands.
  • Tests
    • New unit and integration tests for trigger flows and status polling.
  • Chores
    • Added a Firestore index and a minor deployment flag update; RSS feed list updated.

…back

Replace inline HTTP-blocking ingestion with async background pipeline:
- POST /trigger/ingestion returns 202 Accepted in <1s with run_id
- GET /trigger/ingestion/{run_id} for polling run status from Firestore
- Background task updates Firestore doc through phases progressively
- Idempotency guard prevents concurrent runs (409), auto-cleans stale >10min
- Dashboard polls every 3s with phase labels, stuck detection, auto-refresh
- logging.py upgraded from Phase 4 stubs to real Firestore writes
- Cloud Run deploy gets --no-cpu-throttling and --cpu-boost flags
- Composite Firestore index for active run query
- 27 new tests (unit + integration), debug runbook in 000-docs/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 14, 2026

📝 Walkthrough

Adds an asynchronous ingestion trigger: POST returns 202 with a run_id and poll_url, a background multi‑phase ingestion pipeline updates run state in Firestore, GET /ingestion/{run_id} reports phase/stats/success, frontend polls and reacts to completion; logging, indexes, tests, docs, and a new RSS feed were also added.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Deployment Configuration: .github/workflows/deploy-mcp.yml | Added Cloud Run flags --no-cpu-throttling and --cpu-boost to the deploy step. |
| Documentation / Runbook: 000-docs/ingestion-run-button-debug.md | New debugging runbook describing ingestion flow, failure taxonomy (F1–F7), debug commands, Cloud Logging/Firestore queries, SLOs, and architecture notes. |
| Frontend Components: dashboard/src/components/IngestionButton.tsx, dashboard/src/components/SystemActivityCard.tsx | IngestionButton: phased run state, run polling, timeout/stopPolling, toasts, emits ingestion-complete; SystemActivityCard: refactored fetchRuns (useCallback) and listens for ingestion-complete to auto-refresh. |
| Firestore Indexes: firestore.indexes.json | Added composite index for collectionGroup ingestion_runs on status (ASC) and startedAt (DESC). |
| Service Discovery / Root: perception_app/mcp_service/main.py | Service root now advertises /trigger/ingestion/{run_id} in addition to existing trigger entries. |
| Logging Router: perception_app/mcp_service/routers/logging.py | Replaced placeholder responses with Firestore writes; added IngestionStats fields (duplicates_skipped, brief_generated, errors), required started_at, optional completed_at, computes duration, sets updatedAt SERVER_TIMESTAMP, returns firestore path; includes Firestore client helper. |
| Trigger Router (Backend): perception_app/mcp_service/routers/trigger.py | Major async refactor: POST /ingestion → 202 TriggerAcceptedResponse with run_id/poll_url; idempotency check returning 409 if active; background run_ingestion_pipeline updates phases/stats in Firestore; GET /ingestion/{run_id} returns IngestionRunStatus; helpers added (evaluate_run_success, check_active_run, _get_db, load_sources, fetch_single_feed). |
| Integration Tests: tests/api/test_trigger_integration.py | New HTTP-level tests with mocked Firestore covering POST 202/409/validation and GET running/completed/404 scenarios, including success evaluation. |
| Unit Tests: tests/mcp/test_trigger_router.py | New unit tests for models and helpers: TriggerIngestionRequest defaults/validation, TriggerAcceptedResponse, IngestionRunStatus, evaluate_run_success scenarios, and load_sources behavior. |
| RSS Sources Data: perception_app/mcp_service/config/rss_sources.yaml | Added LangChain Blog feed (https://blog.langchain.com/rss/, category ai, active true); updated totals and AI/ML RESEARCH count (128→129, 3→4). |
| Misc / Manifest: manifest_file, package.json | Small manifest/package updates referenced in diff metadata (no public API changes). |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Frontend as IngestionButton
    participant Backend as MCP_Service
    participant Firestore as Firestore_DB
    participant Pipeline as Ingestion_Pipeline

    User->>Frontend: Click "Run Ingestion"
    Frontend->>Backend: POST /trigger/ingestion
    Backend->>Firestore: check_active_run()
    Firestore-->>Backend: active run? (no/yes)
    alt no active run
        Backend->>Firestore: create ingestion_runs/{run_id} (status=accepted)
        Backend->>Pipeline: start async run_ingestion_pipeline(run_id)
        Backend-->>Frontend: 202 Accepted + run_id + poll_url
        Frontend->>Frontend: start polling GET /ingestion/{run_id}
        Pipeline->>Firestore: update phases and stats (loading_sources → fetching_feeds → storing_articles → done)
        Frontend->>Backend: GET /ingestion/{run_id}
        Backend->>Firestore: read ingestion_runs/{run_id}
        Firestore-->>Backend: current run state
        Backend-->>Frontend: IngestionRunStatus (is_successful computed)
        Frontend->>Frontend: stop polling, handleComplete(), emit ingestion-complete
        Frontend->>Frontend: SystemActivityCard receives ingestion-complete -> refresh runs
    else active run
        Backend-->>Frontend: 409 Conflict (active_run_id)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I nudged the trigger, sent a ping so bright,

Phases hopped along from morning into night,
Firestore tended each stat and start,
Polling danced until the final part,
Rabbit cheers: "Async runs — what a delight!"

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the primary change: converting ingestion to asynchronous background processing with progressive UI updates, which is the main objective of this comprehensive PR. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering summary, test plan, and implementation details, though it deviates from the repository's template which requires specific sections (What's this PR do, Why, Testing checklist, Anything else). |
| Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into main |

@gemini-code-assist

Summary of Changes

Hello @jeremylongshore, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the ingestion pipeline by introducing asynchronous background processing, providing a more responsive user experience. It integrates real-time status updates via Firestore, improves system reliability with idempotency checks and stale run cleanup, and adds comprehensive documentation for debugging. These changes ensure that ingestion operations are non-blocking, transparent, and easier to monitor and troubleshoot.

Highlights

  • Asynchronous Ingestion Processing: The /trigger/ingestion endpoint now returns a 202 Accepted response within 1 second, offloading the full ingestion pipeline to a background asyncio.create_task().
  • Progressive UI Updates: The dashboard's IngestionButton has been rewritten to poll the new /trigger/ingestion/{run_id} endpoint every 3 seconds, displaying progressive phase labels (e.g., 'Loading sources...', 'Fetching feeds...') and real-time statistics.
  • Firestore Integration for Run Status: Ingestion run statuses, phases, and statistics are now progressively updated and stored in a new ingestion_runs Firestore collection, replacing the previous stubbed logging. A new composite index was added to support efficient querying.
  • Idempotency and Stale Run Cleanup: An idempotency guard prevents concurrent ingestion runs, returning a 409 Conflict if one is already active. It also automatically cleans up stale runs older than 10 minutes by marking them as failed.
  • Enhanced Observability and Debugging: A new debug runbook (000-docs/ingestion-run-button-debug.md) has been added, detailing the system map, failure taxonomy, diagnostic queries, and SLO targets for the ingestion process.
  • Cloud Run Optimizations: Cloud Run deployments now utilize --no-cpu-throttling and --cpu-boost flags to better support background task execution and reduce cold start times.
  • Comprehensive Testing: 27 new tests (20 unit + 7 integration) were added to cover models, success evaluation edge cases, and the new 202/409/404 API responses.
Changelog
  • 000-docs/ingestion-run-button-debug.md
    • Added a new debug runbook for the ingestion button, including system map, failure taxonomy, diagnostic queries, SLO targets, and architecture notes.
  • dashboard/src/components/IngestionButton.tsx
    • Refactored state management to track ingestion phases and statistics instead of a simple boolean.
    • Implemented polling logic to fetch ingestion run status from a new API endpoint every 3 seconds.
    • Added progressive UI updates displaying current ingestion phase labels and real-time article fetch/store counts.
    • Introduced a 5-minute stuck detection mechanism with a warning toast.
    • Modified the ingestion trigger to handle 202 Accepted for background processing and 409 Conflict for concurrent runs.
    • Dispatched a custom 'ingestion-complete' event upon run completion to trigger dashboard refreshes.
  • dashboard/src/components/SystemActivityCard.tsx
    • Refactored the run fetching logic into a dedicated function.
    • Added an event listener to automatically refresh the activity list upon receiving an 'ingestion-complete' custom event.
  • firestore.indexes.json
    • Added a new composite index for the ingestion_runs collection on status (ascending) and startedAt (descending) to optimize active run queries.
  • perception_app/mcp_service/main.py
    • Updated the list of trigger endpoints to include the new GET /trigger/ingestion/{run_id} route.
  • perception_app/mcp_service/routers/logging.py
    • Replaced stubbed logging with actual Firestore writes to the ingestion_runs collection.
    • Introduced a lazy-initialized Firestore client.
    • Updated LogIngestionRunRequest and LogIngestionRunResponse models to reflect real data structures.
    • Implemented logic to create or update Firestore documents with merge semantics, including calculated duration and detailed statistics.
  • perception_app/mcp_service/routers/trigger.py
    • Introduced a lazy-initialized Firestore client.
    • Defined new Pydantic models TriggerAcceptedResponse and IngestionRunStatus for API responses.
    • Added evaluate_run_success helper to determine run success based on articles stored, error rate, and duration.
    • Implemented check_active_run for idempotency, which detects and cleans up stale active runs.
    • Refactored the core ingestion logic into an asynchronous background task run_ingestion_pipeline.
    • Modified POST /ingestion to return 202 Accepted immediately and launch the pipeline as a background task.
    • Added GET /ingestion/{run_id} endpoint to retrieve the current status of an ingestion run from Firestore.
    • Updated the background pipeline to progressively update Firestore with phase changes, statistics, and errors.
  • tests/api/test_trigger_integration.py
    • Added new integration tests for the /trigger/ingestion and /trigger/ingestion/{run_id} endpoints.
    • Mocked Firestore interactions to verify HTTP responses for starting, polling, and handling concurrent/non-existent runs.
  • tests/mcp/test_trigger_router.py
    • Added new unit tests for Pydantic models (TriggerIngestionRequest, TriggerAcceptedResponse, IngestionRunStatus).
    • Tested the evaluate_run_success helper function across various success and failure conditions.
    • Included tests for the load_sources helper and storage idempotency (URL hashing).
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/deploy-mcp.yml
Activity
  • The core ingestion process was refactored to run asynchronously in the background.
  • New API endpoints were introduced for triggering ingestion and polling its status.
  • Firestore integration was implemented to persist and track ingestion run details.
  • The frontend UI was updated to provide real-time feedback on ingestion progress.
  • A new debug runbook was created to assist with troubleshooting ingestion issues.
  • Cloud Run deployment configurations were adjusted for optimal background task execution.
  • A total of 27 new tests were added to ensure the reliability and correctness of the new features.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a significant improvement, introducing asynchronous background processing for the ingestion pipeline. This greatly enhances user experience by providing immediate feedback and progressive updates. The backend changes, including the idempotency guard and stale run cleanup, add crucial reliability. My review highlights a few critical and high-severity issues: an architectural concern with duplicated database logic in the backend, a potential memory leak on the frontend, and a bug in the stale run cleanup logic. Addressing these will make the implementation more robust and maintainable.


const pollStatus = useCallback(
  (runId: string) => {
    pollRef.current = setInterval(async () => {

critical

The setInterval created for polling is not cleared if the component unmounts while a poll is in progress (e.g., if the user navigates away). This will cause a memory leak and may lead to React attempting to update state on an unmounted component. You should use a useEffect hook to call stopPolling during cleanup.

useEffect(() => {
  return () => {
    stopPolling();
  };
}, [stopPolling]);

This should be added inside the IngestionButton component.

Comment on lines +228 to 237
async def run_ingestion_pipeline(
    run_id: str,
    request: TriggerIngestionRequest,
    db: firestore.Client,
) -> None:
    """Run the full ingestion pipeline as a background task.

    Updates Firestore doc progressively through phases:
    loading_sources → fetching_feeds → storing_articles → upserting_authors → done
    """

critical

This run_ingestion_pipeline function implements its own logic for writing to Firestore, and the _get_db helper is duplicated from routers/logging.py. However, a dedicated /mcp/tools/log_ingestion_run tool exists for this purpose. To maintain a clear separation of concerns and adhere to the MCP architecture where orchestrators use tools, this trigger router should call the logging tool instead of directly accessing the database. This will centralize the database logic, improve modularity, and reduce code duplication.

Comment on lines +123 to +126
if hasattr(started_at, "timestamp"):
    age_seconds = time.time() - started_at.timestamp()
else:
    age_seconds = 0

high

The else: age_seconds = 0 branch is a potential bug. If started_at is not a valid datetime object for some reason (e.g., data corruption), this code will silently treat the run as brand new, preventing it from ever being cleaned up as stale. This could block all future ingestion runs. It would be more robust to log an error and treat the run as stale to avoid a deadlock.

Suggested change
- if hasattr(started_at, "timestamp"):
-     age_seconds = time.time() - started_at.timestamp()
- else:
-     age_seconds = 0
+ if hasattr(started_at, "timestamp"):
+     age_seconds = time.time() - started_at.timestamp()
+ else:
+     logger.error(f"Invalid startedAt type for run {doc.id}: {type(started_at)}")
+     age_seconds = 601  # Mark as stale to be safe
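
The fail-safe policy described in this comment can also be factored into a small helper so it is testable on its own. A sketch only: `run_age_seconds`, `is_stale`, and `STALE_AFTER_SECONDS` are hypothetical names, and the 600-second threshold is taken from the 10-minute cleanup window in the PR summary.

```python
import logging
import time
from datetime import datetime, timezone
from typing import Optional

logger = logging.getLogger(__name__)

STALE_AFTER_SECONDS = 600  # 10-minute window from the PR summary


def run_age_seconds(started_at: object, run_id: str, now: Optional[float] = None) -> float:
    """Age of a run in seconds; unreadable timestamps count as infinitely old."""
    now = time.time() if now is None else now
    if hasattr(started_at, "timestamp"):
        return now - started_at.timestamp()
    logger.error("Invalid startedAt type for run %s: %s", run_id, type(started_at))
    return float("inf")  # fail towards cleanup so a corrupt doc cannot block new runs


def is_stale(started_at: object, run_id: str, now: Optional[float] = None) -> bool:
    return run_age_seconds(started_at, run_id, now) > STALE_AFTER_SECONDS


now = time.time()
recent = datetime.fromtimestamp(now - 30, tz=timezone.utc)
print(is_stale(recent, "run-a", now))  # a 30-second-old run is left alone
print(is_stale(None, "run-b", now))    # a missing timestamp is treated as stale
```

Returning infinity rather than a magic 601 keeps the behavior correct if the threshold is later changed.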

- **Cause**: POST failed (network, CORS, Cloud Run cold start timeout)
- **Debug**:
```bash
curl -X POST https://perception-mcp-w53xszfqnq-uc.a.run.app/trigger/ingestion \

medium

The curl commands in this document use a hardcoded URL (https://perception-mcp-w53xszfqnq-uc.a.run.app). It's a better practice to use a placeholder like <SERVICE_URL> to make the documentation more generic and reusable across different environments (e.g., staging, local development).

Suggested change
- curl -X POST https://perception-mcp-w53xszfqnq-uc.a.run.app/trigger/ingestion \
+ curl -X POST <SERVICE_URL>/trigger/ingestion \


const pollStatus = useCallback(
  (runId: string) => {
    pollRef.current = setInterval(async () => {

medium

Using setInterval for polling can be problematic if a network request takes longer than the interval, as new requests can stack up. A more robust pattern is to use a recursive setTimeout, which ensures that the next poll is only scheduled after the current one has completed or failed.

Example:

const poll = async () => {
  // ... fetch logic ...
  
  // If not in a terminal state, schedule the next poll
  if (shouldContinuePolling) {
    setTimeout(poll, POLL_INTERVAL_MS);
  }
};

poll();

Comment on lines +60 to +89
  const fetchRuns = async () => {
    try {
      const runsRef = collection(db, 'ingestion_runs')
      const q = query(runsRef, orderBy('startedAt', 'desc'), limit(10))
      const snapshot = await getDocs(q)

      const runsList = snapshot.docs.map((doc) => ({
        id: doc.id,
        ...doc.data()
      })) as IngestionRun[]

      setRuns(runsList)
    } catch (err) {
      console.error('Error fetching ingestion runs:', err)
      setError(err instanceof Error ? err.message : 'Failed to load activity')
    } finally {
      setLoading(false)
    }
  }

  useEffect(() => {
    fetchRuns()
  }, [])

  // Auto-refresh when ingestion completes
  useEffect(() => {
    const handler = () => fetchRuns()
    window.addEventListener('ingestion-complete', handler)
    return () => window.removeEventListener('ingestion-complete', handler)
  }, [])

medium

The fetchRuns function is redefined on every render. This can lead to stale closures in the useEffect that listens for the ingestion-complete event and can cause unnecessary re-executions of effects. To ensure correctness and improve performance, you should wrap fetchRuns in useCallback and add it as a dependency to the useEffect hooks that use it.

  const fetchRuns = useCallback(async () => {
    try {
      const runsRef = collection(db, 'ingestion_runs')
      const q = query(runsRef, orderBy('startedAt', 'desc'), limit(10))
      const snapshot = await getDocs(q)

      const runsList = snapshot.docs.map((doc) => ({
        id: doc.id,
        ...doc.data()
      })) as IngestionRun[]

      setRuns(runsList)
    } catch (err) {
      console.error('Error fetching ingestion runs:', err)
      setError(err instanceof Error ? err.message : 'Failed to load activity')
    } finally {
      setLoading(false)
    }
  }, [])

  useEffect(() => {
    fetchRuns()
  }, [fetchRuns])

  // Auto-refresh when ingestion completes
  useEffect(() => {
    const handler = () => fetchRuns()
    window.addEventListener('ingestion-complete', handler)
    return () => window.removeEventListener('ingestion-complete', handler)
  }, [fetchRuns])

Comment on lines +225 to +245
class TestStorageIdempotency:
    """Tests for deterministic document IDs."""

    def test_same_url_same_hash(self):
        """Same URL should produce same hash for deduplication."""
        import hashlib

        url = "https://example.com/article/123"
        hash1 = hashlib.sha256(url.encode()).hexdigest()[:16]
        hash2 = hashlib.sha256(url.encode()).hexdigest()[:16]
        assert hash1 == hash2

    def test_different_urls_different_hashes(self):
        """Different URLs should produce different hashes."""
        import hashlib

        url1 = "https://example.com/article/123"
        url2 = "https://example.com/article/456"
        hash1 = hashlib.sha256(url1.encode()).hexdigest()[:16]
        hash2 = hashlib.sha256(url2.encode()).hexdigest()[:16]
        assert hash1 != hash2

medium

The TestStorageIdempotency class appears to be testing the behavior of Python's standard hashlib library, rather than any application code within the trigger router. Tests should focus on your application's logic. This test class doesn't add value and should be removed to reduce noise in the test suite.
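
If deterministic document IDs are still worth covering, the test has to go through the application's own helper rather than calling hashlib directly. A sketch under assumptions: `article_doc_id` is a hypothetical function standing in for whatever the storage code uses to derive IDs, the sha256-prefix scheme is copied from the test above, and the normalization rule is invented for illustration.

```python
import hashlib


def article_doc_id(url: str) -> str:
    """Hypothetical app helper: derive a stable Firestore document ID from a URL."""
    normalized = url.strip().rstrip("/")  # illustrative normalization, not from the PR
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class TestArticleDocId:
    """Exercises the helper, so a change to the ID scheme would fail here."""

    def test_same_article_same_id(self):
        # Trailing slash and whitespace should not create a second document.
        assert article_doc_id("https://example.com/a/1") == article_doc_id(
            "https://example.com/a/1/ "
        )

    def test_different_articles_different_ids(self):
        assert article_doc_id("https://example.com/a/1") != article_doc_id(
            "https://example.com/a/2"
        )

    def test_id_is_16_hex_chars(self):
        doc_id = article_doc_id("https://example.com/a/1")
        assert len(doc_id) == 16
        assert all(c in "0123456789abcdef" for c in doc_id)
```

In the real suite the helper would be imported from the application module instead of being defined next to the tests.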

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@000-docs/ingestion-run-button-debug.md`:
- Around line 5-20: The fenced code block in the markdown snippet (the
triple-backtick block showing the Ingestion flow) is missing a language tag
which triggers MD040; update the opening fence from ``` to ```text so the block
becomes a fenced "text" code block, ensuring tools/markdownlint no longer flag
it and preserving the exact content inside the block (refer to the fenced block
that contains "Dashboard (IngestionButton.tsx)" and the POST/Poll/On completion
lines).

In `@dashboard/src/components/IngestionButton.tsx`:
- Around line 1-41: The component IngestionButton currently sets up polling
using pollRef but never clears it on unmount; add a useEffect in the
IngestionButton component that returns a cleanup function which calls
stopPolling() (and optionally clears startTimeRef) to ensure clearInterval is
called and pollRef.current is nulled when the component unmounts, referencing
pollRef and the existing stopPolling callback so the interval cannot fire after
unmount.

In `@dashboard/src/components/SystemActivityCard.tsx`:
- Around line 60-77: The fetchRuns function doesn't clear the previous error
when a refresh succeeds, so after a successful fetch the component may still
show the error state; update fetchRuns to reset the error on success by calling
setError(null) (or empty string depending on your error state type) immediately
after setRuns (or before setLoading(false)), ensuring the state variable
referenced by error is cleared on successful fetch.

In `@tests/api/test_trigger_integration.py`:
- Around line 43-52: The fixture client currently yields TestClient(app)
directly which skips FastAPI lifespan/startup/shutdown events; update the pytest
fixture (the client fixture that calls _mock_firestore and patches
routers.trigger._get_db) to instantiate TestClient(app) inside a with context
(e.g., with TestClient(app) as client:) and yield that client so the context
manager runs lifespan cleanup; apply the same change to the other similar
fixture/blocks around lines 54-78 that create TestClient without a with
statement.
🧹 Nitpick comments (2)
dashboard/src/components/IngestionButton.tsx (1)

7-25: Align RunPhase with backend phases.

PHASE_LABELS includes accepted and initializing, and the backend can emit those values; the union should include them for type safety.

♻️ Suggested fix
 type RunPhase =
   | 'idle'
   | 'starting'
+  | 'accepted'
+  | 'initializing'
   | 'loading_sources'
   | 'fetching_feeds'
   | 'storing_articles'
   | 'upserting_authors'
   | 'done'
perception_app/mcp_service/routers/logging.py (1)

81-108: Use firestore.AsyncClient to avoid blocking the async event loop.

The handler is async (line 63) but uses the synchronous firestore.Client for I/O at line 107 (doc_ref.set()), which blocks the event loop under load. Replace with firestore.AsyncClient and await the Firestore operations, or use await run_in_threadpool(doc_ref.set, ...) if the client must remain synchronous.
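
The thread-offload option can be sketched with the standard library alone. `asyncio.to_thread` plays the same role here as `run_in_threadpool`; `blocking_set` is a hypothetical stand-in for a synchronous `doc_ref.set(...)` call, simulated with `time.sleep`, and the heartbeat task exists only to show that the event loop keeps running during the write.

```python
import asyncio
import time


def blocking_set(doc: dict, data: dict) -> None:
    """Stand-in for a synchronous Firestore write (assumed ~50 ms of blocking I/O)."""
    time.sleep(0.05)
    doc.update(data)


async def log_run(doc: dict, data: dict) -> None:
    # Run the blocking call in a worker thread so the event loop stays free.
    await asyncio.to_thread(blocking_set, doc, data)


async def demo() -> tuple[dict, int]:
    doc: dict = {}
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    beat = asyncio.create_task(heartbeat())
    await log_run(doc, {"status": "completed"})
    beat.cancel()
    return doc, ticks
```

If `blocking_set` were called directly inside `log_run`, the heartbeat would not advance until the write returned, which is the stall the comment warns about.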

jeremylongshore and others added 2 commits February 14, 2026 17:24
- Fix memory leak: add useEffect cleanup to stop polling on unmount
- Fix stale run deadlock: invalid startedAt now defaults to stale (601s)
  instead of age_seconds=0 which blocked all future runs
- Switch from setInterval to recursive setTimeout to prevent stacked polls
- Align RunPhase type with backend phases (add accepted, initializing)
- Wrap fetchRuns in useCallback, add fetchRuns to useEffect deps
- Clear error state on successful refresh in SystemActivityCard
- Use TestClient context manager for proper lifespan cleanup in tests
- Remove TestStorageIdempotency (tested stdlib hashlib, not app code)
- Add text language tag to fenced code block (MD040)
- Replace hardcoded service URL with <SERVICE_URL> placeholder in runbook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@perception_app/mcp_service/config/rss_sources.yaml`:
- Around line 148-152: Remove the duplicate RSS entry that points to
"https://blog.langchain.dev/rss/" (the older redirect) and keep the existing
"LangChain Blog" entry that uses "https://blog.langchain.com/rss/"; locate the
duplicate by matching the name or URL ("LangChain Blog" /
"https://blog.langchain.dev/rss/") in the rss_sources.yaml and delete that block
so only the canonical "https://blog.langchain.com/rss/" entry remains active.

Comment on lines +148 to +152
- name: "LangChain Blog"
  url: "https://blog.langchain.com/rss/"
  category: "ai"
  active: true

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if both LangChain RSS URLs are accessible and compare their content

echo "=== Checking existing LangChain feed (blog.langchain.dev) ==="
curl -sI "https://blog.langchain.dev/rss/" | head -10

echo ""
echo "=== Checking new LangChain Blog feed (blog.langchain.com) ==="
curl -sI "https://blog.langchain.com/rss/" | head -10

echo ""
echo "=== Comparing first few article titles from each feed ==="
echo "--- blog.langchain.dev ---"
curl -s "https://blog.langchain.dev/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5

echo ""
echo "--- blog.langchain.com ---"
curl -s "https://blog.langchain.com/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5

Repository: intent-solutions-io/perception-with-intent

Length of output: 1114


🏁 Script executed:

#!/bin/bash

# Check if the YAML file exists and verify lines 502-505
echo "=== Checking YAML file for lines 502-505 ==="
if [ -f "perception_app/mcp_service/config/rss_sources.yaml" ]; then
    sed -n '500,510p' perception_app/mcp_service/config/rss_sources.yaml
else
    echo "File not found at expected path"
fi

echo ""
echo "=== Following redirect from blog.langchain.dev/rss/ ==="
curl -sL -I "https://blog.langchain.dev/rss/" 2>/dev/null | head -15

echo ""
echo "=== Getting actual content from blog.langchain.dev following redirects ==="
curl -sL "https://blog.langchain.dev/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5

Repository: intent-solutions-io/perception-with-intent

Length of output: 1157


Remove the duplicate "LangChain" entry at lines 502-505.

The entry at lines 148-152 (https://blog.langchain.com/rss/) and the entry at lines 502-505 (https://blog.langchain.dev/rss/) serve identical RSS content. The blog.langchain.dev URL redirects to blog.langchain.com, making it an outdated reference to the same feed. Keep the new entry at lines 148-152 and remove the older one at lines 502-505 to prevent duplicate article fetches.

@jeremylongshore jeremylongshore merged commit 0bd7cb6 into main Feb 14, 2026
12 checks passed
@jeremylongshore jeremylongshore deleted the fix/ingestion-button-reliability branch February 14, 2026 23:31