feat(ingestion): async background processing with progressive UI#7
jeremylongshore merged 3 commits into main
Conversation
Replace inline HTTP-blocking ingestion with async background pipeline:
- POST /trigger/ingestion returns 202 Accepted in <1s with run_id
- GET /trigger/ingestion/{run_id} for polling run status from Firestore
- Background task updates Firestore doc through phases progressively
- Idempotency guard prevents concurrent runs (409), auto-cleans stale >10min
- Dashboard polls every 3s with phase labels, stuck detection, auto-refresh
- logging.py upgraded from Phase 4 stubs to real Firestore writes
- Cloud Run deploy gets --no-cpu-throttling and --cpu-boost flags
- Composite Firestore index for active run query
- 27 new tests (unit + integration), debug runbook in 000-docs/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
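The accept-then-poll lifecycle and idempotency guard described above can be sketched in miniature. This is an illustrative sketch, not the PR's code: the `RUNS` dict stands in for the Firestore `ingestion_runs` collection, and the function and field names are assumptions.

```python
import asyncio
import time
import uuid

# In-memory stand-in for the Firestore `ingestion_runs` collection.
RUNS: dict[str, dict] = {}
STALE_AFTER_SECONDS = 600  # runs stuck longer than 10 minutes are treated as stale

def trigger_ingestion() -> dict:
    """Return a 202-style payload, or a 409-style conflict while a run is active."""
    for run_id, run in RUNS.items():
        age = time.time() - run["started_at"]
        if run["status"] == "running" and age <= STALE_AFTER_SECONDS:
            return {"code": 409, "active_run_id": run_id}
    run_id = uuid.uuid4().hex
    RUNS[run_id] = {"status": "running", "phase": "accepted", "started_at": time.time()}
    return {"code": 202, "run_id": run_id, "poll_url": f"/trigger/ingestion/{run_id}"}

async def run_pipeline(run_id: str) -> None:
    """Advance the run through its phases, as the background task does in Firestore."""
    for phase in ("loading_sources", "fetching_feeds", "storing_articles", "done"):
        RUNS[run_id]["phase"] = phase
        await asyncio.sleep(0)  # yield control, as real network I/O would
    RUNS[run_id]["status"] = "succeeded"

async def demo() -> dict:
    first = trigger_ingestion()                       # 202 Accepted
    task = asyncio.create_task(run_pipeline(first["run_id"]))
    second = trigger_ingestion()                      # concurrent trigger
    assert second["code"] == 409
    await task
    return RUNS[first["run_id"]]
```

Running `demo()` shows both paths: the first trigger is accepted, a concurrent trigger conflicts with 409, and once the background task finishes a new trigger is accepted again.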
📝 Walkthrough

Adds an asynchronous ingestion trigger: POST returns 202 with a run_id and poll_url, a background multi‑phase ingestion pipeline updates run state in Firestore, GET /ingestion/{run_id} reports phase/stats/success, the frontend polls and reacts to completion; logging, indexes, tests, docs, and a new RSS feed were also added.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Frontend as IngestionButton
    participant Backend as MCP_Service
    participant Firestore as Firestore_DB
    participant Pipeline as Ingestion_Pipeline
    User->>Frontend: Click "Run Ingestion"
    Frontend->>Backend: POST /trigger/ingestion
    Backend->>Firestore: check_active_run()
    Firestore-->>Backend: active run? (no/yes)
    alt no active run
        Backend->>Firestore: create ingestion_runs/{run_id} (status=accepted)
        Backend->>Pipeline: start async run_ingestion_pipeline(run_id)
        Backend-->>Frontend: 202 Accepted + run_id + poll_url
        Frontend->>Frontend: start polling GET /ingestion/{run_id}
        Pipeline->>Firestore: update phases and stats (loading_sources → fetching_feeds → storing_articles → done)
        Frontend->>Backend: GET /ingestion/{run_id}
        Backend->>Firestore: read ingestion_runs/{run_id}
        Firestore-->>Backend: current run state
        Backend-->>Frontend: IngestionRunStatus (is_successful computed)
        Frontend->>Frontend: stop polling, handleComplete(), emit ingestion-complete
        Frontend->>Frontend: SystemActivityCard receives ingestion-complete -> refresh runs
    else active run
        Backend-->>Frontend: 409 Conflict (active_run_id)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
Summary of Changes

Hello @jeremylongshore, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the ingestion pipeline by introducing asynchronous background processing, providing a more responsive user experience. It integrates real-time status updates via Firestore, improves system reliability with idempotency checks and stale-run cleanup, and adds comprehensive documentation for debugging. These changes ensure that ingestion operations are non-blocking, transparent, and easier to monitor and troubleshoot.
Code Review
This pull request is a significant improvement, introducing asynchronous background processing for the ingestion pipeline. This greatly enhances user experience by providing immediate feedback and progressive updates. The backend changes, including the idempotency guard and stale run cleanup, add crucial reliability. My review highlights a few critical and high-severity issues: an architectural concern with duplicated database logic in the backend, a potential memory leak on the frontend, and a bug in the stale run cleanup logic. Addressing these will make the implementation more robust and maintainable.
```tsx
const pollStatus = useCallback(
  (runId: string) => {
    pollRef.current = setInterval(async () => {
```
The setInterval created for polling is not cleared if the component unmounts while a poll is in progress (e.g., if the user navigates away). This will cause a memory leak and may lead to React attempting to update state on an unmounted component. You should use a useEffect hook to call stopPolling during cleanup.
```tsx
useEffect(() => {
  return () => {
    stopPolling();
  };
}, [stopPolling]);
```

This should be added inside the IngestionButton component.
```python
async def run_ingestion_pipeline(
    run_id: str,
    request: TriggerIngestionRequest,
    db: firestore.Client,
) -> None:
    """Run the full ingestion pipeline as a background task.

    Updates Firestore doc progressively through phases:
    loading_sources → fetching_feeds → storing_articles → upserting_authors → done
    """
```
This run_ingestion_pipeline function implements its own logic for writing to Firestore, and the _get_db helper is duplicated from routers/logging.py. However, a dedicated /mcp/tools/log_ingestion_run tool exists for this purpose. To maintain a clear separation of concerns and adhere to the MCP architecture where orchestrators use tools, this trigger router should call the logging tool instead of directly accessing the database. This will centralize the database logic, improve modularity, and reduce code duplication.
```python
if hasattr(started_at, "timestamp"):
    age_seconds = time.time() - started_at.timestamp()
else:
    age_seconds = 0
```
The else: age_seconds = 0 branch is a potential bug. If started_at is not a valid datetime object for some reason (e.g., data corruption), this code will silently treat the run as brand new, preventing it from ever being cleaned up as stale. This could block all future ingestion runs. It would be more robust to log an error and treat the run as stale to avoid a deadlock.
```diff
 if hasattr(started_at, "timestamp"):
     age_seconds = time.time() - started_at.timestamp()
 else:
-    age_seconds = 0
+    logger.error(f"Invalid startedAt type for run {doc.id}: {type(started_at)}")
+    age_seconds = 601  # Mark as stale to be safe
```
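The fixed branch can be exercised in isolation. A minimal sketch, assuming a hypothetical `run_age_seconds` helper and the 10-minute threshold; real Firestore timestamps expose `.timestamp()` just as `datetime` does.

```python
import logging
import time
from datetime import datetime, timezone

logger = logging.getLogger("ingestion")
STALE_THRESHOLD_SECONDS = 600  # runs older than 10 minutes are stale

def run_age_seconds(started_at: object, run_id: str) -> float:
    """Compute a run's age, treating unreadable timestamps as already stale."""
    if hasattr(started_at, "timestamp"):
        return time.time() - started_at.timestamp()
    # A corrupt startedAt must not block cleanup forever: log it and mark stale.
    logger.error("Invalid startedAt type for run %s: %s", run_id, type(started_at))
    return STALE_THRESHOLD_SECONDS + 1
```

A fresh timestamp yields a near-zero age, while a corrupt value (e.g. `None`) now exceeds the stale threshold instead of silently resetting to zero.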
```text
- **Cause**: POST failed (network, CORS, Cloud Run cold start timeout)
- **Debug**:
  curl -X POST https://perception-mcp-w53xszfqnq-uc.a.run.app/trigger/ingestion \
```
The curl commands in this document use a hardcoded URL (https://perception-mcp-w53xszfqnq-uc.a.run.app). It's a better practice to use a placeholder like <SERVICE_URL> to make the documentation more generic and reusable across different environments (e.g., staging, local development).
```diff
-curl -X POST https://perception-mcp-w53xszfqnq-uc.a.run.app/trigger/ingestion \
+curl -X POST <SERVICE_URL>/trigger/ingestion \
```
```tsx
const pollStatus = useCallback(
  (runId: string) => {
    pollRef.current = setInterval(async () => {
```
Using setInterval for polling can be problematic if a network request takes longer than the interval, as new requests can stack up. A more robust pattern is to use a recursive setTimeout, which ensures that the next poll is only scheduled after the current one has completed or failed.
Example:

```tsx
const poll = async () => {
  // ... fetch logic ...
  // If not in a terminal state, schedule the next poll
  if (shouldContinuePolling) {
    setTimeout(poll, POLL_INTERVAL_MS);
  }
};
poll();
```

dashboard/src/components/SystemActivityCard.tsx

```tsx
const fetchRuns = async () => {
  try {
    const runsRef = collection(db, 'ingestion_runs')
    const q = query(runsRef, orderBy('startedAt', 'desc'), limit(10))
    const snapshot = await getDocs(q)

    const runsList = snapshot.docs.map((doc) => ({
      id: doc.id,
      ...doc.data()
    })) as IngestionRun[]

    setRuns(runsList)
  } catch (err) {
    console.error('Error fetching ingestion runs:', err)
    setError(err instanceof Error ? err.message : 'Failed to load activity')
  } finally {
    setLoading(false)
  }
}

useEffect(() => {
  fetchRuns()
}, [])

// Auto-refresh when ingestion completes
useEffect(() => {
  const handler = () => fetchRuns()
  window.addEventListener('ingestion-complete', handler)
  return () => window.removeEventListener('ingestion-complete', handler)
}, [])
```
The fetchRuns function is redefined on every render. This can lead to stale closures in the useEffect that listens for the ingestion-complete event and can cause unnecessary re-executions of effects. To ensure correctness and improve performance, you should wrap fetchRuns in useCallback and add it as a dependency to the useEffect hooks that use it.
```tsx
const fetchRuns = useCallback(async () => {
  try {
    const runsRef = collection(db, 'ingestion_runs')
    const q = query(runsRef, orderBy('startedAt', 'desc'), limit(10))
    const snapshot = await getDocs(q)
    const runsList = snapshot.docs.map((doc) => ({
      id: doc.id,
      ...doc.data()
    })) as IngestionRun[]
    setRuns(runsList)
  } catch (err) {
    console.error('Error fetching ingestion runs:', err)
    setError(err instanceof Error ? err.message : 'Failed to load activity')
  } finally {
    setLoading(false)
  }
}, [])

useEffect(() => {
  fetchRuns()
}, [fetchRuns])

// Auto-refresh when ingestion completes
useEffect(() => {
  const handler = () => fetchRuns()
  window.addEventListener('ingestion-complete', handler)
  return () => window.removeEventListener('ingestion-complete', handler)
}, [fetchRuns])
```
tests/mcp/test_trigger_router.py (Outdated)
```python
class TestStorageIdempotency:
    """Tests for deterministic document IDs."""

    def test_same_url_same_hash(self):
        """Same URL should produce same hash for deduplication."""
        import hashlib

        url = "https://example.com/article/123"
        hash1 = hashlib.sha256(url.encode()).hexdigest()[:16]
        hash2 = hashlib.sha256(url.encode()).hexdigest()[:16]
        assert hash1 == hash2

    def test_different_urls_different_hashes(self):
        """Different URLs should produce different hashes."""
        import hashlib

        url1 = "https://example.com/article/123"
        url2 = "https://example.com/article/456"
        hash1 = hashlib.sha256(url1.encode()).hexdigest()[:16]
        hash2 = hashlib.sha256(url2.encode()).hexdigest()[:16]
        assert hash1 != hash2
```
The TestStorageIdempotency class appears to be testing the behavior of Python's standard hashlib library, rather than any application code within the trigger router. Tests should focus on your application's logic. This test class doesn't add value and should be removed to reduce noise in the test suite.
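If the truncated-hash scheme lives in an application helper, the tests could target that helper instead. A sketch assuming a hypothetical `article_doc_id` function, not the PR's actual code:

```python
import hashlib

def article_doc_id(url: str) -> str:
    """Hypothetical app-level helper: deterministic 16-char doc ID for a URL."""
    return hashlib.sha256(url.encode()).hexdigest()[:16]

def test_doc_id_is_deterministic():
    # Same URL must always map to the same document ID for deduplication.
    assert article_doc_id("https://example.com/article/123") == article_doc_id("https://example.com/article/123")

def test_doc_id_distinguishes_urls():
    # Different URLs must map to different document IDs.
    assert article_doc_id("https://example.com/article/123") != article_doc_id("https://example.com/article/456")
```

This way the tests pin down the application's contract (ID length, determinism) rather than re-verifying `hashlib` itself.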
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@000-docs/ingestion-run-button-debug.md`:
- Around line 5-20: The fenced code block in the markdown snippet (the
triple-backtick block showing the Ingestion flow) is missing a language tag
which triggers MD040; update the opening fence from ``` to ```text so the block
becomes a fenced "text" code block, ensuring tools/markdownlint no longer flag
it and preserving the exact content inside the block (refer to the fenced block
that contains "Dashboard (IngestionButton.tsx)" and the POST/Poll/On completion
lines).
In `@dashboard/src/components/IngestionButton.tsx`:
- Around line 1-41: The component IngestionButton currently sets up polling
using pollRef but never clears it on unmount; add a useEffect in the
IngestionButton component that returns a cleanup function which calls
stopPolling() (and optionally clears startTimeRef) to ensure clearInterval is
called and pollRef.current is nulled when the component unmounts, referencing
pollRef and the existing stopPolling callback so the interval cannot fire after
unmount.
In `@dashboard/src/components/SystemActivityCard.tsx`:
- Around line 60-77: The fetchRuns function doesn't clear the previous error
when a refresh succeeds, so after a successful fetch the component may still
show the error state; update fetchRuns to reset the error on success by calling
setError(null) (or empty string depending on your error state type) immediately
after setRuns (or before setLoading(false)), ensuring the state variable
referenced by error is cleared on successful fetch.
In `@tests/api/test_trigger_integration.py`:
- Around line 43-52: The fixture client currently yields TestClient(app)
directly which skips FastAPI lifespan/startup/shutdown events; update the pytest
fixture (the client fixture that calls _mock_firestore and patches
routers.trigger._get_db) to instantiate TestClient(app) inside a with context
(e.g., with TestClient(app) as client:) and yield that client so the context
manager runs lifespan cleanup; apply the same change to the other similar
fixture/blocks around lines 54-78 that create TestClient without a with
statement.
🧹 Nitpick comments (2)
dashboard/src/components/IngestionButton.tsx (1)
7-25: Align `RunPhase` with backend phases.

`PHASE_LABELS` includes `accepted` and `initializing`, and the backend can emit those values; the union should include them for type safety.

♻️ Suggested fix

```diff
 type RunPhase =
   | 'idle'
   | 'starting'
+  | 'accepted'
+  | 'initializing'
   | 'loading_sources'
   | 'fetching_feeds'
   | 'storing_articles'
   | 'upserting_authors'
   | 'done'
```

perception_app/mcp_service/routers/logging.py (1)
81-108: Use `firestore.AsyncClient` to avoid blocking the async event loop.

The handler is async (line 63) but uses the synchronous `firestore.Client` for I/O at line 107 (`doc_ref.set()`), which blocks the event loop under load. Replace with `firestore.AsyncClient` and `await` the Firestore operations, or use `await run_in_threadpool(doc_ref.set, ...)` if the client must remain synchronous.
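If the synchronous client must remain, the threadpool offload can be sketched with stdlib `asyncio.to_thread` (same spirit as Starlette's `run_in_threadpool`); `blocking_set` below is a stand-in for the real `doc_ref.set()` call, and the names are illustrative.

```python
import asyncio
import time

def blocking_set(doc: dict, payload: dict) -> dict:
    """Stand-in for the synchronous Firestore doc_ref.set() call."""
    time.sleep(0.01)  # simulate network I/O that would otherwise block the loop
    doc.update(payload)
    return doc

async def log_ingestion_run(doc: dict, payload: dict) -> dict:
    # Offload the blocking write to a worker thread so the event loop stays free
    # to serve other requests while the write is in flight.
    return await asyncio.to_thread(blocking_set, doc, payload)
```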
- Fix memory leak: add useEffect cleanup to stop polling on unmount
- Fix stale run deadlock: invalid startedAt now defaults to stale (601s) instead of age_seconds=0, which blocked all future runs
- Switch from setInterval to recursive setTimeout to prevent stacked polls
- Align RunPhase type with backend phases (add accepted, initializing)
- Wrap fetchRuns in useCallback, add fetchRuns to useEffect deps
- Clear error state on successful refresh in SystemActivityCard
- Use TestClient context manager for proper lifespan cleanup in tests
- Remove TestStorageIdempotency (tested stdlib hashlib, not app code)
- Add text language tag to fenced code block (MD040)
- Replace hardcoded service URL with <SERVICE_URL> placeholder in runbook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@perception_app/mcp_service/config/rss_sources.yaml`:
- Around line 148-152: Remove the duplicate RSS entry that points to
"https://blog.langchain.dev/rss/" (the older redirect) and keep the existing
"LangChain Blog" entry that uses "https://blog.langchain.com/rss/"; locate the
duplicate by matching the name or URL ("LangChain Blog" /
"https://blog.langchain.dev/rss/") in the rss_sources.yaml and delete that block
so only the canonical "https://blog.langchain.com/rss/" entry remains active.
```yaml
- name: "LangChain Blog"
  url: "https://blog.langchain.com/rss/"
  category: "ai"
  active: true
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Description: Check if both LangChain RSS URLs are accessible and compare their content
echo "=== Checking existing LangChain feed (blog.langchain.dev) ==="
curl -sI "https://blog.langchain.dev/rss/" | head -10
echo ""
echo "=== Checking new LangChain Blog feed (blog.langchain.com) ==="
curl -sI "https://blog.langchain.com/rss/" | head -10
echo ""
echo "=== Comparing first few article titles from each feed ==="
echo "--- blog.langchain.dev ---"
curl -s "https://blog.langchain.dev/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5
echo ""
echo "--- blog.langchain.com ---"
curl -s "https://blog.langchain.com/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5
```

Repository: intent-solutions-io/perception-with-intent
Length of output: 1114
🏁 Script executed:
#!/bin/bash
# Check if the YAML file exists and verify lines 502-505
echo "=== Checking YAML file for lines 502-505 ==="
if [ -f "perception_app/mcp_service/config/rss_sources.yaml" ]; then
sed -n '500,510p' perception_app/mcp_service/config/rss_sources.yaml
else
echo "File not found at expected path"
fi
echo ""
echo "=== Following redirect from blog.langchain.dev/rss/ ==="
curl -sL -I "https://blog.langchain.dev/rss/" 2>/dev/null | head -15
echo ""
echo "=== Getting actual content from blog.langchain.dev following redirects ==="
curl -sL "https://blog.langchain.dev/rss/" 2>/dev/null | grep -oP '(?<=<title>).*?(?=</title>)' | head -5Repository: intent-solutions-io/perception-with-intent
Length of output: 1157
Remove the duplicate "LangChain" entry at lines 502-505.
The entry at lines 148-152 (https://blog.langchain.com/rss/) and the entry at lines 502-505 (https://blog.langchain.dev/rss/) serve identical RSS content. The blog.langchain.dev URL redirects to blog.langchain.com, making it an outdated reference to the same feed. Keep the new entry at lines 148-152 and remove the older one at lines 502-505 to prevent duplicate article fetches.
🤖 Prompt for AI Agents
In `@perception_app/mcp_service/config/rss_sources.yaml` around lines 148 - 152,
Remove the duplicate RSS entry that points to "https://blog.langchain.dev/rss/"
(the older redirect) and keep the existing "LangChain Blog" entry that uses
"https://blog.langchain.com/rss/"; locate the duplicate by matching the name or
URL ("LangChain Blog" / "https://blog.langchain.dev/rss/") in the
rss_sources.yaml and delete that block so only the canonical
"https://blog.langchain.com/rss/" entry remains active.
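Beyond removing this one entry, a dedup pass at load time could guard against redirect aliases in general. A sketch, not code from the PR; the alias table and helper names are assumptions (the dev→com redirect is the one verified above):

```python
from urllib.parse import urlparse

# Known host aliases; blog.langchain.dev redirects to blog.langchain.com.
HOST_ALIASES = {"blog.langchain.dev": "blog.langchain.com"}

def canonical_feed_url(url: str) -> str:
    """Normalize a feed URL so redirect aliases compare equal."""
    parts = urlparse(url)
    host = HOST_ALIASES.get(parts.netloc, parts.netloc)
    return f"{parts.scheme}://{host}{parts.path}"

def dedupe_feeds(feeds: list[dict]) -> list[dict]:
    """Keep the first feed per canonical URL, dropping redirect duplicates."""
    seen: set[str] = set()
    kept: list[dict] = []
    for feed in feeds:
        key = canonical_feed_url(feed["url"])
        if key not in seen:
            seen.add(key)
            kept.append(feed)
    return kept
```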
Summary
- `asyncio.create_task()` runs the full pipeline and updates Firestore progressively through phases (loading_sources → fetching_feeds → storing_articles → upserting_authors → done).
- `evaluate_run_success()` scores terminal runs.
- `CustomEvent("ingestion-complete")` drives SystemActivityCard auto-refresh.
- `ingestion_runs` writes use `set(merge=True)`.
- `--no-cpu-throttling` and `--cpu-boost` flags support background task execution.
- Composite `[status ASC, startedAt DESC]` index for the active run query.
- `000-docs/ingestion-run-button-debug.md` with failure taxonomy, diagnostic queries, and SLO targets.

Test plan
- `pytest tests/mcp/ tests/api/ -v --no-cov` — 78 tests pass (27 new + 51 existing)
- `black --check` on modified Python files — clean
- `tsc --noEmit` on dashboard — compiles cleanly
- Cloud Run deployed with `--no-cpu-throttling`
- `curl -X POST .../trigger/ingestion` returns 202 in <1s with run_id
- `curl .../trigger/ingestion/{run_id}` returns progressive status
- `ingestion_runs` collection has docs with status transitions

🤖 Generated with Claude Code