diff --git a/README.md b/README.md index 572819c..2bff551 100644 --- a/README.md +++ b/README.md @@ -5,46 +5,51 @@

Schrute

-Teach your AI a website once. After that, it replays the same backend requests directly — no browser needed. +Teach your AI a website once. After that, it can repeat the same job much faster, often without reopening the browser.

-Schrute watches real browser traffic, turns repeatable actions into MCP tools, and reuses browser auth when needed. No hand-written API integration, and often no API keys, because Schrute learns from the requests your browser already knows how to make. +Schrute is for repeated website tasks. -- Faster repeated tasks -- Less brittle than selector-only browser automation -- No hand-written API integration for every site +It watches what happens in a real browser, learns the underlying network requests, and turns them into reusable tools. That means the first run can happen in the browser, but later runs can often skip the UI and go straight to the site's backend. -Measured on repeated runs of tested workflows: +If you keep asking an AI to do the same website task over and over, Schrute is the layer that helps it stop starting from scratch every time. -``` -Execution 1: Browser-proxied fetch ........... 1,029ms -Execution 3: Browser-proxied (warm) ............ 777ms -Execution 5: Browser-proxied (optimized) ....... 273ms -Execution 20+: Direct HTTP (promoted) ........... ~5-50ms -``` +- Learn from a real browser session +- Reuse your logged-in state +- Replay repeatable tasks without brittle click scripts +- Fall back to the browser when direct replay is not possible +- Use it from MCP, CLI, REST, Python, or TypeScript + +## Why People Use It -See [benchmarks](#benchmarks) for methodology. +Without Schrute: -## Why this exists +- An agent opens the site again +- Clicks through the UI again +- Waits for the page again +- Pays the same latency again -Browser agents are great at discovering how to use a site, but terrible at repeating the same task quickly. Every repeat run pays the DOM tax again: page loads, clicks, waits, selectors, retries. +With Schrute: -Schrute keeps the discovery power of browser automation for the first run, then shifts repeatable actions to direct HTTP replay whenever it is safe and reliable to do so. 
+- You teach it the task once +- Schrute learns the request pattern behind the page +- The next run can often call the learned action directly -That means: -- less latency on repeated calls -- fewer brittle UI steps -- lower runtime cost -- reusable tools instead of one-off automations +That is especially useful for things like: -## Quick start +- pulling the same dashboard data every day +- checking prices or market pages repeatedly +- searching a site with the same flow many times +- reusing internal tools that only work when you are already logged in + +## Quick Start ```bash npm install -g schrute schrute setup ``` -Add to your MCP client config (Claude Code, Cursor, Windsurf, Cline, or any MCP client): +If you want to use Schrute from an AI client over MCP: ```json { @@ -57,333 +62,273 @@ Add to your MCP client config (Claude Code, Cursor, Windsurf, Cline, or any MCP } ``` -Your AI agent now has `schrute_explore`, `schrute_record`, and 40+ other tools. - -## See it work in 60 seconds - -```bash -# Start Schrute -schrute serve - -# Open a browser and record an API interaction -schrute explore https://httpbin.org -schrute record --name get_ip -# navigate to httpbin.org/ip in the browser -schrute stop +## Ways To Use Schrute -# Schrute generates: httpbin_org.get_ip.v1 -# 4 requests captured, 4 signal, 0 noise +You can use the same learned skills in different ways depending on your workflow: -# Execute the learned skill -schrute execute httpbin_org.get_ip.v1 --yes -``` +- **MCP** + Best when you want Claude Code, Cursor, Cline, Windsurf, or another MCP client to call learned website actions as tools. -Result: -- First run: **1,029ms** (browser-proxied fetch) -- Fifth run: **273ms** (browser-proxied, optimized) -- Learned skill: `httpbin_org.get_ip.v1` -- What changed: Schrute learned the exact `GET /ip` endpoint and replays it without DOM interaction +- **CLI** + Best when you want to explore, record, inspect, and run skills manually from the terminal. 
-## Tested workflows +- **REST API** + Best when you want another app, script, or backend service to call Schrute over HTTP. -Every example below was recorded on 2026-03-17, on macOS (Apple Silicon), over WiFi, using Playwright Chromium. +- **Python and TypeScript clients** + Best when you want a lightweight client package instead of calling raw HTTP endpoints yourself. -### 1. Public API learning — httpbin.org +So Schrute is not tied to one interface. You can teach it a task once, then reuse that same learned task from the interface that fits your workflow. -**User task:** "Get my public IP address" +## First Run In 2 Minutes -**Site:** httpbin.org (developer tools) +```bash +# 1. Start Schrute +schrute serve -**Why this workflow matters:** Shows Schrute learning clean REST endpoints with zero noise and no auth requirements. +# 2. Open a site in a browser session +schrute explore https://httpbin.org -**First run:** -- Path: `schrute explore` → navigate to `/get`, `/ip`, `/user-agent`, `/headers` → `schrute stop` -- Execution mode: browser automation -- Pipeline result: 4 requests captured, 4 signal, 0 noise, **4 skills generated** +# 3. Start recording a task +schrute record --name get_ip -**Learned skills:** -- `httpbin_org.get_ip.v1` — `GET /ip` -- `httpbin_org.get_get.v1` — `GET /get` -- `httpbin_org.get_headers.v1` — `GET /headers` -- `httpbin_org.get_user_agent.v1` — `GET /user-agent` -- Auth used: none (public) -- Safety class: read-only +# 4. In the opened browser, go to: +# https://httpbin.org/ip -**Repeated runs:** +# 5. Stop recording +schrute stop -| Run | Latency | Method | -|-----|--------:|--------| -| 1 | 1,029ms | Browser-proxied (Tier 3) | -| 2 | 1,186ms | Browser-proxied (Tier 3) | -| 3 | 777ms | Browser-proxied (Tier 3) | -| 4 | 1,033ms | Browser-proxied (Tier 3) | -| 5 | 273ms | Browser-proxied (Tier 3) | +# 6. Poll the background pipeline job until skill generation completes +schrute pipeline -**Returned:** `{"origin": "49.43.xxx.x"}` +# 7. 
Run the learned skill +schrute execute httpbin_org.get_ip.v1 --yes +``` -**What changed after learning:** Four browser navigations became four replayable MCP tools. Each call returns JSON directly — no page load, no DOM parsing, no selectors. +What just happened: ---- +1. Schrute watched the browser traffic for that action. +2. It found the real request behind the page. +3. It saved that request as a reusable skill. +4. You can now run that learned action again without manually driving the page. -### 2. Parameterized API discovery — Wikipedia +## Commands Most People Will Use -**User task:** "Search Wikipedia for articles about artificial intelligence" +```bash +schrute explore https://example.com +schrute record --name my_action +schrute stop +schrute pipeline +schrute execute my_skill.v1 -**Site:** en.wikipedia.org (knowledge/reference) +schrute skills list --status active +schrute skills search "bitcoin price" +schrute skills show -**Why this workflow matters:** Shows Schrute automatically discovering which query parameters vary (the search term) and which are constants (action, format, list type) — without being told. 
+schrute workflow create --site example.com --name summary --spec '{"steps":[...]}' +schrute workflow run example_com.summary.v1 -**First run:** -- Path: `schrute explore` → navigate to Wikipedia API with `srsearch=machine+learning`, then `srsearch=artificial+intelligence`, then with `titles=Machine_learning` → `schrute stop` -- Pipeline result: 4 requests captured, 4 signal, 0 noise, **2 skills generated** +schrute discover https://api.example.com +schrute doctor +schrute trust +``` -**Learned skills:** -- `en_wikipedia_org.get_api_php.v1` — `GET /w/api.php` - - Discovered parameter: `query.srsearch` (varies between requests — classified as input) - - Baked-in constants: `action=query`, `list=search`, `format=json`, `origin=*` (same across all requests — classified as constants) -- `en_wikipedia_org.create_events.v1` — analytics endpoint, correctly classified as **draft** (not activated) -- Auth used: none (public) -- Safety class: read-only +## What Schrute Can Do Today -**Repeated run:** -- Execution mode: Browser-proxied (Tier 3) -- Time: **1,033ms** -- Returned: Full Wikipedia search results (10 articles with titles, snippets, page IDs) +Schrute is no longer just "record and replay." Here is what the current product does, in practical terms: -**What changed after learning:** A single MCP tool that takes a search query and returns structured Wikipedia results. The agent calls `schrute_execute({ skillId: "en_wikipedia_org.get_api_php.v1", params: { "query.srsearch": "quantum computing" } })` instead of navigating Wikipedia's UI. +- **Learns reusable skills from real browsing** + You do the task once in a browser. Schrute turns what it learned into named actions you can run again later. ---- +- **Generates skills in the background** + When you run `schrute stop`, Schrute does not make you wait for all processing to finish in the foreground. It gives you a pipeline job and keeps building the skills in the background. 
You can check progress with `schrute pipeline `. -### 3. Noise filtering — dog.ceo +- **Searches and explains what it has already learned** + Once you have multiple skills, Schrute helps you find the right one with `skills search`, inspect it with `skills show`, validate it, export it, and manage it without digging through raw data. -**User task:** "List all dog breeds and get a random dog image" +- **Builds workflows from multiple skills** + If one reusable action is not enough, Schrute can chain several read-only skills together into a larger workflow. That is useful for multi-step tasks like "get account info, then fetch usage, then return a summary." -**Site:** dog.ceo (entertainment/fun API) +- **Discovers APIs even before you record** + Schrute can scan a site for useful backend clues such as OpenAPI specs, GraphQL endpoints, sitemaps, platform fingerprints, and WebMCP tools. That helps you start faster on sites that already expose a structured backend. -**Why this workflow matters:** Shows Schrute separating real API calls from page chrome (CSS, images, scripts) on a site that mixes both. +- **Reuses the browser session you already trust** + If you are already logged into Chrome or an Electron app, Schrute can attach to that session instead of forcing you through login again. This is especially useful for internal tools and dashboards. -**First run:** -- Path: `schrute explore` → navigate to `/api/breeds/image/random`, `/api/breeds/list/all`, `/api/breed/labrador/images/random` → `schrute stop` -- Pipeline result: 6 requests captured, **3 signal, 3 noise**, **2 skills generated** +- **Supports more than one browser session** + You are not limited to one browser context. Schrute can manage multiple named sessions so different sites, accounts, or attached browsers do not all get mixed together. 
-**Learned skills:** -- `dog_ceo.get_all.v1` — `GET /api/breeds/list/all` -- `dog_ceo.get_random.v1` — `GET /api/breed/labrador/images/random` -- Auth used: none (public) -- Safety class: read-only +- **Handles sites that still need a live browser** + Some sites cannot be cleanly replayed as direct HTTP calls because of Cloudflare, anti-bot checks, or other browser-only behavior. Schrute does not pretend otherwise. It keeps those tasks on a browser-backed path so they still work. -**Repeated runs:** +- **Lets you call the same learned skills from different places** + The same learned actions can be used from MCP, the CLI, REST, and the Python or TypeScript client packages. That means you do not have to re-teach the task separately for each integration. -| Run | Skill | Latency | Returned | -|-----|-------|--------:|----------| -| 1 | get_all | 551ms | Full breed list (98 breeds with sub-breeds) | -| 2 | get_random | 558ms | `https://images.dog.ceo/breeds/labrador/Fury_02.jpg` | -| 3 | get_random | 472ms | `https://images.dog.ceo/breeds/labrador/n02099712_1414.jpg` | +- **Lets you move and maintain what you learned** + Schrute can export and import learned site bundles, run health checks with `doctor`, show a trust posture report with `trust`, and keep an audit trail of executions. -**What changed after learning:** The 3 noise requests (page CSS, favicon, scripts) were discarded. Only the 3 JSON API calls became skills. Each execution returns structured JSON in ~500ms instead of loading the full dog.ceo web page. +- **Can improve and maintain learned actions over time** + Schrute can validate skills, track amendments, run optimization on degraded skills, and keep using safer fallback paths when a direct path stops being reliable. 
---- +- **Can work with site-declared tools as well as learned traffic** + On some sites, Schrute can discover useful backend structure such as WebMCP tools, OpenAPI specs, or GraphQL endpoints in addition to what it learns from browser traffic. -### 4. Cloudflare-protected site — CoinGecko +## Feature Overview -**User task:** "Get Bitcoin 24-hour price data" +If you are trying to understand "what is actually included here?", this is the practical feature map: -**Site:** www.coingecko.com (finance/crypto) +- **Explore and record** + Open a site, perform an action, and let Schrute watch the traffic behind it. -**Why this workflow matters:** Shows how Schrute handles sites behind Cloudflare. Direct HTTP fails — the skill stays at Tier 3 (browser-proxied) and uses the browser's Cloudflare clearance cookies. +- **Background processing** + Generate skills after recording without blocking the terminal. -**First run:** -- Path: `schrute explore` → agent navigates CoinGecko, clicks on Bitcoin, views price charts → `schrute stop` -- Pipeline result: **5 skills generated** from the captured API calls +- **Skill catalog** + List, search, inspect, validate, export, revoke, delete, and manage learned skills. -**Learned skills:** -- `www_coingecko_com.get_24_hours_json.v1` — `GET /price_charts/bitcoin/usd/24_hours.json` -- `www_coingecko_com.get_max_longer_cache_json.v1` — `GET /price_charts/bitcoin/usd/max_longer_cache.json` -- `www_coingecko_com.get_insight_annotations.v1` — `GET /price_charts/bitcoin/insight_annotations` -- Plus 2 more (user info, OTP center) -- Auth used: Cloudflare cookies (browser session) -- Safety class: read-only +- **Execution** + Run learned skills directly from CLI, MCP, REST, or client SDKs. -**Direct HTTP attempt:** Failed after 9,129ms — Cloudflare returns a challenge page, not JSON. +- **Workflow building** + Combine multiple read-only skills into one higher-level reusable flow. 
-**Why it still works:** At Tier 3, Schrute executes the `fetch()` inside the browser context, which already has Cloudflare clearance cookies. The request succeeds where direct HTTP cannot. This skill will not promote to Tier 1 because the endpoint requires Cloudflare cookies — Schrute detects this and keeps it at the browser-proxied tier. +- **Discovery** + Scan a site for OpenAPI, GraphQL, sitemaps, platform patterns, WebMCP tools, and other useful backend signals. -**What this shows:** Not every skill promotes to direct HTTP. Schrute adapts to the site's security model instead of breaking against it. +- **Browser session reuse** + Attach to a browser you already have open and logged into. ---- +- **Multi-session support** + Keep separate browser sessions for different sites, accounts, or experiments. -### 5. Server-rendered site — Hacker News (no skills generated) +- **Fallback execution** + Keep browser-backed execution for sites that cannot safely or reliably use direct replay. -**User task:** "Get the front page of Hacker News" +- **Import and export** + Move learned site bundles between environments without shipping credentials. -**Site:** news.ycombinator.com (news/tech) +- **Operational tools** + Use doctor, trust reporting, audit logs, and pipeline status to understand what Schrute is doing. -**Why this matters:** Shows what happens when Schrute encounters a site that does not use JSON APIs behind its UI. +- **Client access** + Use the same learned actions through MCP, CLI, REST, Python, and TypeScript. -**Result:** -- 12 requests captured -- 0 signal, 10 noise (CSS, images, static assets), 2 document navigations -- **0 skills generated** +## How Schrute Runs A Task -**Why:** Hacker News is fully server-rendered HTML. There are no `fetch()` calls to JSON APIs — every page is a full HTML document. Schrute correctly identifies there is nothing to learn and does not generate broken skills. +Schrute tries to use the simplest reliable path: -## Benchmarks +1. 
**Browser first** while the task is still being learned +2. **Direct replay later** when the request is stable and safe to reuse +3. **Browser fallback** when the site truly requires a live browser -| Site | Skill | Run 1 | Run 3 | Run 5 | Auth | Noise filtered | -|------|-------|------:|------:|------:|------|---------------:| -| httpbin.org | `get_ip` | 1,029ms | 777ms | 273ms | None | 0/4 | -| dog.ceo | `get_all` | 551ms | — | — | None | 3/6 | -| dog.ceo | `get_random` | 558ms | 472ms | — | None | 3/6 | -| en.wikipedia.org | `get_api_php` | 1,033ms | — | — | None | 0/4 | -| www.coingecko.com | `get_24_hours_json` | 9,129ms (fail) | — | — | Cloudflare cookies | — | +So the goal is not "force everything into direct HTTP." The goal is "use the fastest safe execution mode that actually works." -All runs at Tier 3 (browser-proxied). Skills promote to Tier 1 (direct HTTP, ~5-50ms) after 5+ consecutive successful validations. Cloudflare-protected skills remain at Tier 3. +That is why sites behind Cloudflare or other anti-bot systems can still be useful in Schrute. If direct replay is blocked, Schrute keeps them on a browser-backed path instead of pretending they should work the same way as a public API. 
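The execution-path choice above can be sketched in a few lines of TypeScript. This is an illustrative model only, not Schrute's actual code; it assumes one plausible promotion rule (direct HTTP after five consecutive successful validations, as the benchmark notes elsewhere in this diff describe), and every name in it is hypothetical:

```typescript
// Illustrative sketch only, not Schrute's real implementation.
// Assumption: a skill promotes to direct HTTP replay after five
// consecutive successful validations, and stays browser-proxied when
// the site needs browser-only state (e.g. Cloudflare clearance cookies).
type ExecutionMode = 'direct-http' | 'browser-proxied';

interface SkillState {
  consecutiveValidations: number;
  requiresBrowserState: boolean; // anti-bot cookies, challenge pages, etc.
}

function chooseMode(skill: SkillState): ExecutionMode {
  // Sites that need live browser state never promote.
  if (skill.requiresBrowserState) return 'browser-proxied';
  // Otherwise promote only once the request has proven stable.
  return skill.consecutiveValidations >= 5 ? 'direct-http' : 'browser-proxied';
}
```

Under a rule like this, a stable public endpoint eventually promotes to `direct-http`, while a Cloudflare-protected skill stays `browser-proxied` no matter how many times it succeeds.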
-**Methodology:** -- Machine: MacBook (Apple Silicon) -- Network: WiFi, India -- Browser engine: Playwright Chromium -- Cache state: warm (browser session open) -- Timing: `latencyMs` field from Schrute execution result -- Date tested: 2026-03-17 +## Where It Fits Best -## Where Schrute works best +Schrute is a strong fit when: -Schrute works best when: -- the site uses JSON, GraphQL, or predictable HTTP requests behind the UI -- the task is repeated often enough to justify learning -- the browser session already has the right auth state -- the workflow can be represented as a stable request or request chain +- the site makes predictable HTTP or JSON requests behind the UI +- the task is repeated often +- you already have the right browser auth state +- you want reusable tools instead of one-off browser scripts -Schrute is a worse fit when: -- the site is fully server-rendered HTML with no JSON API calls (like Hacker News) -- the workflow depends heavily on WebSockets, canvas state, or anti-bot challenges on every request -- the task is a one-off and not worth recording -- the action is too risky to replay automatically (destructive mutations without confirmation) +Schrute is a weaker fit when: -## How Schrute differs from other approaches +- the site is mostly server-rendered HTML with no meaningful backend calls to learn +- the workflow depends heavily on canvas, WebSockets, or visual-only interactions +- the task is truly one-time and not worth teaching -### Browser-only agents (Playwright, Puppeteer, Selenium) -Great for first-time discovery, slower and more brittle for repeated workflows. Every repeat pays the full DOM tax. +## Common Use Cases -### Hand-written API integrations -Fast and reliable once built, but require documentation, API keys, auth handling, and custom code per site. +Schrute is especially useful for: -### Schrute -Uses the browser to discover the workflow once, then promotes repeatable actions toward direct HTTP replay. No per-site code. 
Auth comes from the browser session. Skills self-heal when APIs change. +- **Repeated internal dashboard checks** + Example: pull the same account, usage, or reporting view every day without re-clicking the whole UI. -## Trust and safety +- **Logged-in business tools** + Example: use your existing browser session to access an internal admin panel, support tool, CMS, or analytics product. -Schrute does not blindly replay arbitrary browser traffic. +- **Price, market, and listing lookups** + Example: repeatedly fetch the same market page or structured data endpoint after teaching the browser path once. -Before a learned skill executes, Schrute enforces: -- **Domain allowlists** — only approved domains, SSRF prevention -- **Method restrictions** — GET/HEAD by default, mutations require approval -- **One-time confirmation** — every new skill needs user approval before first execution -- **Path risk heuristics** — destructive-looking paths (`/delete`, `/logout`) are blocked -- **Rate limiting** — 10 QPS default per site -- **Redirect validation** — every redirect hop must pass policy -- **Audit logging** — every execution recorded to SQLite +- **Search and lookup workflows** + Example: teach a site search flow once, then reuse it with different inputs. -Dangerous browser tools (`browser_evaluate`, `browser_run_code`) are blocked entirely. +- **Agent tool creation** + Example: turn a repeated browser task into a reusable MCP tool for an AI coding or operations workflow. -For the full 9-gate security model, see [SECURITY.md](SECURITY.md). +- **Multi-step read-only automations** + Example: fetch one piece of data, use it in a second call, and return a final combined answer through a workflow skill. -## Auth, cookies, and storage +- **Sites with a mix of easy and hard paths** + Example: let Schrute use direct replay where it works, but keep a live-browser fallback for the parts that truly need it. -Schrute reuses the auth state already present in your browser session. 
+## Reusing Your Logged-In Browser -- Cookies are stored in the **OS keychain** (macOS Keychain, Linux Secret Service) — not in plaintext files -- Exported skill bundles never include credentials -- Audit logs are stored locally in SQLite -- Auth detection is automatic: Bearer tokens, API keys, OAuth2, session cookies -- JWT TTLs are extracted and used for proactive refresh +If you already have a browser session with the right login state, Schrute can attach to it instead of making you sign in again. -Connect to an existing Chrome session to reuse your logged-in state: +Typical pattern: ```bash -# Launch Chrome with debugging chrome --remote-debugging-port=9222 - -# Connect Schrute -schrute_connect_cdp({ port: 9222, name: "my-chrome" }) ``` -## Self-healing - -APIs change. Auth tokens expire. Schrute handles this automatically. +After that, Schrute can connect to the running browser through CDP using its MCP or REST surfaces. -A background validation loop runs every 10 minutes: -- Tests skills with last-known-good parameters -- Schema drift detected → re-infer the response schema -- Auth expired → refresh via browser re-login -- Missing parameter → add it from recent traffic -- Still failing → escalate to browser tier (fall back to safety) -- Permanently broken → mark as stale, stop executing +This is especially useful for: -Every amendment is applied on cooldown, evaluated over a test window, and rolled back if it makes things worse. +- internal dashboards +- admin tools +- sites with multi-step login flows +- flows where the browser already has the right cookies and session state -## Cold-start discovery +## REST API And SDKs -Schrute can discover APIs without recording anything: - -```bash -schrute discover https://api.example.com -``` - -It probes for OpenAPI specs, GraphQL introspection, platform signatures (Shopify, Stripe, WordPress, Firebase, Supabase, Next.js), sitemaps, and WebMCP tools. Discovered endpoints become draft skills ranked by trust level. 
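For scripts that do not want an SDK dependency, the REST surface can also be called with plain `fetch`. The sketch below assumes the `/api/v1/execute` route and bearer-token auth that this README's curl example uses; it is an illustration, not an official client, and the helper name is made up:

```typescript
// Builds (but does not send) a request for Schrute's REST execute
// endpoint. Assumes the /api/v1/execute route and Bearer auth shown
// in this README's curl example; the helper name is illustrative.
interface ExecuteRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildExecuteRequest(
  baseUrl: string,
  token: string,
  skillId: string,
  params: Record<string, unknown> = {},
): ExecuteRequest {
  return {
    url: `${baseUrl}/api/v1/execute`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ skillId, params }),
    },
  };
}

// Usage (against a locally running `schrute serve --http --port 3000`):
// const { url, init } = buildExecuteRequest('http://127.0.0.1:3000', 'my-secret', 'httpbin_org.get_ip.v1');
// const result = await fetch(url, init).then(r => r.json());
```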
- -## REST API and client SDKs - -Start Schrute with HTTP transport for programmatic access from any language: +If you want to call Schrute from scripts or apps: ```bash schrute config set server.authToken my-secret schrute serve --http --port 3000 ``` -```bash -# Execute a learned skill -curl -X POST http://127.0.0.1:3000/api/sites/httpbin.org/skills/get_ip \ - -H "Authorization: Bearer my-secret" \ - -H "Content-Type: application/json" \ - -d '{"params": {}}' +Then call it over HTTP: -# Search for skills -curl -X POST http://127.0.0.1:3000/api/v1/skills/search \ +```bash +curl -X POST http://127.0.0.1:3000/api/v1/execute \ -H "Authorization: Bearer my-secret" \ -H "Content-Type: application/json" \ - -d '{"query": "dog breeds", "limit": 5}' + -d '{"skillId":"httpbin_org.get_ip.v1","params":{}}' ``` -Client SDKs available for **Python** (zero-dependency, `pip install schrute-client`) and **TypeScript** (`npm install @schrute/client`). +Client packages: -Full REST API reference: [docs/rest-api.md](docs/rest-api.md) +- TypeScript: `npm install @schrute/client` +- Python: `pip install schrute-client` -## Docs +MCP HTTP is also available at: -For detailed reference documentation: +```text +http://127.0.0.1:3001/mcp +``` + +## Safety And Storage -- [MCP tools reference](docs/tools.md) — All 40+ MCP tools with parameters -- [CLI reference](docs/cli.md) — Every CLI command and flag -- [REST API](docs/rest-api.md) — 19 HTTP endpoints with examples -- [Security model](SECURITY.md) — Full 9-gate policy engine -- [Architecture](docs/architecture.md) — System design and internals -- [Configuration](docs/configuration.md) — Environment variables, config file, precedence -- [Development](docs/development.md) — Building from source, testing, project structure -- [Client SDKs](docs/sdks.md) — Python and TypeScript usage +Schrute does not blindly replay everything it sees. -
-Multi-client setup +Before a learned skill runs, Schrute applies safeguards such as: -Works with Claude Code (`.mcp.json`), Claude Desktop, Cursor (`.cursor/mcp.json`), Windsurf (`.codeium/windsurf/mcp_config.json`), Cline, Continue, or any MCP client via stdio: `npx -y schrute serve` -
+- domain allowlists +- redirect validation +- method and path checks +- approval for first execution when needed +- audit logging +- rate limiting -
-Claude Code plugin +Credentials are not exported with skill bundles, and dangerous raw browser execution tools are blocked. -When installed as a plugin: `/schrute:explore`, `/schrute:record`, `/schrute:skills`, `/schrute:doctor`, `/schrute:status`. Includes specialized agents for skill validation, exploration guidance, and debugging. -
+For the full security model, see [SECURITY.md](SECURITY.md). ## Development @@ -391,15 +336,24 @@ When installed as a plugin: `/schrute:explore`, `/schrute:record`, `/schrute:ski ```bash git clone https://github.com/sheeki03/schrute.git -cd schrute && npm install && npx playwright install chromium && npm run build +cd schrute +npm install +npm run build ``` +Useful commands: + ```bash -npm run build # Compile TypeScript -npx vitest run # Run tests -npm run dev # Watch mode +npm run build +npm test +npm run dev ``` +## More + +- [SECURITY.md](SECURITY.md) +- [CONTRIBUTING.md](CONTRIBUTING.md) + ## License [Apache-2.0](LICENSE) diff --git a/bin/schrute.cjs b/bin/schrute.cjs new file mode 100755 index 0000000..6cb9002 --- /dev/null +++ b/bin/schrute.cjs @@ -0,0 +1,8 @@ +#!/usr/bin/env node +'use strict'; +const major = parseInt(process.version.slice(1), 10); +if (major < 22) { + console.error(`Error: Node >= 22 required (found ${process.version}). Run: nvm use 22`); + process.exit(1); +} +import('../dist/index.js'); diff --git a/package.json b/package.json index a707dfd..64b8c8e 100644 --- a/package.json +++ b/package.json @@ -15,10 +15,10 @@ } }, "bin": { - "schrute": "dist/index.js" + "schrute": "bin/schrute.cjs" }, "scripts": { - "prebuild": "node scripts/sync-version.js", + "prebuild": "node -e \"if(parseInt(process.version.slice(1))<22){console.error('Node>=22 required');process.exit(1)}\" && node scripts/sync-version.js", "build": "tsc -p tsconfig.json", "prepublishOnly": "node scripts/sync-version.js && npm run build", "build:native": "cd native && cargo build --release && cp target/release/libschrute_native.dylib index.node 2>/dev/null; cp target/release/schrute_native.dll index.node 2>/dev/null; cp target/release/libschrute_native.so index.node 2>/dev/null; echo 'Native build complete'", @@ -27,10 +27,10 @@ "test:watch": "vitest", "test:coverage": "vitest run --coverage", "lint": "tsc --noEmit", - "start": "node dist/index.js", - "serve": "node dist/index.js 
serve",
-    "setup": "node dist/index.js setup",
-    "doctor": "node dist/index.js doctor",
+    "start": "node bin/schrute.cjs",
+    "serve": "node bin/schrute.cjs serve",
+    "setup": "node bin/schrute.cjs setup",
+    "doctor": "node bin/schrute.cjs doctor",
     "rebuild:native": "bash scripts/rebuild-native.sh",
     "build:binary": "npm run build && pkg dist/index.js --config pkg.config.json",
     "build:binary:macos": "npm run build:binary -- --target node22-macos-arm64",
@@ -39,11 +39,13 @@
   },
   "dependencies": {
     "@modelcontextprotocol/sdk": "^1.12.1",
-    "better-sqlite3": "^11.8.1",
+    "better-sqlite3": "^12.8.0",
+    "cheerio": "^1.2.0",
     "commander": "^13.1.0",
     "fastify": "^5.2.1",
     "ipaddr.js": "^2.2.0",
     "js-yaml": "^4.1.1",
+    "jsonpath-plus": "^10.4.0",
     "keytar": "^7.9.0",
     "patchright": "^1.57.0",
     "pdf-parse": "^2.4.5",
@@ -91,6 +93,7 @@
     "schrute"
   ],
   "files": [
+    "bin/",
     "dist/",
     ".claude-plugin/",
     "commands/",
diff --git a/src/app/import-service.ts b/src/app/import-service.ts
new file mode 100644
index 0000000..f7a6182
--- /dev/null
+++ b/src/app/import-service.ts
@@ -0,0 +1,261 @@
+import * as fs from 'node:fs';
+import * as readline from 'node:readline';
+import type { SkillSpec, SiteManifest, SitePolicy } from '../skill/types.js';
+import {
+  validateImportableSkill,
+  validateImportableSite,
+  validateAndNormalizeImportablePolicy,
+} from '../storage/import-validator.js';
+import { getSitePolicy, setSitePolicy } from '../core/policy.js';
+import type { SkillRepository } from '../storage/skill-repository.js';
+import type { SiteRepository } from '../storage/site-repository.js';
+import type { AgentDatabase } from '../storage/database.js';
+import type { SchruteConfig } from '../core/config.js';
+
+export interface ImportDeps {
+  db: AgentDatabase;
+  skillRepo: SkillRepository;
+  siteRepo: SiteRepository;
+  config: SchruteConfig;
+}
+
+export interface ImportOptions {
+  yes?: boolean;
+}
+
+export interface ImportResult {
+  created: number;
+  updated: number;
+  skipped: number;
+  siteAction?: 'created' | 'updated';
+  hasAuthSkills: boolean;
+  policyWarnings: string[];
+  cancelled?: boolean;
+}
+
+export async function performImport(
+  file: string,
+  deps: ImportDeps,
+  options: ImportOptions = {},
+): Promise<ImportResult> {
+  if (!fs.existsSync(file)) {
+    throw new Error(`File '${file}' not found.`);
+  }
+
+  let bundle: {
+    version: string;
+    site: SiteManifest;
+    skills: SkillSpec[];
+    policy?: SitePolicy;
+  };
+
+  try {
+    const raw = fs.readFileSync(file, 'utf-8');
+    bundle = JSON.parse(raw);
+  } catch (err) {
+    throw new Error(`Failed to parse bundle: ${err instanceof Error ? err.message : String(err)}`);
+  }
+
+  if (!bundle.site || !bundle.skills || !Array.isArray(bundle.skills)) {
+    throw new Error('Invalid bundle format: missing site or skills.');
+  }
+
+  // Validate site
+  const siteResult = validateImportableSite(bundle.site);
+  if (!siteResult.valid) {
+    throw new Error(`Site validation failed:\n  ${siteResult.errors.join('\n  ')}`);
+  }
+
+  // Validate each skill; warn + skip invalid ones
+  const validSkills: SkillSpec[] = [];
+  const skipped: string[] = [];
+  const expectedSiteId = bundle.site.id;
+
+  for (const skill of bundle.skills) {
+    const skillResult = validateImportableSkill(skill);
+    if (!skillResult.valid) {
+      const label = (skill as unknown as Record<string, unknown>).id ?? '(unknown)';
+      console.warn(
+        `Warning: skill '${label}' failed validation -- skipping.\n  ${skillResult.errors.join('\n  ')}`,
+      );
+      skipped.push(String(label));
+      continue;
+    }
+
+    if (Array.isArray(skill.allowedDomains) && skill.allowedDomains.length === 0) {
+      console.warn(
+        `Warning: skill '${skill.id}' has no allowedDomains -- may not execute without a domain policy.`,
+      );
+    }
+
+    if (skill.siteId !== expectedSiteId) {
+      console.warn(
+        `Warning: skill '${skill.id}' has siteId '${skill.siteId}', expected '${expectedSiteId}'. Skipping.`,
+      );
+      skipped.push(skill.id);
+      continue;
+    }
+
+    validSkills.push(skill);
+  }
+
+  // Check for overwrites — track corrupt rows separately
+  const { db, skillRepo, siteRepo } = deps;
+  let existingSite: SiteManifest | undefined;
+  let siteCorrupt = false;
+  try {
+    existingSite = siteRepo.getById(bundle.site.id);
+  } catch {
+    siteCorrupt = true;
+    console.warn(`Warning: existing site '${bundle.site.id}' has corrupt data — will overwrite.`);
+  }
+
+  const overwriteIds: string[] = [];
+  const corruptIds: string[] = [];
+  const existingCreatedAt = new Map<string, number>();
+  let newCount = 0;
+  for (const skill of validSkills) {
+    try {
+      const existing = skillRepo.getById(skill.id);
+      if (existing) {
+        overwriteIds.push(skill.id);
+        if (existing.createdAt) existingCreatedAt.set(skill.id, existing.createdAt);
+      } else {
+        newCount++;
+      }
+    } catch {
+      corruptIds.push(skill.id);
+      console.warn(`Warning: existing skill '${skill.id}' has corrupt data — will overwrite.`);
+    }
+  }
+  const existingCount = overwriteIds.length + corruptIds.length;
+
+  // Preview
+  console.log(`Import preview for '${file}':`);
+  console.log(`  Site: ${bundle.site.id} (${existingSite ? 'will update' : 'will create'})`);
+  console.log(`  Valid skills: ${validSkills.length}`);
+  if (skipped.length > 0) {
+    console.log(`  Skipped (invalid): ${skipped.length}`);
+  }
+  if (existingCount > 0) {
+    console.log(`  Will overwrite: ${existingCount} existing skill(s)`);
+    for (const id of overwriteIds) console.log(`    overwrite: ${id}`);
+    for (const id of corruptIds) console.log(`    overwrite (corrupt): ${id}`);
+  }
+
+  // Policy preview
+  const policyWarnings: string[] = [];
+  if (bundle.policy) {
+    console.log(`  Policy: will ${existingSite ? 'replace' : 'set'}`);
+    const currentPolicy = getSitePolicy(bundle.site.id, deps.config);
+    if (bundle.policy.maxConcurrent !== currentPolicy.maxConcurrent) {
+      policyWarnings.push(`maxConcurrent: current=${currentPolicy.maxConcurrent}, import=${bundle.policy.maxConcurrent}`);
+    }
+  }
+  if (policyWarnings.length > 0) {
+    console.log(`  Policy changes: ${policyWarnings.join('; ')}`);
+  }
+
+  // Confirmation — require when anything will be overwritten
+  if ((existingCount > 0 || existingSite || siteCorrupt) && !options.yes) {
+    if (!process.stdin.isTTY) {
+      throw new Error('Non-interactive terminal: use --yes to confirm import.');
+    }
+    const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
+    const answer = await new Promise<string>(resolve => rl.question('Proceed with import? [y/N] ', resolve));
+    rl.close();
+    if (answer.toLowerCase() !== 'y') {
+      return { created: 0, updated: 0, skipped: skipped.length, hasAuthSkills: false, policyWarnings, cancelled: true };
+    }
+  }
+
+  // Fill defaults for NOT NULL DB fields
+  const now = Date.now();
+  for (const skill of validSkills) {
+    if (!skill.name) {
+      const parts = skill.id.split('.');
+      skill.name = parts.length >= 2 ? parts[parts.length - 2] : skill.id;
+    }
+    if (skill.inputSchema === undefined) skill.inputSchema = {};
+    if (skill.sideEffectClass === undefined) skill.sideEffectClass = 'read-only';
+    if (skill.currentTier === undefined) skill.currentTier = 'tier_3';
+    if (skill.status === undefined) skill.status = 'draft';
+    if (skill.confidence === undefined) skill.confidence = 0;
+    if (skill.consecutiveValidations === undefined) skill.consecutiveValidations = 0;
+    if (skill.sampleCount === undefined) skill.sampleCount = 0;
+    if (skill.successRate === undefined) skill.successRate = 0;
+    if (skill.version === undefined) skill.version = 1;
+    if (skill.allowedDomains === undefined) skill.allowedDomains = [];
+    if (skill.isComposite === undefined) skill.isComposite = false;
+    if (skill.directCanaryEligible === undefined) skill.directCanaryEligible = false;
+    if (skill.directCanaryAttempts === undefined) skill.directCanaryAttempts = 0;
+    if (skill.validationsSinceLastCanary === undefined) skill.validationsSinceLastCanary = 0;
+    if (skill.createdAt === undefined) {
+      skill.createdAt = existingCreatedAt.get(skill.id) ?? now;
+    }
+    if (skill.updatedAt === undefined) skill.updatedAt = now;
+  }
+
+  // Phase 1: Site + skills in a single synchronous transaction
+  const corruptSet = new Set(corruptIds);
+  const overwriteSet = new Set(overwriteIds);
+  let created = 0;
+  let updated = 0;
+  let siteAction: 'created' | 'updated';
+
+  db.transaction(() => {
+    if (existingSite && !siteCorrupt) {
+      siteRepo.update(bundle.site.id, bundle.site);
+      siteAction = 'updated';
+    } else {
+      // Delete corrupt/stale row (cascade may delete skills too)
+      try { siteRepo.delete(bundle.site.id); } catch { /* row may not exist */ }
+      siteRepo.create(bundle.site);
+      siteAction = 'created';
+    }
+
+    if (siteCorrupt) {
+      // Site was deleted+recreated → cascade killed all skills → all are creates
+      for (const skill of validSkills) {
+        skillRepo.create(skill);
+        created++;
+      }
+    } else {
+      for (const skill of validSkills) {
+        if (corruptSet.has(skill.id)) {
+          try { skillRepo.delete(skill.id); } catch { /* may already be gone */ }
+          skillRepo.create(skill);
+          updated++;
+        } else if (overwriteSet.has(skill.id)) {
+          skillRepo.update(skill.id, skill);
+          updated++;
+        } else {
+          skillRepo.create(skill);
+          created++;
+        }
+      }
+    }
+  });
+
+  // Phase 2: Policy (separate write — setSitePolicy does its own DB call)
+  if (bundle.policy) {
+    const policyResult = validateAndNormalizeImportablePolicy(bundle.policy, bundle.site.id);
+    if (!policyResult.valid || !policyResult.value) {
+      console.error(`Warning: policy import failed validation: ${policyResult.errors.join('; ')}`);
+    } else {
+      const p = policyResult.value;
+      try {
+        const result = setSitePolicy(p, deps.config);
+        if (!result.persisted) {
+          console.error('Warning: policy imported to cache but failed to persist to DB.');
+        }
+      } catch (err) {
+        console.error(`Warning: policy import failed: ${err instanceof Error ? err.message : String(err)}`);
+      }
+    }
+  }
+
+  const hasAuthSkills = validSkills.some((s: SkillSpec) => s.authType != null);
+
+  return { created, updated, skipped: skipped.length, siteAction: siteAction!, hasAuthSkills, policyWarnings };
+}
diff --git a/src/app/service.ts b/src/app/service.ts
index c1decb8..22db8ee 100644
--- a/src/app/service.ts
+++ b/src/app/service.ts
@@ -14,6 +14,7 @@ import { shouldAutoConfirm } from '../server/skill-helpers.js';
 
 type ExecuteSkillResult =
   | { status: 'executed'; result: SkillExecutionResult }
+  | { status: 'browser_handoff_required'; result: SkillExecutionResult }
   | {
       status: 'confirmation_required';
       skillId: string;
@@ -70,6 +71,8 @@ interface ExportedCookie {
 
 export interface PipelineJobResult {
   skillsGenerated: number;
   signalCount: number;
+  htmlDocumentCount?: number;
+  ambiguousCount?: number;
   noiseCount: number;
   totalCount: number;
   warning?: string;
@@ -185,6 +188,9 @@
     }
 
     const result = await this.deps.engine.executeSkill(skillId, params, callerId);
+    if (result.status === 'browser_handoff_required') {
+      return { status: 'browser_handoff_required', result };
+    }
     return { status: 'executed', result };
   }
@@ -263,7 +269,7 @@ export class SchruteService {
   listSessions(): SessionInfo[] {
     const msm = this.deps.engine.getMultiSessionManager();
     const activeName = msm.getActive();
-    return msm.list().map(s => ({
+    return msm.list(undefined, this.deps.config, { includeInternal: false }).map(s => ({
       name: s.name,
       siteId: s.siteId,
       isCdp: s.isCdp,
diff --git a/src/automation/classifier.ts b/src/automation/classifier.ts
index 83d17ef..f98724f 100644
--- a/src/automation/classifier.ts
+++ b/src/automation/classifier.ts
@@ -98,8 +98,9 @@ export function classifySite(
     // JS-computed fields require full browser for replay
     recommendedTier = ExecutionTier.FULL_BROWSER;
   } else if (authRequired) {
-    // Auth (with or without dynamic fields) typically needs cookie refresh tier
-    recommendedTier = ExecutionTier.COOKIE_REFRESH;
+    // Auth-required traffic should stay on the browser-backed path until
+    // a real persistent cookie-refresh tier exists.
+    recommendedTier = ExecutionTier.BROWSER_PROXIED;
   } else if (graphqlDetected) {
     // GraphQL without auth — direct fetch may work
     recommendedTier = ExecutionTier.DIRECT;
diff --git a/src/automation/rate-limiter.ts b/src/automation/rate-limiter.ts
index 234b6a6..46f9c4c 100644
--- a/src/automation/rate-limiter.ts
+++ b/src/automation/rate-limiter.ts
@@ -19,11 +19,20 @@ interface RateCheckResult {
   retryAfterMs?: number;
 }
 
+interface RateCheckOptions {
+  minGapMs?: number;
+}
+
+interface WaitForPermitOptions extends RateCheckOptions {
+  timeoutMs?: number;
+}
+
 interface SiteBucket {
   tokens: number;
   maxTokens: number;
   refillRate: number; // tokens per second
   lastRefill: number;
+  lastGrantedAt: number;
   backoffUntil: number;
   backoffMultiplier: number;
   latencyEwa: number;
@@ -95,63 +104,78 @@ export class RateLimiter {
    * 1. Global site bucket — protects upstream API (shared across all callers)
    * 2. Per-caller sub-bucket — ensures fairness (one caller can't exhaust the budget)
    */
-  checkRate(siteId: string, callerId?: string): RateCheckResult {
+  checkRate(siteId: string, callerId?: string, options?: RateCheckOptions): RateCheckResult {
+    return this.tryAcquirePermit(siteId, callerId, options);
+  }
+
+  async waitForPermit(
+    siteId: string,
+    callerId?: string,
+    options?: WaitForPermitOptions,
+  ): Promise<RateCheckResult> {
+    const timeoutMs = Math.max(options?.timeoutMs ?? 30_000, 0);
+    const deadline = Date.now() + timeoutMs;
+
+    while (true) {
+      const result = this.tryAcquirePermit(siteId, callerId, options);
+      if (result.allowed) {
+        return result;
+      }
+
+      const retryAfterMs = Math.max(Math.ceil(result.retryAfterMs ?? 0), 1);
+      if (Date.now() + retryAfterMs > deadline) {
+        return result;
+      }
+
+      await new Promise<void>((resolve) => setTimeout(resolve, retryAfterMs));
+    }
+  }
+
+  /**
+   * Two-tier rate check:
+   * 1. Global site bucket — protects upstream API (shared across all callers)
+   * 2. Per-caller sub-bucket — ensures fairness (one caller can't exhaust the budget)
+   */
+  private tryAcquirePermit(
+    siteId: string,
+    callerId?: string,
+    options?: RateCheckOptions,
+  ): RateCheckResult {
+    const minGapMs = Math.max(Math.floor(options?.minGapMs ?? 0), 0);
+
     // 1. Check global site bucket first — protects upstream
     const siteBucket = this.getOrCreateSiteBucket(siteId);
     this.refillTokens(siteBucket);
     const now = Date.now();
+    let retryAfterMs = this.getBucketRetryAfterMs(siteBucket, now);
+    retryAfterMs = Math.max(retryAfterMs, this.getMinGapRetryAfterMs(siteBucket, now, minGapMs));
 
-    // Check site-level backoff
-    if (now < siteBucket.backoffUntil) {
-      const retryAfterMs = siteBucket.backoffUntil - now;
-      log.debug(
-        { siteId, retryAfterMs },
-        'Rate limited: in backoff period',
-      );
-      return { allowed: false, retryAfterMs };
+    // 2. If callerId provided, also check per-caller sub-bucket
+    let callerBucket: SiteBucket | undefined;
+    if (callerId) {
+      const callerKey = `${siteId}::${callerId}`;
+      callerBucket = this.getOrCreateCallerBucket(callerKey, siteBucket);
+      this.refillTokens(callerBucket);
+      retryAfterMs = Math.max(retryAfterMs, this.getBucketRetryAfterMs(callerBucket, now));
     }
 
-    // Check site-level tokens
-    if (siteBucket.tokens < 1) {
-      const retryAfterMs = Math.ceil((1 - siteBucket.tokens) / siteBucket.refillRate * 1000);
+    if (retryAfterMs > 0) {
       log.debug(
-        { siteId, tokens: siteBucket.tokens, retryAfterMs },
-        'Rate limited: insufficient tokens',
+        { siteId, callerId, retryAfterMs, minGapMs },
+        'Rate limited: permit unavailable',
       );
       return { allowed: false, retryAfterMs };
     }
 
-    // 2. If callerId provided, also check per-caller sub-bucket
-    if (callerId) {
-      const callerKey = `${siteId}::${callerId}`;
-      const callerBucket = this.getOrCreateCallerBucket(callerKey, siteBucket);
-      this.refillTokens(callerBucket);
-
-      if (now < callerBucket.backoffUntil) {
-        const retryAfterMs = callerBucket.backoffUntil - now;
-        log.debug(
-          { siteId, callerId, retryAfterMs },
-          'Rate limited: caller in backoff period',
-        );
-        return { allowed: false, retryAfterMs };
-      }
-
-      if (callerBucket.tokens < 1) {
-        const retryAfterMs = Math.ceil((1 - callerBucket.tokens) / callerBucket.refillRate * 1000);
-        log.debug(
-          { siteId, callerId, tokens: callerBucket.tokens, retryAfterMs },
-          'Rate limited: caller insufficient tokens',
-        );
-        return { allowed: false, retryAfterMs };
-      }
-
-      // Consume from caller bucket
+    if (callerBucket) {
       callerBucket.tokens -= 1;
+      callerBucket.lastGrantedAt = now;
     }
 
     // 3. Consume from global bucket
     siteBucket.tokens -= 1;
+    siteBucket.lastGrantedAt = now;
 
     return { allowed: true };
   }
@@ -275,6 +299,7 @@ export class RateLimiter {
       maxTokens: Math.max(Math.ceil(this.defaultQps * BURST_CAPACITY_MULTIPLIER), 1),
       refillRate: this.defaultQps,
       lastRefill: Date.now(),
+      lastGrantedAt: 0,
       backoffUntil: 0,
       backoffMultiplier: INITIAL_BACKOFF_MULTIPLIER,
       latencyEwa: 0,
@@ -294,6 +319,7 @@
       maxTokens,
       refillRate: Math.max(0.1, siteBucket.refillRate * this.callerFraction),
       lastRefill: Date.now(),
+      lastGrantedAt: 0,
       backoffUntil: 0,
       backoffMultiplier: INITIAL_BACKOFF_MULTIPLIER,
       latencyEwa: 0,
@@ -314,6 +340,32 @@
     bucket.lastRefill = now;
   }
 
+  private getBucketRetryAfterMs(bucket: SiteBucket, now: number): number {
+    let retryAfterMs = 0;
+
+    if (now < bucket.backoffUntil) {
+      retryAfterMs = Math.max(retryAfterMs, bucket.backoffUntil - now);
+    }
+
+    if (bucket.tokens < 1) {
+      retryAfterMs = Math.max(
+        retryAfterMs,
+        Math.ceil((1 - bucket.tokens) / bucket.refillRate * 1000),
+      );
+    }
+
+    return retryAfterMs;
+  }
+
+  private getMinGapRetryAfterMs(bucket: SiteBucket, now: number, minGapMs: number): number {
+    if (minGapMs <= 0 || bucket.lastGrantedAt === 0) {
+      return 0;
+    }
+
+    const nextAllowedAt = bucket.lastGrantedAt + minGapMs;
+    return nextAllowedAt > now ? nextAllowedAt - now : 0;
+  }
+
   private parseRetryAfter(headers: Record<string, string>): number | null {
     const value = this.getHeaderCaseInsensitive(headers, 'retry-after');
     if (!value) return null;
diff --git a/src/browser/agent-browser-backend.ts b/src/browser/agent-browser-backend.ts
index be82905..6c7c8f3 100644
--- a/src/browser/agent-browser-backend.ts
+++ b/src/browser/agent-browser-backend.ts
@@ -6,7 +6,12 @@ import type { BrowserProvider, SchruteConfig } from '../skill/types.js';
 import type { BrowserAuthStore } from './auth-store.js';
 import type { AuthCoordinator } from './auth-coordinator.js';
 import type { AgentBrowserProvider } from './agent-browser-provider.js';
-import { AgentBrowserIpcClient, resolveSocketDir } from './agent-browser-ipc.js';
+import { AgentBrowserIpcClient } from './agent-browser-ipc.js';
+import {
+  closeAgentBrowserSession,
+  removeAgentBrowserSessionMetadata,
+  writeAgentBrowserSessionMetadata,
+} from './agent-browser-cleanup.js';
 
 const log = getLogger();
 
@@ -17,7 +22,12 @@ const PROBE_COOLDOWN_MS = 60_000;
  * Uses once-promise probe to detect availability.
  */
 export class AgentBrowserBackend implements BrowserBackend {
-  private sessions = new Map<string, { provider: AgentBrowserProvider; ipc: AgentBrowserIpcClient }>();
+  private sessions = new Map<string, {
+    provider: AgentBrowserProvider;
+    ipc: AgentBrowserIpcClient;
+    sessionName: string;
+    lastUsedAt: number;
+  }>();
   private daemonAvailable: boolean | null = null;
   private probePromise: Promise<boolean> | null = null;
   private lastFailTime = 0;
@@ -57,6 +67,7 @@ export class AgentBrowserBackend implements BrowserBackend {
 
     const existingEntry = this.sessions.get(siteId);
     if (existingEntry) {
+      existingEntry.lastUsedAt = Date.now();
       // Stale-auth safety net: check if our cached session has outdated auth
       if (this.authCoordinator && this.authStore) {
         const participantId = `exec-ab:${siteId}`;
@@ -76,15 +87,22 @@
       }
     }
 
+    const sessionName = `exec-${siteId.replace(/[^a-zA-Z0-9_-]/g, '_')}`;
+    let ipc: AgentBrowserIpcClient | undefined;
     try {
       const { AgentBrowserProvider: ABProvider } = await import('./agent-browser-provider.js');
-      const sessionName = `exec-${siteId.replace(/[^a-zA-Z0-9_-]/g, '_')}`;
-      const ipc = new AgentBrowserIpcClient();
+      ipc = new AgentBrowserIpcClient();
+      writeAgentBrowserSessionMetadata(this.config, {
+        sessionName,
+        siteId,
+        createdAt: Date.now(),
+        purpose: 'exec',
+      });
       await ipc.bootstrapDaemon(sessionName);
       await ipc.connect(sessionName);
 
-      const provider = new ABProvider(ipc, domains);
+      const provider = new ABProvider(ipc, domains, () => this.touchSession(siteId));
 
       // Hydrate with auth state if available (cookies only for agent-browser)
       if (this.authStore) {
@@ -94,7 +112,7 @@
         }
       }
 
-      this.sessions.set(siteId, { provider, ipc });
+      this.sessions.set(siteId, { provider, ipc, sessionName, lastUsedAt: Date.now() });
 
       // Register as auth coordinator participant
       if (this.authCoordinator) {
@@ -114,6 +132,7 @@
 
       return provider;
     } catch (err) {
+      await this.bestEffortCloseBootstrapSession(sessionName, ipc);
       log.warn({ err, siteId }, 'AgentBrowserBackend.createProvider failed');
       return undefined;
     }
@@ -122,12 +141,14 @@
 
   async getCookies(siteId: string): Promise<CookieEntry[]> {
     const entry = this.sessions.get(siteId);
     if (!entry) return [];
+    entry.lastUsedAt = Date.now();
     return entry.provider.getCookies();
   }
 
   async setCookies(siteId: string, cookies: CookieEntry[]): Promise<void> {
     const entry = this.sessions.get(siteId);
     if (entry) {
+      entry.lastUsedAt = Date.now();
       await entry.provider.hydrateCookies(cookies);
     }
   }
@@ -189,8 +210,7 @@
 
     // Unregister from coordinator before closing
     this.authCoordinator?.unregister(`exec-ab:${siteId}`);
-    await entry.provider.close();
-    entry.ipc.close();
+    await this.closeTrackedSession(entry);
     this.sessions.delete(siteId);
   }
 
@@ -198,8 +218,7 @@
     const entry = this.sessions.get(siteId);
     if (entry) {
       this.authCoordinator?.unregister(`exec-ab:${siteId}`);
-      await entry.provider.close();
-      entry.ipc.close();
+      await this.closeTrackedSession(entry);
       this.sessions.delete(siteId);
     }
   }
@@ -215,14 +234,28 @@
     }
     const closePromises = [...this.sessions.values()].map(async (entry) => {
       try {
-        await entry.provider.close();
-        entry.ipc.close();
+        await this.closeTrackedSession(entry);
       } catch (err) {
         log.debug({ err }, 'Session close failed during shutdown');
       }
     });
     await Promise.allSettled(closePromises);
     this.sessions.clear();
   }
 
+  async sweepIdleSessions(idleTimeoutMs: number): Promise<void> {
+    if (idleTimeoutMs <= 0) return;
+
+    const now = Date.now();
+    const idleEntries = [...this.sessions.entries()]
+      .filter(([, entry]) => now - entry.lastUsedAt > idleTimeoutMs);
+
+    await Promise.allSettled(idleEntries.map(async ([siteId, entry]) => {
+      this.authCoordinator?.unregister(`exec-ab:${siteId}`);
+      await this.closeTrackedSession(entry);
+      this.sessions.delete(siteId);
+      log.debug({ siteId, sessionName: entry.sessionName, idleForMs: now - entry.lastUsedAt }, 'Swept idle agent-browser session');
+    }));
+  }
+
   /**
    * Reset the probe cache for testing.
    */
@@ -243,6 +276,12 @@
     const sessionName = `__prefetch_${siteId.replace(/[^a-zA-Z0-9_-]/g, '_')}_${Date.now()}`;
     const ipc = new AgentBrowserIpcClient();
+    writeAgentBrowserSessionMetadata(this.config, {
+      sessionName,
+      siteId,
+      createdAt: Date.now(),
+      purpose: 'prefetch',
+    });
 
     try {
       await ipc.bootstrapDaemon(sessionName);
@@ -271,15 +310,14 @@
       const cookies: CookieEntry[] = Array.isArray(result) ? result : [];
 
       // Tear down ephemeral session
-      try { await ipc.send({ action: 'close' }); } catch (err) { log.debug({ err }, 'IPC close send failed'); }
-      ipc.close();
+      await this.closeEphemeralSession(sessionName, ipc);
 
       return cookies;
     } catch (err) {
       // Clean up IPC client but re-throw — callers (Promise.allSettled in
       // prefetchStaleAuth) must distinguish failure from empty cookie jar
      // to avoid wiping canonical auth state.
-      try { ipc.close(); } catch (err2) { log.debug({ err: err2 }, 'IPC cleanup failed'); }
+      await this.closeEphemeralSession(sessionName, ipc);
       throw err;
     }
   }
@@ -360,6 +398,11 @@
     // Step 2: Bootstrap probe session
     const probeName = `__probe_${process.pid}_${Date.now()}__`;
     const ipc = new AgentBrowserIpcClient();
+    writeAgentBrowserSessionMetadata(this.config, {
+      sessionName: probeName,
+      createdAt: Date.now(),
+      purpose: 'probe',
+    });
     try {
       await ipc.bootstrapDaemon(probeName);
       await ipc.connect(probeName);
@@ -368,10 +411,9 @@
       await ipc.send({ action: 'url' });
 
       // Step 4: Tear down probe
-      await ipc.send({ action: 'close' });
-      ipc.close();
+      await this.closeEphemeralSession(probeName, ipc);
     } catch (err) {
-      ipc.close();
+      await this.closeEphemeralSession(probeName, ipc);
       throw err;
     }
 
@@ -384,4 +426,81 @@
 
       return false;
     }
   }
+
+  private async closeTrackedSession(
+    entry: { provider: AgentBrowserProvider; ipc: AgentBrowserIpcClient; sessionName: string; lastUsedAt: number },
+  ): Promise<void> {
+    let closedRemotely = false;
+    try {
+      await entry.provider.close();
+      closedRemotely = true;
+    } catch (err) {
+      log.debug({ err, sessionName: entry.sessionName }, 'Tracked agent-browser provider close failed');
+    }
+
+    try {
+      entry.ipc.close();
+    } catch (err) {
+      log.debug({ err, sessionName: entry.sessionName }, 'Tracked agent-browser IPC cleanup failed');
+    }
+
+    if (!closedRemotely) {
+      try {
+        await closeAgentBrowserSession(entry.sessionName);
+      } catch (err) {
+        log.debug({ err, sessionName: entry.sessionName }, 'Fallback agent-browser close failed');
+      }
+    }
+
+    removeAgentBrowserSessionMetadata(this.config, entry.sessionName);
+  }
+
+  private touchSession(siteId: string): void {
+    const entry = this.sessions.get(siteId);
+    if (entry) {
+      entry.lastUsedAt = Date.now();
+    }
+  }
+
+  private async closeEphemeralSession(sessionName: string, ipc: AgentBrowserIpcClient): Promise<void> {
+    let closedRemotely = false;
+    if (ipc.isConnected()) {
+      try {
+        await ipc.send({ action: 'close' });
+        closedRemotely = true;
+      } catch (err) {
+        log.debug({ err, sessionName }, 'Ephemeral agent-browser close over IPC failed');
+      }
+    }
+
+    try {
+      ipc.close();
+    } catch (err) {
+      log.debug({ err, sessionName }, 'Ephemeral agent-browser IPC cleanup failed');
+    }
+
+    if (!closedRemotely) {
+      try {
+        await closeAgentBrowserSession(sessionName);
+      } catch (err) {
+        log.debug({ err, sessionName }, 'Fallback agent-browser close failed');
+      }
+    }
+
+    removeAgentBrowserSessionMetadata(this.config, sessionName);
+  }
+
+  private async bestEffortCloseBootstrapSession(sessionName: string, ipc?: AgentBrowserIpcClient): Promise<void> {
+    try {
+      ipc?.close();
+    } catch (err) {
+      log.debug({ err, sessionName }, 'Bootstrap IPC cleanup failed');
+    }
+    try {
+      await closeAgentBrowserSession(sessionName);
+    } catch (err) {
+      log.debug({ err, sessionName }, 'Bootstrap cleanup close failed');
+    }
+    removeAgentBrowserSessionMetadata(this.config, sessionName);
+  }
 }
diff --git a/src/browser/agent-browser-cleanup.ts b/src/browser/agent-browser-cleanup.ts
new file mode 100644
index 0000000..503973b
--- /dev/null
+++ b/src/browser/agent-browser-cleanup.ts
@@ -0,0 +1,114 @@
+import { execFile, execFileSync } from 'node:child_process';
+import * as fs from 'node:fs';
+import * as path from 'node:path';
+import { getBrowserDataDir } from '../core/config.js';
+import type { SchruteConfig } from '../skill/types.js';
+
+const AGENT_BROWSER_SESSION_DIR = 'agent-browser-sessions';
+const AGENT_BROWSER_CLOSE_TIMEOUT_MS = 10_000;
+
+export interface AgentBrowserSessionMetadata {
+  sessionName: string;
+  createdAt: number;
+  siteId?: string;
+  purpose?: 'exec' | 'prefetch' | 'probe';
+}
+
+export function getAgentBrowserSessionRoot(config: SchruteConfig): string {
+  return path.join(getBrowserDataDir(config), AGENT_BROWSER_SESSION_DIR);
+}
+
+function getAgentBrowserSessionPath(config: SchruteConfig, sessionName: string): string {
+  return path.join(getAgentBrowserSessionRoot(config), `${encodeURIComponent(sessionName)}.json`);
+}
+
+export function writeAgentBrowserSessionMetadata(
+  config: SchruteConfig,
+  metadata: AgentBrowserSessionMetadata,
+): void {
+  const root = getAgentBrowserSessionRoot(config);
+  fs.mkdirSync(root, { recursive: true, mode: 0o700 });
+  fs.writeFileSync(getAgentBrowserSessionPath(config, metadata.sessionName), JSON.stringify(metadata), {
+    mode: 0o600,
+  });
+}
+
+export function removeAgentBrowserSessionMetadata(config: SchruteConfig, sessionName: string): void {
+  try {
+    fs.rmSync(getAgentBrowserSessionPath(config, sessionName), { force: true });
+  } catch {
+    // Best effort.
+  }
+}
+
+export function listAgentBrowserSessionMetadata(config: SchruteConfig): AgentBrowserSessionMetadata[] {
+  const root = getAgentBrowserSessionRoot(config);
+  if (!fs.existsSync(root)) return [];
+
+  const entries: AgentBrowserSessionMetadata[] = [];
+  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
+    if (!entry.isFile() || !entry.name.endsWith('.json')) continue;
+    try {
+      const raw = JSON.parse(fs.readFileSync(path.join(root, entry.name), 'utf-8')) as Partial<AgentBrowserSessionMetadata>;
+      if (typeof raw.sessionName !== 'string' || !raw.sessionName) continue;
+      entries.push({
+        sessionName: raw.sessionName,
+        createdAt: typeof raw.createdAt === 'number' ? raw.createdAt : Date.now(),
+        ...(typeof raw.siteId === 'string' ? { siteId: raw.siteId } : {}),
+        ...(raw.purpose === 'exec' || raw.purpose === 'prefetch' || raw.purpose === 'probe'
+          ? { purpose: raw.purpose }
+          : {}),
+      });
+    } catch {
+      // Skip malformed metadata.
+    }
+  }
+  return entries;
+}
+
+export async function closeAgentBrowserSession(sessionName: string): Promise<void> {
+  await new Promise<void>((resolve, reject) => {
+    execFile(
+      'agent-browser',
+      ['--session', sessionName, '--json', 'close'],
+      { timeout: AGENT_BROWSER_CLOSE_TIMEOUT_MS },
+      (err) => {
+        if (err) {
+          reject(new Error(`Failed to close agent-browser session '${sessionName}': ${err.message}`));
+        } else {
+          resolve();
+        }
+      },
+    );
+  });
+}
+
+export function closeAgentBrowserSessionSync(sessionName: string): void {
+  execFileSync(
+    'agent-browser',
+    ['--session', sessionName, '--json', 'close'],
+    { timeout: AGENT_BROWSER_CLOSE_TIMEOUT_MS, stdio: 'ignore' },
+  );
+}
+
+export async function cleanupAgentBrowserSessions(config: SchruteConfig): Promise<void> {
+  for (const metadata of listAgentBrowserSessionMetadata(config)) {
+    try {
+      await closeAgentBrowserSession(metadata.sessionName);
+    } catch {
+      // Best effort. Stale metadata should still be removed.
+    }
+    removeAgentBrowserSessionMetadata(config, metadata.sessionName);
+  }
+}
+
+export function cleanupAgentBrowserSessionsSync(config: SchruteConfig): void {
+  for (const metadata of listAgentBrowserSessionMetadata(config)) {
+    try {
+      closeAgentBrowserSessionSync(metadata.sessionName);
+    } catch {
+      // Best effort.
+    }
+    removeAgentBrowserSessionMetadata(config, metadata.sessionName);
+  }
+}
diff --git a/src/browser/agent-browser-provider.ts b/src/browser/agent-browser-provider.ts
index ac75c0b..a120a0b 100644
--- a/src/browser/agent-browser-provider.ts
+++ b/src/browser/agent-browser-provider.ts
@@ -10,6 +10,7 @@ import type {
 } from '../skill/types.js';
 import type { CookieEntry } from './backend.js';
 import type { AgentBrowserIpcClient } from './agent-browser-ipc.js';
+import { isCloudflareChallengeSignal } from '../shared/cloudflare-challenge.js';
 
 const log = getLogger();
 
@@ -23,9 +24,11 @@ export class AgentBrowserProvider implements BrowserProvider {
   constructor(
     private ipc: AgentBrowserIpcClient,
     private allowedDomains: string[],
+    private onActivity?: () => void,
   ) {}
 
   async navigate(url: string): Promise<void> {
+    this.onActivity?.();
     await this.ipc.send({ action: 'navigate', url });
     // Refresh cached URL from daemon
     const urlResp = await this.ipc.send({ action: 'url' }) as { url?: string } | string;
@@ -33,6 +36,7 @@
   }
 
   async snapshot(): Promise {
+    this.onActivity?.();
     const result = await this.ipc.send({ action: 'snapshot', interactive: true }) as {
       snapshot?: string;
       refs?: object;
@@ -49,6 +53,7 @@
   }
 
   async click(ref: string): Promise<void> {
+    this.onActivity?.();
     await this.ipc.send({ action: 'click', selector: ref });
     // Refresh cached URL (navigation may have occurred)
     const urlResp = await this.ipc.send({ action: 'url' }) as { url?: string } | string;
@@ -56,10 +61,12 @@
   }
 
   async type(ref: string, text: string): Promise<void> {
+    this.onActivity?.();
     await this.ipc.send({ action: 'fill', selector: ref, value: text });
   }
 
   async evaluateFetch(req: SealedFetchRequest): Promise {
+    this.onActivity?.();
     // Domain check before executing in browser
     try {
       const url = new URL(req.url);
@@ -98,6 +105,7 @@ export class AgentBrowserProvider implements BrowserProvider {
   }
 
   async evaluateModelContext(req: SealedModelContextRequest): Promise {
+    this.onActivity?.();
     const script = `await (async () => {
       const mc = navigator.modelContext;
       if (!mc || typeof mc.callTool !== 'function') {
@@ -120,6 +128,7 @@
   }
 
   async listModelContextTools(): Promise {
+    this.onActivity?.();
     const script = `await (async () => {
       const mc = navigator.modelContext;
       if (!mc) return JSON.stringify({ result: null, error: 'WebMCP not available' });
@@ -168,6 +177,7 @@
   }
 
   async screenshot(): Promise {
+    this.onActivity?.();
     const result = await this.ipc.send({ action: 'screenshot', format: 'png' }) as {
       base64?: string;
     };
@@ -175,16 +185,30 @@
   }
 
   async networkRequests(): Promise {
+    this.onActivity?.();
     const result = await this.ipc.send({ action: 'network_requests' });
     if (Array.isArray(result)) return result;
     return [];
   }
 
+  async detectChallengePage(): Promise<boolean> {
+    try {
+      const snapshot = await this.snapshot();
+      return isCloudflareChallengeSignal({
+        url: this.currentUrl,
+        content: snapshot.content ?? '',
+      });
+    } catch {
+      return false;
+    }
+  }
+
   getCurrentUrl(): string {
     return this.currentUrl;
   }
 
   async getCookies(): Promise<CookieEntry[]> {
+    this.onActivity?.();
     // Let IPC failures propagate — callers must distinguish read-failure
     // from genuine empty cookie jar to avoid wiping canonical auth state.
     const result = await this.ipc.send({ action: 'cookies_get' });
@@ -193,6 +217,7 @@
   }
 
   async hydrateCookies(cookies: CookieEntry[]): Promise<void> {
+    this.onActivity?.();
     // Batch entire array in ONE command
     await this.ipc.send({ action: 'cookies_set', cookies });
   }
diff --git a/src/browser/base-browser-adapter.ts b/src/browser/base-browser-adapter.ts
index bb650fe..c4be114 100644
--- a/src/browser/base-browser-adapter.ts
+++ b/src/browser/base-browser-adapter.ts
@@ -41,11 +41,13 @@ import { resizeScreenshotBuffer, estimateScale, DEFAULT_MAX_DIMENSION, DEFAULT_M
 import { humanMousePreamble } from './human-input.js';
 import { isObviousNoise, shouldCaptureResponseBody } from '../capture/noise-filter.js';
 import { getLogger } from '../core/logger.js';
+import { isCloudflareChallengeSignal } from '../shared/cloudflare-challenge.js';
 
 const log = getLogger();
 
-// Cloudflare page title patterns — used for both challenge detection and snapshot warnings
-const CF_CHALLENGE_TITLE_RE = /^Just a moment\b|Attention Required!.*Cloudflare|Verify you are human/i;
+// Cloudflare page title patterns — generic titles require additional corroboration,
+// while explicit Cloudflare-branded titles remain sufficient on their own.
+const CF_EXPLICIT_CHALLENGE_TITLE_RE = /Attention Required!.*Cloudflare/i;
 const CF_PHISHING_TITLE_RE = /Suspected phishing|phishing site.*Cloudflare/i;
 
 // Browser-context globals used in page.evaluate/waitForFunction callbacks.
@@ -66,19 +68,28 @@ export async function isCloudflareChallengePage(page: Page): Promise<boolean> {
   let challengeElements: boolean;
   let title: string;
+  let content: string;
   try {
-    [challengeElements, title] = await Promise.all([
+    [challengeElements, title, content] = await Promise.all([
       page.evaluate((selectors: string[]) => {
         return selectors.some(sel => document.querySelector(sel) !== null);
       }, indicators),
       page.title(),
+      page.content(),
     ]);
   } catch {
     // Page context destroyed or closed — safe to return false
     return false;
   }
 
-  return challengeElements || CF_CHALLENGE_TITLE_RE.test(title) || CF_PHISHING_TITLE_RE.test(title);
+  if (challengeElements || CF_PHISHING_TITLE_RE.test(title) || CF_EXPLICIT_CHALLENGE_TITLE_RE.test(title)) {
+    return true;
+  }
+
+  return isCloudflareChallengeSignal({
+    url: page.url(),
+    content: `${title}\n${content}`,
+  });
 }
 
 /**
@@ -96,12 +107,14 @@ export async function detectAndWaitForChallenge(page: Page, timeoutMs = 15000):
   let challengeElements: boolean;
   let title: string;
+  let content: string;
   try {
-    [challengeElements, title] = await Promise.all([
+    [challengeElements, title, content] = await Promise.all([
       page.evaluate((selectors: string[]) => {
         return selectors.some(sel => document.querySelector(sel) !== null);
       }, indicators),
       page.title(),
+      page.content(),
     ]);
   } catch {
     // Page context destroyed or closed — safe to return false
@@ -121,7 +134,11 @@ export async function detectAndWaitForChallenge(page: Page, timeoutMs = 15000):
     return false;
   }
 
-  const titleMatch = CF_CHALLENGE_TITLE_RE.test(title);
+  const titleMatch = CF_EXPLICIT_CHALLENGE_TITLE_RE.test(title)
+    || isCloudflareChallengeSignal({
+      url: page.url(),
+      content: `${title}\n${content}`,
+    });
 
   if (!challengeElements && !titleMatch) return false;
 
@@ -924,7 +941,11 @@ export abstract class BaseBrowserAdapter implements BrowserProvider {
       case 'browser_fill_form': {
         const values = args.values;
         if (typeof values !== 'object' || values === null || Array.isArray(values)) {
-          throw new Error('Expected values to be a Record');
+          throw new Error(
+            'browser_fill_form expects { values: { "": "value", ... } }. ' +
+            'Keys can be field labels, input name attributes, or @e refs from browser_snapshot. ' +
+            'For single-field input, use browser_type instead.'
+          );
         }
         await this.fillForm(values as Record);
         return { success: true };
@@ -1237,7 +1258,11 @@ export abstract class BaseBrowserAdapter implements BrowserProvider {
     }
 
     // Cloudflare challenge / interstitial page detection
-    const isChallenged = CF_CHALLENGE_TITLE_RE.test(title);
+    const isChallenged = CF_EXPLICIT_CHALLENGE_TITLE_RE.test(title)
+      || isCloudflareChallengeSignal({
+        url,
+        content: `${title}\n${content}`,
+      });
     const isPhishingPage = CF_PHISHING_TITLE_RE.test(title);
     if (isChallenged || isPhishingPage) {
       const currentEngine = this.capabilities?.effectiveEngine ?? 'unknown';
@@ -1432,6 +1457,14 @@ export abstract class BaseBrowserAdapter implements BrowserProvider {
     return this.page.url();
   }
 
+  async detectChallengePage(): Promise<boolean> {
+    const url = this.page.url();
+    if (/\/cdn-cgi\//i.test(url)) {
+      return true;
+    }
+    return isCloudflareChallengePage(this.page);
+  }
+
   private async captureScreenshot(options?: {
     format?: 'jpeg' | 'png';
     quality?: number;
diff --git a/src/browser/manager.ts b/src/browser/manager.ts
index 240efc4..d838a54 100644
--- a/src/browser/manager.ts
+++ b/src/browser/manager.ts
@@ -17,6 +17,10 @@ import type { AuthCoordinator } from './auth-coordinator.js';
 import { ParallelismGovernor } from './parallelism-governor.js';
 import { NetworkRingBuffer } from '../capture/network-ring-buffer.js';
 import { withTimeout } from '../core/utils.js';
+import {
+  writeOwnedBrowserLaunchMetadata,
+  removeOwnedBrowserLaunchMetadata,
+} from './real-browser-handoff.js';
 
 const log = getLogger();
 
@@ -145,6 +149,7 @@ export class BrowserManager {
   private reconnectAborted: boolean = false;
   private lastCdpSiteId: string | null = null; // preserved across disconnect for reconnect
private networkRingBuffer?: NetworkRingBuffer; + private ownedLaunchPid: number | null = null; constructor(config?: SchruteConfig, pool?: BrowserPool) { this.config = config; @@ -222,6 +227,30 @@ export class BrowserManager { return this.config ?? getConfig(); } + private trackOwnedBrowserLaunch(browser: Browser): void { + if (this.pool || this.cdpConnected) return; + const launched = (browser as Browser & { + process?: () => { pid?: number; spawnfile?: string } | null; + }).process?.(); + const pid = launched?.pid; + if (!pid || !Number.isInteger(pid)) return; + + this.ownedLaunchPid = pid; + writeOwnedBrowserLaunchMetadata(this.getResolvedConfig(), { + pid, + createdAt: Date.now(), + engine: this.engine, + sessionName: this.sessionName, + commandHint: launched?.spawnfile, + }); + } + + private clearOwnedBrowserLaunch(): void { + if (!this.ownedLaunchPid) return; + removeOwnedBrowserLaunchMetadata(this.getResolvedConfig(), this.ownedLaunchPid); + this.ownedLaunchPid = null; + } + /** * Get the configured handler timeout in ms. 
*/ @@ -396,8 +425,12 @@ export class BrowserManager { this.releaseToPool(); this.releaseToPool = null; this.disconnectHandler = null; + this.ownedLaunchPid = null; } else { - try { await browser?.close(); } catch { /* already closed */ } + try { + await browser?.close(); + this.clearOwnedBrowserLaunch(); + } catch { /* already closed */ } } }); } @@ -422,6 +455,7 @@ export class BrowserManager { this.disconnectHandler = () => { log.warn('Pool browser disconnected'); this.browser = null; + this.ownedLaunchPid = null; this.unregisterAllAuthParticipants(); this.contexts.clear(); if (this.idleTimer) { @@ -445,6 +479,7 @@ export class BrowserManager { } this.browser = result.browser; this.capabilities = result.capabilities; + this.trackOwnedBrowserLaunch(this.browser); if (result.capabilities.configuredEngine !== result.capabilities.effectiveEngine) { log.warn( { configured: result.capabilities.configuredEngine, effective: result.capabilities.effectiveEngine }, @@ -455,6 +490,7 @@ export class BrowserManager { this.browser.on('disconnected', () => { log.warn('Browser disconnected'); this.browser = null; + this.clearOwnedBrowserLaunch(); this.governor.release(); this.unregisterAllAuthParticipants(); this.contexts.clear(); @@ -905,9 +941,11 @@ export class BrowserManager { this.releaseToPool(); this.releaseToPool = null; this.disconnectHandler = null; + this.ownedLaunchPid = null; } else if (browser) { try { await browser.close(); + this.clearOwnedBrowserLaunch(); } catch (err) { log.warn({ err }, 'Error closing browser instance'); } @@ -920,6 +958,7 @@ export class BrowserManager { this.reconnecting = false; this.reconnectPromise = null; this.lastCdpSiteId = null; + this.ownedLaunchPid = null; log.info('Closed all browser contexts and browser'); }); diff --git a/src/browser/multi-session.ts b/src/browser/multi-session.ts index 09079f3..e1ef066 100644 --- a/src/browser/multi-session.ts +++ b/src/browser/multi-session.ts @@ -11,12 +11,18 @@ import { 
removeManagedChromeMetadata, terminateManagedChrome } from './real-brow const log = getLogger(); export const DEFAULT_SESSION_NAME = 'default'; +export type NamedSessionKind = + | 'launch' + | 'manual_cdp' + | 'recovery_explore_cdp' + | 'recovery_execute_cdp'; export interface NamedSession { name: string; siteId: string; browserManager: BrowserManager; isCdp: boolean; + sessionKind: NamedSessionKind; createdAt: number; lastUsedAt: number; ownedBy?: string; @@ -50,6 +56,7 @@ export class MultiSessionManager { siteId: '', browserManager: defaultBrowserManager, isCdp: false, + sessionKind: 'launch', createdAt: now, lastUsedAt: now, }); @@ -91,6 +98,7 @@ export class MultiSessionManager { siteId: '', browserManager: manager, isCdp: false, + sessionKind: 'launch', createdAt: now, lastUsedAt: now, }; @@ -107,6 +115,7 @@ export class MultiSessionManager { options: import('./cdp-connector.js').CdpConnectionOptions, siteId: string, ownedBy?: string, + sessionKind: NamedSessionKind = 'manual_cdp', ): Promise { if (name === DEFAULT_SESSION_NAME) { throw new Error('Cannot use "default" for CDP sessions. The default session is reserved for launch-based browser automation.'); @@ -129,6 +138,7 @@ export class MultiSessionManager { siteId, browserManager: manager, isCdp: true, + sessionKind, createdAt: now, lastUsedAt: now, ownedBy, @@ -147,22 +157,43 @@ export class MultiSessionManager { return session; } + /** + * Read a session without mutating its idle timestamp. + */ + peek(name: string): NamedSession | undefined { + return this.sessions.get(name); + } + /** * List sessions, optionally filtered by caller ownership. * In multi-user mode, non-admin callers see only their own named sessions * (default session is hidden as it contains the admin's browsing context). 
    */
-  list(callerId?: string, config?: SchruteConfig): NamedSession[] {
+  list(
+    callerId?: string,
+    config?: SchruteConfig,
+    options?: { includeInternal?: boolean },
+  ): NamedSession[] {
     const all = [...this.sessions.values()];
-    if (!callerId) return all; // admin/legacy — see everything
+    const visible = options?.includeInternal === false
+      ? all.filter(session => session.sessionKind !== 'recovery_execute_cdp')
+      : all;
+    if (!callerId) return visible; // admin/legacy — see everything
     const effectiveConfig = config ?? this.config;
-    if (!effectiveConfig || isAdminCaller(callerId, effectiveConfig)) return all; // admin — see everything
+    if (!effectiveConfig || isAdminCaller(callerId, effectiveConfig)) return visible; // admin — see everything
     // Non-admin in multi-user mode: hide default + other callers' sessions
-    return all.filter(s =>
+    return visible.filter(s =>
       s.name !== DEFAULT_SESSION_NAME &&
       (!s.ownedBy || s.ownedBy === callerId)
     );
   }
 
+  updateSessionKind(name: string, sessionKind: NamedSessionKind): void {
+    const session = this.sessions.get(name);
+    if (session) {
+      session.sessionKind = sessionKind;
+    }
+  }
+
   /**
    * Close a named session.
    */
diff --git a/src/browser/real-browser-handoff.ts b/src/browser/real-browser-handoff.ts
index 0765c15..b6c7590 100644
--- a/src/browser/real-browser-handoff.ts
+++ b/src/browser/real-browser-handoff.ts
@@ -6,6 +6,15 @@ import type { SchruteConfig } from '../skill/types.js';
 
 const CHROME_PID_FILE = 'chrome.pid';
 const CHROME_META_FILE = 'chrome.meta.json';
+const OWNED_LAUNCH_DIR = 'owned-launches';
+
+export interface OwnedBrowserLaunchMetadata {
+  pid: number;
+  createdAt: number;
+  engine?: string;
+  sessionName?: string;
+  commandHint?: string;
+}
 
 export interface ManagedChromeMetadata {
   pid?: number;
@@ -141,6 +150,55 @@ export function removeManagedChromeMetadata(profileDir: string): void {
   }
 }
 
+export function getOwnedBrowserLaunchRoot(config: SchruteConfig): string {
+  return path.join(getBrowserDataDir(config), OWNED_LAUNCH_DIR);
+}
+
+function getOwnedBrowserLaunchPath(config: SchruteConfig, pid: number): string {
+  return path.join(getOwnedBrowserLaunchRoot(config), `${pid}.json`);
+}
+
+export function writeOwnedBrowserLaunchMetadata(
+  config: SchruteConfig,
+  metadata: OwnedBrowserLaunchMetadata,
+): void {
+  const root = getOwnedBrowserLaunchRoot(config);
+  fs.mkdirSync(root, { recursive: true, mode: 0o700 });
+  fs.writeFileSync(getOwnedBrowserLaunchPath(config, metadata.pid), JSON.stringify(metadata), { mode: 0o600 });
+}
+
+export function removeOwnedBrowserLaunchMetadata(config: SchruteConfig, pid: number): void {
+  try {
+    fs.rmSync(getOwnedBrowserLaunchPath(config, pid), { force: true });
+  } catch {
+    // Best effort.
+  }
+}
+
+export function listOwnedBrowserLaunchMetadata(config: SchruteConfig): OwnedBrowserLaunchMetadata[] {
+  const root = getOwnedBrowserLaunchRoot(config);
+  if (!fs.existsSync(root)) return [];
+
+  const entries: OwnedBrowserLaunchMetadata[] = [];
+  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
+    if (!entry.isFile() || !entry.name.endsWith('.json')) continue;
+    try {
+      const raw = JSON.parse(fs.readFileSync(path.join(root, entry.name), 'utf-8')) as Partial<OwnedBrowserLaunchMetadata>;
+      if (typeof raw.pid !== 'number' || !Number.isInteger(raw.pid)) continue;
+      entries.push({
+        pid: raw.pid,
+        createdAt: typeof raw.createdAt === 'number' ? raw.createdAt : Date.now(),
+        ...(typeof raw.engine === 'string' ? { engine: raw.engine } : {}),
+        ...(typeof raw.sessionName === 'string' ? { sessionName: raw.sessionName } : {}),
+        ...(typeof raw.commandHint === 'string' ? { commandHint: raw.commandHint } : {}),
+      });
+    } catch {
+      // Skip malformed metadata.
+    }
+  }
+  return entries;
+}
 
 export function isProcessAlive(pid: number): boolean {
   try {
     process.kill(pid, 0);
@@ -165,26 +223,148 @@ function readProcessCommandLine(pid: number): string | null {
   }
 }
 
-export async function terminateManagedChrome(pid: number): Promise<void> {
-  if (!isProcessAlive(pid)) return;
+function readProcessStartTimeMs(pid: number): number | null {
   try {
-    process.kill(pid, 'SIGTERM');
+    if (process.platform === 'win32') {
+      const isoTimestamp = execFileSync('powershell', [
+        '-NoProfile',
+        '-Command',
+        `$p = Get-Process -Id ${pid} -ErrorAction Stop; $p.StartTime.ToUniversalTime().ToString('o')`,
+      ], { encoding: 'utf-8' }).trim();
+      const parsed = Date.parse(isoTimestamp);
+      return Number.isFinite(parsed) ? parsed : null;
+    }
+
+    const raw = execFileSync('ps', ['-p', String(pid), '-o', 'lstart='], { encoding: 'utf-8' })
+      .trim()
+      .replace(/\s+/g, ' ');
+    const parsed = Date.parse(raw);
+    return Number.isFinite(parsed) ? parsed : null;
   } catch {
-    return;
+    return null;
+  }
+}
+
+const OWNED_LAUNCH_START_TIME_TOLERANCE_MS = 120_000;
+
+function canRevalidateOwnedBrowserLaunch(metadata: OwnedBrowserLaunchMetadata): boolean {
+  const startTimeMs = readProcessStartTimeMs(metadata.pid);
+  if (startTimeMs == null) {
+    return false;
+  }
+
+  if (Math.abs(startTimeMs - metadata.createdAt) > OWNED_LAUNCH_START_TIME_TOLERANCE_MS) {
+    return false;
+  }
+
+  const commandLine = readProcessCommandLine(metadata.pid);
+  if (metadata.commandHint && commandLine && !commandLine.includes(metadata.commandHint)) {
+    return false;
+  }
+
+  return true;
+}
+
+function listUnixDescendantPids(pid: number): number[] {
+  try {
+    const rows = execFileSync('ps', ['-axo', 'pid=,ppid='], { encoding: 'utf-8' })
+      .trim()
+      .split(/\r?\n/)
+      .map(line => line.trim().split(/\s+/, 2).map(Number))
+      .filter(parts => parts.length === 2 && Number.isInteger(parts[0]) && Number.isInteger(parts[1]))
+      .map(([childPid, parentPid]) => ({ childPid, parentPid }));
+
+    const byParent = new Map<number, number[]>();
+    for (const row of rows) {
+      const children = byParent.get(row.parentPid) ?? [];
+      children.push(row.childPid);
+      byParent.set(row.parentPid, children);
+    }
+
+    const descendants: number[] = [];
+    const stack = [...(byParent.get(pid) ?? [])];
+    while (stack.length > 0) {
+      const current = stack.pop()!;
+      descendants.push(current);
+      const children = byParent.get(current);
+      if (children) stack.push(...children);
+    }
+    return descendants;
+  } catch {
+    return [];
+  }
+}
+
+function signalUnixTree(pid: number, signal: NodeJS.Signals, detachedProcessGroup: boolean): void {
+  if (detachedProcessGroup) {
+    try {
+      process.kill(-pid, signal);
+      return;
+    } catch {
+      // Fall back to explicit child traversal.
+    }
   }
 
-  const deadline = Date.now() + 1000;
+  const descendants = listUnixDescendantPids(pid);
+  for (const childPid of descendants.reverse()) {
+    try { process.kill(childPid, signal); } catch { /* best effort */ }
+  }
+  try { process.kill(pid, signal); } catch { /* best effort */ }
+}
+
+async function waitForProcessExit(pid: number, timeoutMs: number): Promise<boolean> {
+  const deadline = Date.now() + timeoutMs;
   while (Date.now() < deadline) {
-    if (!isProcessAlive(pid)) return;
+    if (!isProcessAlive(pid)) return true;
     await new Promise(resolve => setTimeout(resolve, 50));
   }
+  return !isProcessAlive(pid);
+}
 
+export async function terminateProcessTree(
+  pid: number,
+  options?: { detachedProcessGroup?: boolean },
+): Promise<void> {
   if (!isProcessAlive(pid)) return;
-  try {
-    process.kill(pid, 'SIGKILL');
-  } catch {
-    // Best effort.
+  if (process.platform === 'win32') {
+    try {
+      execFileSync('taskkill', ['/T', '/F', '/PID', String(pid)], { stdio: 'ignore' });
+    } catch {
+      // Best effort.
+    }
+    return;
+  }
+
+  signalUnixTree(pid, 'SIGTERM', options?.detachedProcessGroup === true);
+  if (await waitForProcessExit(pid, 1000)) return;
+  signalUnixTree(pid, 'SIGKILL', options?.detachedProcessGroup === true);
+}
+
+export function terminateProcessTreeSync(
+  pid: number,
+  options?: { detachedProcessGroup?: boolean },
+): void {
+  if (!isProcessAlive(pid)) return;
+
+  if (process.platform === 'win32') {
+    try {
+      execFileSync('taskkill', ['/T', '/F', '/PID', String(pid)], { stdio: 'ignore' });
+    } catch {
+      // Best effort.
+    }
+    return;
   }
+
+  signalUnixTree(pid, 'SIGTERM', options?.detachedProcessGroup === true);
+  signalUnixTree(pid, 'SIGKILL', options?.detachedProcessGroup === true);
+}
+
+export async function terminateManagedChrome(pid: number): Promise<void> {
+  await terminateProcessTree(pid, { detachedProcessGroup: process.platform !== 'win32' });
+}
+
+export function terminateManagedChromeSync(pid: number): void {
+  terminateProcessTreeSync(pid, { detachedProcessGroup: process.platform !== 'win32' });
 }
 
 export async function waitForDevToolsActivePort(
@@ -317,9 +497,43 @@ export function cleanupManagedChromeLaunchesSync(config: SchruteConfig): void {
       removeManagedChromeMetadata(profileDir);
       continue;
     }
-    try {
-      process.kill(pid, 'SIGTERM');
-    } catch { /* best-effort */ }
+    terminateManagedChromeSync(pid);
     removeManagedChromeMetadata(profileDir);
   }
 }
+
+export async function cleanupOwnedBrowserLaunches(config: SchruteConfig): Promise<void> {
+  for (const metadata of listOwnedBrowserLaunchMetadata(config)) {
+    if (!isProcessAlive(metadata.pid)) {
+      removeOwnedBrowserLaunchMetadata(config, metadata.pid);
+      continue;
+    }
+
+    // Never kill a reused PID unless we can positively re-establish ownership.
+    if (!canRevalidateOwnedBrowserLaunch(metadata)) {
+      removeOwnedBrowserLaunchMetadata(config, metadata.pid);
+      continue;
+    }
+
+    await terminateProcessTree(metadata.pid);
+    removeOwnedBrowserLaunchMetadata(config, metadata.pid);
+  }
+}
+
+export function cleanupOwnedBrowserLaunchesSync(config: SchruteConfig): void {
+  for (const metadata of listOwnedBrowserLaunchMetadata(config)) {
+    if (!isProcessAlive(metadata.pid)) {
+      removeOwnedBrowserLaunchMetadata(config, metadata.pid);
+      continue;
+    }
+
+    // Exit-handler cleanup stays best-effort, but it still must prove ownership.
+ if (!canRevalidateOwnedBrowserLaunch(metadata)) { + removeOwnedBrowserLaunchMetadata(config, metadata.pid); + continue; + } + + terminateProcessTreeSync(metadata.pid); + removeOwnedBrowserLaunchMetadata(config, metadata.pid); + } +} diff --git a/src/capture/api-extractor.ts b/src/capture/api-extractor.ts index 5e866c1..c6e1243 100644 --- a/src/capture/api-extractor.ts +++ b/src/capture/api-extractor.ts @@ -11,6 +11,7 @@ export interface EndpointCluster { method: string; pathTemplate: string; canonicalHost: string; + responseContentType?: string; requests: StructuredRecord[]; commonHeaders: Record; commonQueryParams: string[]; @@ -87,11 +88,13 @@ export function clusterEndpoints(requests: StructuredRecord[], trie?: PathTrie): const commonHeaders = extractCommonHeaders(recs.map(r => r.request)); const commonQueryParams = extractCommonQueryParams(recs.map(r => r.request)); const bodyShape = inferBodyShape(recs.map(r => r.request)); + const responseContentType = extractResponseContentType(recs); clusters.push({ method, pathTemplate, canonicalHost, + responseContentType, requests: recs, commonHeaders, commonQueryParams, @@ -214,6 +217,25 @@ function extractCommonQueryParams(requests: StructuredRequest[]): string[] { ); } +function extractResponseContentType(records: StructuredRecord[]): string | undefined { + const counts = new Map(); + for (const record of records) { + const contentType = record.response.contentType; + if (!contentType) continue; + counts.set(contentType, (counts.get(contentType) ?? 
0) + 1); + } + + let best: string | undefined; + let bestCount = 0; + for (const [contentType, count] of counts) { + if (count > bestCount) { + best = contentType; + bestCount = count; + } + } + return best; +} + // ─── Body Shape Inference ──────────────────────────────────────────── function inferBodyShape(requests: StructuredRequest[]): Record | undefined { diff --git a/src/capture/noise-filter.ts b/src/capture/noise-filter.ts index 5443ce2..2d57ca8 100644 --- a/src/capture/noise-filter.ts +++ b/src/capture/noise-filter.ts @@ -25,6 +25,7 @@ const STATIC_RESOURCE_TYPE_SET = buildStaticResourceTypeSet(); export interface FilterResult { signal: HarEntry[]; + htmlDocument: HarEntry[]; noise: HarEntry[]; ambiguous: HarEntry[]; } @@ -92,6 +93,7 @@ export function filterRequests( siteHost?: string, ): FilterResult { const signal: HarEntry[] = []; + const htmlDocument: HarEntry[] = []; const noise: HarEntry[] = []; const ambiguous: HarEntry[] = []; @@ -105,16 +107,22 @@ export function filterRequests( for (const entry of entries) { const { classification } = classifyEntry(entry, overrideMap, pollingUrls, siteHost); - const bucket = classification === 'noise' ? noise : classification === 'ambiguous' ? ambiguous : signal; + const bucket = classification === 'noise' + ? noise + : classification === 'ambiguous' + ? ambiguous + : classification === 'html_document' + ? 
htmlDocument + : signal; bucket.push(entry); } log.debug( - { signal: signal.length, noise: noise.length, ambiguous: ambiguous.length }, + { signal: signal.length, htmlDocument: htmlDocument.length, noise: noise.length, ambiguous: ambiguous.length }, 'Filtered requests', ); - return { signal, noise, ambiguous }; + return { signal, htmlDocument, noise, ambiguous }; } export function isObviousNoise( @@ -199,6 +207,7 @@ export function recordFilteredEntries( const pollingUrls = detectPollingPatterns(entries); const signal: HarEntry[] = []; + const htmlDocument: HarEntry[] = []; const noise: HarEntry[] = []; const ambiguous: HarEntry[] = []; @@ -221,11 +230,17 @@ export function recordFilteredEntries( log.warn({ frameId, requestHash, err }, 'Failed to insert action_frame_entry, skipping'); } - const bucket = classification === 'noise' ? noise : classification === 'ambiguous' ? ambiguous : signal; + const bucket = classification === 'noise' + ? noise + : classification === 'ambiguous' + ? ambiguous + : classification === 'html_document' + ? htmlDocument + : signal; bucket.push(entry); } - return { signal, noise, ambiguous }; + return { signal, htmlDocument, noise, ambiguous }; } // ─── Classification Logic ──────────────────────────────────────────── @@ -300,8 +315,12 @@ function classifyEntry( return { classification: 'noise', reason: 'polling' }; } + if (isHtmlDocumentEntry(entry, hostname, siteHost)) { + return { classification: 'html_document', reason: 'html_document' }; + } + // If we get here and the response is a non-API content type, mark ambiguous - const contentType = getResponseContentType(entry); + const contentType = (getResponseContentType(entry) ?? 
'').toLowerCase(); if (contentType && (contentType.includes('text/html') || contentType.includes('text/css'))) { return { classification: 'ambiguous', reason: 'non_api_content_type' }; } @@ -492,6 +511,24 @@ function getResponseContentType(entry: HarEntry): string | undefined { return header?.value; } +function isHtmlDocumentEntry(entry: HarEntry, hostname: string, siteHost?: string): boolean { + const method = entry.request.method.toUpperCase(); + if (method !== 'GET' && method !== 'HEAD') { + return false; + } + if (entry.response.status !== 200) { + return false; + } + if (!siteHost || !isLearnableHost(hostname, siteHost)) { + return false; + } + const contentType = (getResponseContentType(entry) ?? '').toLowerCase(); + if (!contentType.includes('text/html')) { + return false; + } + return (entry._resourceType ?? '').toLowerCase() === 'document'; +} + function hashEntry(entry: HarEntry): string { const content = `${entry.request.method}|${entry.request.url}|${entry.startedDateTime}`; return crypto.createHash('sha256').update(content).digest('hex').slice(0, 16); diff --git a/src/capture/pipeline-worker.ts b/src/capture/pipeline-worker.ts index 6dc500a..ac5d707 100644 --- a/src/capture/pipeline-worker.ts +++ b/src/capture/pipeline-worker.ts @@ -36,6 +36,8 @@ export interface PipelineWorkerOutput { auditData: { totalCount: number; signalCount: number; + htmlDocumentCount: number; + ambiguousCount?: number; noiseCount: number; dedupedCount: number; }; @@ -83,6 +85,8 @@ export async function runPipelineTask( auditData: { totalCount: auditEntries.length, signalCount: filtered.signalCount, + htmlDocumentCount: filtered.htmlDocumentCount, + ...(filtered.ambiguousCount > 0 ? 
{ ambiguousCount: filtered.ambiguousCount } : {}), noiseCount: filtered.noiseCount, dedupedCount: filtered.signalCount - dedupedSignalRecords.length, }, @@ -123,10 +127,16 @@ function filterEntriesWithRecords( ): { signalRecords: StructuredRecord[]; signalCount: number; + htmlDocumentCount: number; + ambiguousCount: number; noiseCount: number; } { - const { signal, noise } = filterRequests(auditEntries as unknown as HarEntry[], [], siteId); - const signalSet = new Set(signal); + const filtered = filterRequests(auditEntries as unknown as HarEntry[], [], siteId); + const signal = filtered.signal ?? []; + const htmlDocument = filtered.htmlDocument ?? []; + const ambiguous = filtered.ambiguous ?? []; + const noise = filtered.noise ?? []; + const signalSet = new Set([...signal, ...htmlDocument]); const signalRecords: StructuredRecord[] = []; for (let index = 0; index < auditEntries.length; index++) { @@ -137,7 +147,9 @@ function filterEntriesWithRecords( return { signalRecords, - signalCount: signal.length, + signalCount: signal.length + htmlDocument.length, + htmlDocumentCount: htmlDocument.length, + ambiguousCount: ambiguous.length, noiseCount: noise.length, }; } diff --git a/src/client/typescript/types.ts b/src/client/typescript/types.ts index 7558084..967a7dd 100644 --- a/src/client/typescript/types.ts +++ b/src/client/typescript/types.ts @@ -37,12 +37,37 @@ export interface SkillSummary { currentTier: TierState; } -export interface ExecuteSkillResponse { +export interface ExecuteSkillCompletedResponse { success: boolean; data?: unknown; error?: string; + failureCause?: string; + failureDetail?: string; + latencyMs?: number; } +export interface ExecuteBrowserHandoffRequiredResponse { + status: 'browser_handoff_required'; + success: false; + reason: 'cloudflare_challenge'; + recoveryMode: 'real_browser_cdp'; + siteId: string; + url: string; + hint: string; + resumeToken?: string; + advisoryHint?: string; + session?: string; + managedBrowser?: boolean; + 
failureCause?: string; + failureDetail?: string; + latencyMs: number; +} + +export type ExecuteSkillResponse = + | ExecuteSkillCompletedResponse + | ConfirmationRequired + | ExecuteBrowserHandoffRequiredResponse; + export interface ConfirmationRequired { status: 'confirmation_required'; message: string; @@ -138,6 +163,8 @@ export interface StopResponse { export interface PipelineJobResult { skillsGenerated: number; signalCount: number; + htmlDocumentCount?: number; + ambiguousCount?: number; noiseCount: number; totalCount: number; warning?: string; diff --git a/src/core/config.ts b/src/core/config.ts index c7167ec..ebff273 100644 --- a/src/core/config.ts +++ b/src/core/config.ts @@ -71,6 +71,7 @@ const DEFAULT_CONFIG: SchruteConfig = { server: { network: false, httpPort: 3000, + mcpHttpAdmin: false, }, daemon: { port: 19420, @@ -422,6 +423,13 @@ export function loadConfig(configPath?: string): SchruteConfig { ); } + // Validate server.mcpHttpAdmin is boolean if present + if (loaded.server.mcpHttpAdmin !== undefined && typeof loaded.server.mcpHttpAdmin !== 'boolean') { + throw new Error( + `Invalid config: server.mcpHttpAdmin must be a boolean. 
Got: ${JSON.stringify(loaded.server.mcpHttpAdmin)}`, + ); + } + // Validate server.httpPort is a valid port number if present if (loaded.server.httpPort !== undefined) { if (typeof loaded.server.httpPort !== 'number' || !Number.isInteger(loaded.server.httpPort) || loaded.server.httpPort < 1 || loaded.server.httpPort > 65535) { diff --git a/src/core/engine.ts b/src/core/engine.ts index e4cfe4d..98aa378 100644 --- a/src/core/engine.ts +++ b/src/core/engine.ts @@ -33,17 +33,25 @@ import { getFlags } from '../browser/feature-flags.js'; import { BrowserManager, ContextOverrideMismatchError, stableStringify } from '../browser/manager.js'; import type { ContextOverrides } from '../browser/manager.js'; import { BrowserPool } from '../browser/pool.js'; -import { MultiSessionManager, DEFAULT_SESSION_NAME } from '../browser/multi-session.js'; +import { MultiSessionManager, DEFAULT_SESSION_NAME, type NamedSessionKind } from '../browser/multi-session.js'; import type { BrowserBackend } from '../browser/backend.js'; import { BrowserAuthStore } from '../browser/auth-store.js'; import { AuthCoordinator } from '../browser/auth-coordinator.js'; -import { AgentBrowserBackend } from '../browser/agent-browser-backend.js'; +import { + AgentBrowserBackend, +} from '../browser/agent-browser-backend.js'; +import { + cleanupAgentBrowserSessions, + cleanupAgentBrowserSessionsSync, +} from '../browser/agent-browser-cleanup.js'; import { PlaywrightBackend } from '../browser/playwright-backend.js'; import { LiveChromeBackend } from '../browser/live-chrome-backend.js'; import { BoundedMap } from '../shared/bounded-map.js'; import { cleanupManagedChromeLaunches, cleanupManagedChromeLaunchesSync, + cleanupOwnedBrowserLaunches, + cleanupOwnedBrowserLaunchesSync, listManagedChromeMetadata, launchManagedChrome, removeManagedChromeMetadata, @@ -63,7 +71,12 @@ import { classifySite } from '../automation/classifier.js'; import { updateStrategy } from '../automation/strategy.js'; import type { 
NetworkEntry } from '../skill/types.js'; import { canPromote, promoteSkill } from './promotion.js'; -import { handleFailure, getEffectiveTier, checkPromotion } from './tiering.js'; +import { + handleFailure, + getEffectiveTier, + checkPromotion, + sanitizeSiteRecommendedTier, +} from './tiering.js'; import { detectDrift } from '../healing/diff-engine.js'; import { monitorSkills, shouldNudge } from '../healing/monitor.js'; import { AmendmentEngine } from '../healing/amendment.js'; @@ -86,6 +99,8 @@ import { loadCachedTools } from '../discovery/webmcp-scanner.js'; import { SkillStatus } from '../skill/types.js'; import type { GeoEmulationConfig, PermanentTierLock, ExecutionTierName } from '../skill/types.js'; import { withTimeout } from './utils.js'; +import { applyTransform } from '../replay/transform.js'; +import { executeWorkflow, type WorkflowCacheEntry } from '../replay/workflow-executor.js'; // ─── Types ──────────────────────────────────────────────────────── @@ -109,6 +124,7 @@ export interface EngineStatus { promoted: number; skippedNoParams: number; skippedRateLimited: number; + skippedBrowserRequired: number; lastCycleProcessed: number; }; } @@ -139,6 +155,8 @@ export interface PipelineJob { result?: { skillsGenerated: number; signalCount: number; + htmlDocumentCount?: number; + ambiguousCount?: number; noiseCount: number; totalCount: number; warning?: string; @@ -190,6 +208,7 @@ export interface RecoverExploreResult { interface RecoveryState { resumeToken: string; recoveryId: string; + recoveryKind: 'explore' | 'execute'; siteId: string; url: string; hint: string; @@ -209,13 +228,65 @@ interface RecoveryState { export interface SkillExecutionResult { success: boolean; + status?: 'browser_handoff_required'; data?: unknown; + transformApplied?: boolean; + transformLabel?: string; error?: string; failureCause?: string; failureDetail?: string; + probeSuppressed?: boolean; + reason?: 'cloudflare_challenge'; + recoveryMode?: 'real_browser_cdp'; + siteId?: 
string; + url?: string; + hint?: string; + resumeToken?: string; + advisoryHint?: string; + session?: string; + managedBrowser?: boolean; latencyMs: number; } +export interface BatchSkillAction { + skillId: string; + params?: Record; +} + +export interface BatchSkillResult { + skillId: string; + success: boolean; + status?: SkillExecutionResult['status']; + data?: unknown; + transformApplied?: boolean; + transformLabel?: string; + error?: string; + failureCause?: string; + failureDetail?: string; + reason?: SkillExecutionResult['reason']; + recoveryMode?: SkillExecutionResult['recoveryMode']; + siteId?: string; + url?: string; + hint?: string; + resumeToken?: string; + advisoryHint?: string; + session?: string; + managedBrowser?: boolean; + latencyMs?: number; +} + +interface SkillExecutionOptions { + skipMetrics?: boolean; + forceDirectTier?: boolean; + skipTransform?: boolean; + waitForPermit?: { + timeoutMs?: number; + minGapMs?: number; + }; +} + +const WORKFLOW_STEP_PERMIT_TIMEOUT_MS = 30_000; + function formatExecutionError(cause: string, detail: string): string { return `Failure: ${cause} — ${detail.replace(/\.+$/, '')}. 
Use schrute_dry_run to preview.`; } @@ -339,6 +410,7 @@ export class Engine { promoted: 0, skippedNoParams: 0, skippedRateLimited: 0, + skippedBrowserRequired: 0, lastCycleProcessed: 0, }; private authStore: BrowserAuthStore; @@ -347,6 +419,11 @@ export class Engine { private fallbackExecutionBackend: PlaywrightBackend | null = null; private sharedPlaywrightBackends = new Map<string, PlaywrightBackend>(); private liveChromeBackend?: LiveChromeBackend; + private executionBackendGroupIds = new WeakMap<BrowserBackend, string>(); + private nextExecutionBackendGroupId = 0; + private workflowStepCache = new BoundedMap<string, WorkflowCacheEntry>({ + maxSize: 1_000, + }); private pathTrie?: PathTrie; constructor(config: SchruteConfig) { @@ -414,23 +491,27 @@ cleanupManagedChromeLaunches(config).catch(err => { this.log.debug({ err }, 'Managed Chrome cleanup failed during startup'); }); + cleanupOwnedBrowserLaunches(config).catch(err => { + this.log.debug({ err }, 'Owned browser launch cleanup failed during startup'); + }); + cleanupAgentBrowserSessions(config).catch(err => { + this.log.debug({ err }, 'Agent-browser cleanup failed during startup'); + }); // Last-resort exit handler: send SIGTERM to any managed Chrome processes // when the MCP server exits unexpectedly (crash, uncaught exception, OOM). // process.on('exit') only allows synchronous code, so we use the sync variant. this.exitCleanupHandler = () => { try { cleanupManagedChromeLaunchesSync(config); } catch { /* best-effort */ } + try { cleanupOwnedBrowserLaunchesSync(config); } catch { /* best-effort */ } + try { cleanupAgentBrowserSessionsSync(config); } catch { /* best-effort */ } }; process.on('exit', this.exitCleanupHandler); // HMAC key init is deferred to first skill execution (lazy) to avoid // blocking constructor and leaking promises when no skills are executed.
- // Session sweep: clean up idle named sessions every 15 minutes - this.sessionSweepInterval = setInterval(() => { - this.multiSessionManager.sweepIdleSessions(3600_000); - }, 900_000); - this.sessionSweepInterval.unref(); + this.startSessionSweep(); // WS-10: Background sweep for stale/broken skills (every 6 hours) this.sweepInterval = setInterval(() => { @@ -545,6 +626,85 @@ export class Engine { getAuthStore(): BrowserAuthStore { return this.authStore; } getAuthCoordinator(): AuthCoordinator { return this.authCoordinator; } + private getConfiguredBrowserIdleTimeoutMs(): number { + return this.config.browser?.idleTimeoutMs ?? 300_000; + } + + private getRecoveryIdleTimeoutMs(): number { + const configuredIdleMs = this.getConfiguredBrowserIdleTimeoutMs(); + return configuredIdleMs > 0 ? configuredIdleMs : 20 * 60 * 1000; + } + + private getExecuteRecoveryIdleTimeoutMs(): number { + const configuredIdleMs = this.getConfiguredBrowserIdleTimeoutMs(); + if (configuredIdleMs <= 0) { + return 60_000; + } + return Math.min(configuredIdleMs, 60_000); + } + + private getSessionSweepIntervalMs(): number { + const smallestIdleMs = Math.min( + this.getRecoveryIdleTimeoutMs(), + this.getExecuteRecoveryIdleTimeoutMs(), + ); + return Math.min(60_000, Math.max(15_000, Math.floor(smallestIdleMs / 2))); + } + + private startSessionSweep(): void { + const intervalMs = this.getSessionSweepIntervalMs(); + this.sessionSweepInterval = setInterval(() => { + this.sweepIdleNamedSessions().catch(err => { + this.log.warn({ err }, 'Idle session sweep failed'); + }); + }, intervalMs); + this.sessionSweepInterval.unref(); + } + + private async sweepIdleNamedSessions(): Promise<void> { + if (this.isClosing) return; + + const now = Date.now(); + const recoveryIdleMs = this.getRecoveryIdleTimeoutMs(); + const executeRecoveryIdleMs = this.getExecuteRecoveryIdleTimeoutMs(); + const agentBrowserIdleMs = this.getConfiguredBrowserIdleTimeoutMs(); + const manualCdpIdleMs = 20 * 60 * 1000; + const 
launchIdleMs = 3600_000; + const sessions = this.multiSessionManager.list(undefined, this.config); + + for (const session of sessions) { + if (session.name === DEFAULT_SESSION_NAME) continue; + + const idleMs = now - session.lastUsedAt; + + if (session.sessionKind === 'recovery_execute_cdp') { + if (idleMs <= executeRecoveryIdleMs) continue; + } else if (session.sessionKind === 'recovery_explore_cdp') { + if (this.mode === 'exploring' || this.mode === 'recording') continue; + if (this.exploreSessionName === session.name) continue; + if (idleMs <= recoveryIdleMs) continue; + } else if (session.isCdp) { + if (idleMs <= manualCdpIdleMs) continue; + } else if (idleMs <= launchIdleMs) { + continue; + } + + try { + await this.multiSessionManager.close(session.name, { force: true, engineMode: this.mode }); + } catch (err) { + this.log.warn({ err, session: session.name }, 'Session sweep close failed'); + } + } + + if (agentBrowserIdleMs > 0) { + try { + await this.agentBrowserBackend.sweepIdleSessions(agentBrowserIdleMs); + } catch (err) { + this.log.warn({ err }, 'Agent-browser idle session sweep failed'); + } + } + } + private cleanupStaleRecoveryPolicies(): void { try { const db = getDatabase(this.config); @@ -646,6 +806,16 @@ export class Engine { return this.fallbackExecutionBackend; } + resolveExecutionGroupKey(skill: SkillSpec): string { + try { + const backend = this.getExecutionBackend(skill.siteId); + return `${skill.siteId}|${this.getOrCreateExecutionBackendGroupId(backend)}`; + } catch (err) { + const detail = err instanceof Error ? 
err.message : String(err); + return `${skill.siteId}|unresolved|${detail}`; + } + } + private getOrCreateSharedPlaywrightBackend(sessionName: string, manager: BrowserManager): PlaywrightBackend { let backend = this.sharedPlaywrightBackends.get(sessionName); if (!backend) { @@ -655,6 +825,16 @@ export class Engine { return backend; } + private getOrCreateExecutionBackendGroupId(backend: BrowserBackend): string { + const existing = this.executionBackendGroupIds.get(backend); + if (existing) { + return existing; + } + const created = `backend-${this.nextExecutionBackendGroupId++}`; + this.executionBackendGroupIds.set(backend, created); + return created; + } + drainWarnings(): string[] { const w = [...this.warnings]; this.warnings = []; @@ -702,10 +882,10 @@ export class Engine { return 'Cloudflare challenge detected. Call schrute_recover_explore to continue in real Chrome.'; } - private getRecoveryBySiteId(siteId: string): RecoveryState | undefined { + private getRecoveryBySiteId(siteId: string, recoveryKind: RecoveryState['recoveryKind']): RecoveryState | undefined { let match: RecoveryState | undefined; for (const entry of this.recoveries.values()) { - if (entry.siteId === siteId && entry.currentState !== 'ready') { + if (entry.siteId === siteId && entry.recoveryKind === recoveryKind && entry.currentState !== 'ready') { if (!match || entry.createdAt > match.createdAt) { match = entry; } @@ -734,48 +914,115 @@ export class Engine { }; } - private upsertPendingRecovery(siteId: string, url: string, overrides?: ContextOverrides): RecoveryState { - const existing = this.getRecoveryBySiteId(siteId); - const autoRecoverSupported = this.isAutomaticRecoverySupported(overrides); - const advisoryHint = this.getCloudflareAdvisoryHint(); - const hint = this.buildRecoveryHint(overrides); - if (existing) { - existing.url = url; - existing.hint = hint; - existing.overrides = overrides; - existing.currentState = 'pending'; - existing.failureReason = undefined; - 
existing.autoRecoverSupported = autoRecoverSupported; - existing.advisoryHint = advisoryHint; - this.recoveries.set(existing.resumeToken, existing); - return existing; - } - + private createRecoveryState( + siteId: string, + url: string, + overrides: ContextOverrides | undefined, + recoveryKind: RecoveryState['recoveryKind'], + ): RecoveryState { const recoveryId = randomUUID(); const resumeToken = randomUUID(); const cdpSessionName = `__recovery_${createHash('sha256').update(recoveryId).digest('hex').slice(0, 16)}`; const managedProfileDir = path.join(this.config.dataDir, 'browser-data', 'live-chrome', recoveryId); const createdAt = Date.now(); - const entry: RecoveryState = { + return { resumeToken, recoveryId, + recoveryKind, siteId, url, - hint, + hint: this.buildRecoveryHint(overrides), createdAt, cdpSessionName, managedProfileDir, managedBrowser: false, exploreSessionNameBeforeRecovery: this.exploreSessionName, currentState: 'pending', - advisoryHint, + advisoryHint: this.getCloudflareAdvisoryHint(), overrides, - autoRecoverSupported, + autoRecoverSupported: this.isAutomaticRecoverySupported(overrides), }; - this.recoveries.set(resumeToken, entry); + } + + private upsertPendingRecovery( + siteId: string, + url: string, + overrides?: ContextOverrides, + recoveryKind: RecoveryState['recoveryKind'] = 'explore', + ): RecoveryState { + const existing = this.getRecoveryBySiteId(siteId, recoveryKind); + const autoRecoverSupported = this.isAutomaticRecoverySupported(overrides); + const advisoryHint = this.getCloudflareAdvisoryHint(); + const hint = this.buildRecoveryHint(overrides); + if (existing) { + existing.url = url; + existing.hint = hint; + existing.overrides = overrides; + existing.currentState = 'pending'; + existing.failureReason = undefined; + existing.autoRecoverSupported = autoRecoverSupported; + existing.advisoryHint = advisoryHint; + this.recoveries.set(existing.resumeToken, existing); + return existing; + } + + const entry = 
this.createRecoveryState(siteId, url, overrides, recoveryKind); + entry.hint = hint; + entry.autoRecoverSupported = autoRecoverSupported; + entry.advisoryHint = advisoryHint; + this.recoveries.set(entry.resumeToken, entry); return entry; } + private async ensureBrowserRequiredExecutionSession(skill: SkillSpec): Promise<void> { + if (!this.liveChromeBackend) { + return; + } + + const currentPolicy = getSitePolicy(skill.siteId, this.config); + if ( + currentPolicy.executionBackend === 'live-chrome' + && this.liveChromeBackend.findSession(skill.siteId, currentPolicy.executionSessionName) + ) { + return; + } + + const bootstrapUrl = `https://${skill.allowedDomains[0] ?? skill.siteId}`; + const recovery = this.createRecoveryState(skill.siteId, bootstrapUrl, undefined, 'execute'); + + try { + const { managedBrowser } = await this.connectRecoverySession(recovery); + recovery.managedBrowser = managedBrowser; + await this.bindRecoveryPolicy(recovery); + await this.alignRecoveryPage(recovery); + } catch (err) { + this.log.warn({ err, siteId: skill.siteId }, 'Browser-required execution bootstrap failed'); + } + } + + private getRecoverySessionKind(recoveryKind: RecoveryState['recoveryKind']): NamedSessionKind { + return recoveryKind === 'execute' ? 
'recovery_execute_cdp' : 'recovery_explore_cdp'; + } + + private getBoundExecutionSession(siteId: string) { + const policy = getSitePolicy(siteId, this.config); + if (policy.executionBackend !== 'live-chrome' || !policy.executionSessionName) { + return undefined; + } + + const session = this.multiSessionManager.peek(policy.executionSessionName); + if (!session || !session.isCdp || session.siteId !== siteId) { + return undefined; + } + + const browser = session.browserManager.getBrowser(); + if (!browser?.isConnected()) { + return undefined; + } + + return session; + } + private toHandoffResult(entry: RecoveryState): ExploreHandoffRequiredResult { return { status: 'browser_handoff_required', @@ -789,6 +1036,116 @@ export class Engine { }; } + private toExecutionHandoffResult( + entry: RecoveryState, + latencyMs: number, + extra?: { session?: string; managedBrowser?: boolean; failureDetail?: string }, + ): SkillExecutionResult { + const handoff = this.toHandoffResult(entry); + return { + success: false, + status: 'browser_handoff_required', + reason: handoff.reason, + recoveryMode: handoff.recoveryMode, + siteId: handoff.siteId, + url: handoff.url, + hint: handoff.hint, + ...(handoff.resumeToken ? { resumeToken: handoff.resumeToken } : {}), + ...(handoff.advisoryHint ? { advisoryHint: handoff.advisoryHint } : {}), + ...(extra?.session ? { session: extra.session } : {}), + ...(extra?.managedBrowser !== undefined ? { managedBrowser: extra.managedBrowser } : {}), + failureCause: FailureCause.CLOUDFLARE_CHALLENGE, + ...(extra?.failureDetail ? 
{ failureDetail: extra.failureDetail } : {}), + latencyMs, + }; + } + + private async detectExecutionChallenge( + skill: SkillSpec, + result: Awaited>, + browserProvider: BrowserProvider | undefined, + stepResults: Array<{ tier: ExecutionTierName; failureCause?: string; success: boolean }>, + ): Promise<boolean> { + if (result.failureCause === FailureCause.CLOUDFLARE_CHALLENGE) { + return true; + } + + if (stepResults.some(step => step.failureCause === FailureCause.CLOUDFLARE_CHALLENGE)) { + return true; + } + + if (result.success || !browserProvider) { + return false; + } + + const currentUrl = browserProvider.getCurrentUrl?.() ?? ''; + if (/\/cdn-cgi\//i.test(currentUrl)) { + return true; + } + + if (typeof browserProvider.detectChallengePage === 'function') { + try { + return await browserProvider.detectChallengePage(); + } catch (err) { + this.log.debug({ err, skillId: skill.id }, 'Execution challenge detection failed'); + } + } + + return false; + } + + private async maybeStartExecutionRecovery( + skill: SkillSpec, + browserProvider: BrowserProvider | undefined, + latencyMs: number, + failureDetail?: string, + ): Promise<SkillExecutionResult | undefined> { + const baseUrl = browserProvider?.getCurrentUrl?.(); + const fallbackUrl = `https://${skill.allowedDomains[0] ?? skill.siteId}`; + const recovery = this.upsertPendingRecovery(skill.siteId, baseUrl || fallbackUrl, undefined, 'execute'); + + if (!recovery.autoRecoverSupported) { + return this.toExecutionHandoffResult(recovery, latencyMs, { failureDetail }); + } + + try { + const existingExecutionSession = this.getBoundExecutionSession(skill.siteId); + if (existingExecutionSession) { + recovery.recoveryKind = 'explore'; + recovery.cdpSessionName = existingExecutionSession.name; + recovery.managedPid = existingExecutionSession.managedPid; + recovery.managedProfileDir = existingExecutionSession.managedProfileDir ?? recovery.managedProfileDir; + recovery.priorPolicySnapshot = existingExecutionSession.cdpPriorPolicyState ?? 
recovery.priorPolicySnapshot; + recovery.currentState = 'awaiting_user'; + recovery.managedBrowser = !!existingExecutionSession.managedPid; + this.multiSessionManager.updateSessionKind(existingExecutionSession.name, 'recovery_explore_cdp'); + this.recoveries.set(recovery.resumeToken, recovery); + return this.toExecutionHandoffResult(recovery, latencyMs, { + session: existingExecutionSession.name, + managedBrowser: recovery.managedBrowser, + failureDetail, + }); + } + + const { sessionName, managedBrowser } = await this.connectRecoverySession(recovery); + await this.bindRecoveryPolicy(recovery); + await this.alignRecoveryPage(recovery); + recovery.recoveryKind = 'explore'; + recovery.currentState = 'awaiting_user'; + recovery.managedBrowser = managedBrowser; + this.multiSessionManager.updateSessionKind(sessionName, 'recovery_explore_cdp'); + this.recoveries.set(recovery.resumeToken, recovery); + return this.toExecutionHandoffResult(recovery, latencyMs, { + session: sessionName, + managedBrowser, + failureDetail, + }); + } catch (err) { + this.log.warn({ err, siteId: skill.siteId }, 'Execute-time recovery handoff setup failed'); + return this.toExecutionHandoffResult(recovery, latencyMs, { failureDetail }); + } + } + private startCloudflareHeaderProbe( page: { on(event: 'response', listener: (response: any) => void): void; off(event: 'response', listener: (response: any) => void): void; mainFrame(): unknown }, siteId: string, @@ -1201,10 +1558,12 @@ export class Engine { private async connectRecoverySession(entry: RecoveryState): Promise<{ sessionName: string; managedBrowser: boolean }> { const multiSession = this.multiSessionManager; + const sessionKind = this.getRecoverySessionKind(entry.recoveryKind); const existing = multiSession.get(entry.cdpSessionName); if (existing) { const browser = existing.browserManager.getBrowser(); if (browser?.isConnected()) { + multiSession.updateSessionKind(existing.name, sessionKind); return { sessionName: entry.cdpSessionName, 
managedBrowser: !!existing.managedPid }; } await multiSession.close(entry.cdpSessionName, { force: true }); @@ -1213,7 +1572,7 @@ export class Engine { if (entry.managedPid && fs.existsSync(entry.managedProfileDir)) { try { const { wsEndpoint } = await waitForDevToolsActivePort(entry.managedProfileDir); - const session = await multiSession.connectCDP(entry.cdpSessionName, { wsEndpoint }, entry.siteId); + const session = await multiSession.connectCDP(entry.cdpSessionName, { wsEndpoint }, entry.siteId, undefined, sessionKind); session.managedPid = entry.managedPid; session.managedProfileDir = entry.managedProfileDir; session.cdpPriorPolicyState = entry.priorPolicySnapshot; @@ -1224,7 +1583,7 @@ export class Engine { } try { - const attached = await multiSession.connectCDP(entry.cdpSessionName, { autoDiscover: true }, entry.siteId); + const attached = await multiSession.connectCDP(entry.cdpSessionName, { autoDiscover: true }, entry.siteId, undefined, sessionKind); attached.managedProfileDir = entry.managedProfileDir; return { sessionName: attached.name, managedBrowser: false }; } catch (attachErr) { @@ -1236,11 +1595,23 @@ export class Engine { }); entry.managedPid = launch.pid; entry.managedBrowser = true; - const session = await multiSession.connectCDP(entry.cdpSessionName, { wsEndpoint: launch.wsEndpoint }, entry.siteId); - session.managedPid = launch.pid; - session.managedProfileDir = entry.managedProfileDir; - session.cdpPriorPolicyState = entry.priorPolicySnapshot; - return { sessionName: session.name, managedBrowser: true }; + try { + const session = await multiSession.connectCDP(entry.cdpSessionName, { wsEndpoint: launch.wsEndpoint }, entry.siteId, undefined, sessionKind); + session.managedPid = launch.pid; + session.managedProfileDir = entry.managedProfileDir; + session.cdpPriorPolicyState = entry.priorPolicySnapshot; + return { sessionName: session.name, managedBrowser: true }; + } catch (launchAttachErr) { + entry.managedPid = undefined; + 
entry.managedBrowser = false; + try { + await terminateManagedChrome(launch.pid); + } catch (cleanupErr) { + this.log.warn({ cleanupErr, siteId: entry.siteId, pid: launch.pid }, 'Managed Chrome cleanup failed after recovery handoff attach error'); + } + removeManagedChromeMetadata(entry.managedProfileDir); + throw launchAttachErr; + } } } @@ -1854,6 +2225,8 @@ export class Engine { return { skillsGenerated: 0, signalCount: 0, + htmlDocumentCount: 0, + ambiguousCount: 0, noiseCount: 0, totalCount: 0, }; @@ -1901,6 +2274,8 @@ export class Engine { return { skillsGenerated: 0, signalCount: counts.signalCount, + htmlDocumentCount: counts.htmlDocumentCount, + ...(counts.ambiguousCount !== undefined ? { ambiguousCount: counts.ambiguousCount } : {}), noiseCount: counts.noiseCount, totalCount: counts.totalCount, warning: earlyWarning, @@ -1962,6 +2337,7 @@ export class Engine { canonicalHost: cluster.canonicalHost, actionName: generateActionName(cluster.method, cluster.pathTemplate, cluster.canonicalHost, recording.siteId), inputSchema: cluster.bodyShape ? 
{ type: 'object', properties: cluster.bodyShape } : {}, + responseContentType: cluster.responseContentType, requiredHeaders: cluster.commonHeaders, sampleCount: cluster.requests.length, }, @@ -2127,11 +2503,16 @@ export class Engine { if (traffic.length > 0) { const classification = classifySite(recording.siteId, traffic); const siteRepo = new SiteRepository(db); + const policy = getSitePolicy(recording.siteId, this.config); + const sanitizedTier = sanitizeSiteRecommendedTier( + classification.recommendedTier, + policy.browserRequired === true, + ); siteRepo.update(recording.siteId, { - recommendedTier: classification.recommendedTier, + recommendedTier: sanitizedTier, }); this.log.info( - { siteId: recording.siteId, recommendedTier: classification.recommendedTier, authRequired: classification.authRequired }, + { siteId: recording.siteId, recommendedTier: sanitizedTier, authRequired: classification.authRequired }, 'Site classified after recording', ); } @@ -2177,6 +2558,8 @@ export class Engine { return { skillsGenerated: generatedCount, signalCount: counts.signalCount, + htmlDocumentCount: counts.htmlDocumentCount, + ...(counts.ambiguousCount !== undefined ? { ambiguousCount: counts.ambiguousCount } : {}), noiseCount: counts.noiseCount, totalCount: counts.totalCount, warning, @@ -2232,7 +2615,7 @@ export class Engine { skillId: string, params: Record<string, unknown>, callerId?: string, - options?: { skipMetrics?: boolean; forceDirectTier?: boolean }, + options?: SkillExecutionOptions, ): Promise<SkillExecutionResult> { const startTime = Date.now(); @@ -2273,7 +2656,12 @@ // Dedup for read-only skills: collapse identical in-flight requests if (skill.sideEffectClass === SideEffectClass.READ_ONLY) { - const dedupKey = `${skillId}|${stableStringify(params)}`; + const dedupKey = [ + skillId, + stableStringify(params), + options?.skipTransform ? 'raw' : 'transformed', + options?.waitForPermit ? 
'wait' : 'check', + ].join('|'); const existing = this.inflightDedup.get(dedupKey); if (existing) { this.log.debug({ skillId }, 'Dedup hit — returning in-flight result'); @@ -2291,18 +2679,202 @@ return this.executeSkillInner(skill, params, startTime, callerId, options); } + async executeBatch( + actions: BatchSkillAction[], + callerId?: string, + ): Promise<BatchSkillResult[]> { + const results: BatchSkillResult[] = new Array(actions.length); + + const executeAction = async ( + action: BatchSkillAction, + index: number, + skill?: SkillSpec, + ): Promise<void> => { + if (!skill) { + results[index] = { skillId: action.skillId, success: false, error: 'Skill not found' }; + return; + } + + try { + let result = await this.executeSkill(skill.id, action.params ?? {}, callerId); + if (!result.success && result.failureCause === 'rate_limited') { + const parsed = parseInt(result.failureDetail?.match(/(\d+)ms/)?.[1] ?? '1000', 10); + const waitMs = Math.min(Math.max(parsed, 100), 30_000); + await new Promise(resolve => setTimeout(resolve, waitMs + 50)); + result = await this.executeSkill(skill.id, action.params ?? {}, callerId); + } + results[index] = { + skillId: action.skillId, + success: result.success, + status: result.status, + data: result.data, + transformApplied: result.transformApplied, + transformLabel: result.transformLabel, + error: result.error, + failureCause: result.failureCause, + failureDetail: result.failureDetail, + reason: result.reason, + recoveryMode: result.recoveryMode, + siteId: result.siteId, + url: result.url, + hint: result.hint, + resumeToken: result.resumeToken, + advisoryHint: result.advisoryHint, + session: result.session, + managedBrowser: result.managedBrowser, + latencyMs: result.latencyMs, + }; + } catch (err) { + results[index] = { + skillId: action.skillId, + success: false, + error: err instanceof Error ? 
err.message : String(err), + }; + } + }; + + const readWindow: Array<{ action: BatchSkillAction; index: number; skill: SkillSpec }> = []; + const flushReadWindow = async (): Promise<void> => { + if (readWindow.length === 0) { + return; + } + + const grouped = new Map<string, Array<{ action: BatchSkillAction; index: number; skill: SkillSpec }>>(); + for (const item of readWindow) { + const key = this.resolveExecutionGroupKey(item.skill); + const group = grouped.get(key); + if (group) { + group.push(item); + } else { + grouped.set(key, [item]); + } + } + + await Promise.all( + [...grouped.values()].map(async (group) => { + const maxConcurrent = this.getBatchExecutionConcurrency(group[0].skill); + for (let start = 0; start < group.length; start += maxConcurrent) { + const chunk = group.slice(start, start + maxConcurrent); + await Promise.all(chunk.map((item) => executeAction(item.action, item.index, item.skill))); + } + }), + ); + + readWindow.length = 0; + }; + + for (let index = 0; index < actions.length; index++) { + const action = actions[index]; + const skill = this.skillRepo.getById(action.skillId); + + if (!skill) { + results[index] = { skillId: action.skillId, success: false, error: 'Skill not found' }; + continue; + } + + if (skill.sideEffectClass === SideEffectClass.READ_ONLY) { + readWindow.push({ action, index, skill }); + continue; + } + + await flushReadWindow(); + await executeAction(action, index, skill); + } + + await flushReadWindow(); + return results; + } + + private getBatchExecutionConcurrency(skill: SkillSpec): number { + const configured = getSitePolicy(skill.siteId, this.config).maxConcurrent; + if (!Number.isFinite(configured)) { + return 3; + } + return Math.max(1, Math.floor(configured)); + } + + private buildWorkflowStepExecutor(callerId?: string) { + return (skillId: string, params: Record<string, unknown>) => + this.executeSkill(skillId, params, callerId, { + skipTransform: true, + waitForPermit: { + timeoutMs: WORKFLOW_STEP_PERMIT_TIMEOUT_MS, + }, + }); + } + + private async acquireExecutionRatePermit( + siteId: string, + callerId: 
string | undefined, + policy: { minGapMs?: number }, + options?: SkillExecutionOptions, + ): Promise<{ allowed: boolean; retryAfterMs?: number }> { + const minGapMs = options?.waitForPermit?.minGapMs ?? policy.minGapMs ?? 0; + if (options?.waitForPermit) { + return this.rateLimiter.waitForPermit(siteId, callerId, { + minGapMs, + timeoutMs: options.waitForPermit.timeoutMs, + }); + } + return this.rateLimiter.checkRate(siteId, callerId, { minGapMs }); + } + private async executeSkillInner( skill: SkillSpec, params: Record<string, unknown>, startTime: number, callerId?: string, - options?: { skipMetrics?: boolean; forceDirectTier?: boolean }, + options?: SkillExecutionOptions, ): Promise<SkillExecutionResult> { const skillId = skill.id; + if (skill.workflowSpec) { + const workflowResult = await executeWorkflow( + skill.workflowSpec, + params, + this.buildWorkflowStepExecutor(callerId), + this.skillRepo, + this.workflowStepCache, + ); + + if ('status' in workflowResult && workflowResult.status === 'browser_handoff_required') { + return workflowResult; + } + + const workflowExecution = workflowResult as Exclude< + Awaited<ReturnType<typeof executeWorkflow>>, + SkillExecutionResult & { status: 'browser_handoff_required' } + >; + + const workflowFailureCause = workflowExecution.success ? undefined : workflowExecution.failureCause; + const isInfra = workflowFailureCause && INFRA_FAILURE_CAUSES.has(workflowFailureCause as any); + + if (workflowExecution.success) { + this.skillRepo.updateConfidence(skill.id, Math.min(skill.confidence + 0.1, 1.0), skill.consecutiveValidations + 1); + } else if (!isInfra) { + this.skillRepo.updateConfidence(skill.id, Math.max(skill.confidence - 0.2, 0), 0); + } + + this.skillRepo.update(skill.id, { lastUsed: Date.now() }); + + const transformed = workflowExecution.success && !options?.skipTransform + ? 
await applyTransform(workflowExecution.data, skill.outputTransform) + : { data: workflowExecution.data, transformApplied: false as const }; + + return { + success: workflowExecution.success, + data: transformed.data, + ...(transformed.transformApplied ? { transformApplied: true, transformLabel: transformed.label } : {}), + error: workflowExecution.success ? undefined : workflowExecution.error, + failureCause: workflowFailureCause, + failureDetail: workflowExecution.success ? undefined : workflowExecution.failedAtStep ? `Failed at step: ${workflowExecution.failedAtStep}` : undefined, + latencyMs: workflowExecution.totalLatencyMs, + }; + } + // Both WebMCP and HTTP paths produce an ExecutionResult for the shared post-execution block - const policy = getSitePolicy(skill.siteId, this.config); - const effectiveDomains = policy.domainAllowlist.length > 0 + let policy = getSitePolicy(skill.siteId, this.config); + let effectiveDomains = policy.domainAllowlist.length > 0 ? policy.domainAllowlist : [...new Set([...skill.allowedDomains, skill.siteId])]; const policyDecision: PolicyDecision = { @@ -2315,7 +2887,12 @@ export class Engine { const site = this.siteRepo.getById(skill.siteId); const MAX_CANARY_ATTEMPTS = 5; let isCanaryProbe = false; - const isDirectRecommended = site?.recommendedTier === ExecutionTier.DIRECT; + let browserProvider: BrowserProvider | undefined; + const browserRequiredSkill = skill.tierLock?.type === 'permanent' && skill.tierLock.reason === 'browser_required'; + const siteRecommendedTier = site + ? sanitizeSiteRecommendedTier(site.recommendedTier, policy.browserRequired === true) + : undefined; + const isDirectRecommended = siteRecommendedTier === ExecutionTier.DIRECT; let result: Awaited>; @@ -2323,7 +2900,7 @@ export class Engine { // ── WebMCP execution path ──────────────────────────── // Bypasses HTTP method/path checks and the replay pipeline, // but enforces rate limiting and flows through the shared post-execution path. 
-    const rateCheck = this.rateLimiter.checkRate(skill.siteId, callerId);
+    const rateCheck = await this.acquireExecutionRatePermit(skill.siteId, callerId, policy, options);
     if (!rateCheck.allowed) {
       const detail = `Site '${skill.siteId}' rate limited, retry after ${rateCheck.retryAfterMs}ms`;
       return {
@@ -2376,8 +2953,20 @@ export class Engine {
       };
     }

+    if (options?.forceDirectTier && policy.browserRequired) {
+      const detail = `Direct probe suppressed for challenge-protected site '${skill.siteId}'`;
+      return {
+        success: false,
+        error: formatExecutionError('policy_denied', detail),
+        failureCause: 'policy_denied',
+        failureDetail: detail,
+        probeSuppressed: true,
+        latencyMs: Date.now() - startTime,
+      };
+    }
+
     // 3. Rate limit check (with per-caller fairness)
-    const rateCheck = this.rateLimiter.checkRate(skill.siteId, callerId);
+    const rateCheck = await this.acquireExecutionRatePermit(skill.siteId, callerId, policy, options);
     if (!rateCheck.allowed) {
       this.log.warn(
         { skillId, siteId: skill.siteId, retryAfterMs: rateCheck.retryAfterMs },
@@ -2393,10 +2982,17 @@ export class Engine {
       };
     }

+    if (browserRequiredSkill && !options?.forceDirectTier) {
+      await this.ensureBrowserRequiredExecutionSession(skill);
+      policy = getSitePolicy(skill.siteId, this.config);
+      effectiveDomains = policy.domainAllowlist.length > 0
+        ? policy.domainAllowlist
+        : [...new Set([...skill.allowedDomains, skill.siteId])];
+    }
+
     this.budgetTracker.setDomainAllowlist(effectiveDomains);

     // Wire browser provider: try execution backend first, fall back to explore Playwright
-    let browserProvider: BrowserProvider | undefined;
     const isHardSite = !!(
       (policy.executionBackend === 'playwright' || policy.executionBackend === 'live-chrome')
       && policy.executionSessionName
@@ -2435,18 +3031,23 @@ export class Engine {
       browserProvider,
       browserProviderFactory,
       config: this.config,
-      siteRecommendedTier: site?.recommendedTier,
+      siteRecommendedTier,
     };

     // 5. Execute — canary probe + tier escalation
-    const retryOpts: RetryOptions = { ...executorOptions, siteRecommendedTier: site?.recommendedTier };
+    const retryOpts: RetryOptions = {
+      ...executorOptions,
+      siteRecommendedTier,
+      directAllowed: policy.browserRequired !== true,
+    };

     // Auto-validation forces direct tier to test without browser
     if (options?.forceDirectTier) {
       retryOpts.forceStartTier = ExecutionTier.DIRECT;
     } else if (skill.directCanaryEligible
       && !skill.tierLock
       && (skill.directCanaryAttempts ?? 0) < MAX_CANARY_ATTEMPTS
-      && !isDirectRecommended) {
+      && !isDirectRecommended
+      && policy.browserRequired !== true) {
       retryOpts.forceStartTier = ExecutionTier.DIRECT;
       retryOpts.isCanaryProbe = true;
       isCanaryProbe = true;
@@ -2463,14 +3064,38 @@ export class Engine {
     }

     try {
+      const stepResults: Array<{
+        tier: ExecutionTierName;
+        failureCause?: string;
+        success: boolean;
+      }> = ('stepResults' in result && Array.isArray(result.stepResults))
+        ? result.stepResults as Array<{ tier: ExecutionTierName; failureCause?: string; success: boolean }>
+        : [];
+      const executionChallengeDetected = !options?.forceDirectTier
+        && await this.detectExecutionChallenge(skill, result, browserProvider, stepResults);
+      if (executionChallengeDetected && result.failureCause !== FailureCause.CLOUDFLARE_CHALLENGE) {
+        result.failureCause = FailureCause.CLOUDFLARE_CHALLENGE;
+        result.failureDetail ??= 'Cloudflare challenge detected during browser execution';
+      }
+      const firstFailedStep = stepResults.find(step => !step.success);
+      const directChallengeSeen = (
+        result.failureCause === FailureCause.CLOUDFLARE_CHALLENGE
+        && result.tier === ExecutionTier.DIRECT
+      ) || stepResults.some(step =>
+        step.tier === ExecutionTier.DIRECT
+        && step.failureCause === FailureCause.CLOUDFLARE_CHALLENGE,
+      );
+      const anyChallengeSeen = result.failureCause === FailureCause.CLOUDFLARE_CHALLENGE
+        || stepResults.some(step => step.failureCause === FailureCause.CLOUDFLARE_CHALLENGE);

      // 6. Update rate limiter with response info (pass latency for adaptive throttling on non-browser tiers)
       const adaptiveLatency = result.tier !== ExecutionTier.FULL_BROWSER ? result.latencyMs : undefined;
       this.rateLimiter.recordResponse(skill.siteId, result.status, result.headers, adaptiveLatency, callerId);

       // WS-4: Persist canary failure cause
-      if (isCanaryProbe && 'stepResults' in result && (result as any).stepResults[0] && !(result as any).stepResults[0].success) {
-        this.skillRepo.update(skill.id, { lastCanaryErrorType: (result as any).stepResults[0].failureCause ?? 'unknown' });
+      if (isCanaryProbe && firstFailedStep) {
+        const firstCanaryFailure = directChallengeSeen ? FailureCause.CLOUDFLARE_CHALLENGE : (firstFailedStep.failureCause ?? 'unknown');
+        this.skillRepo.update(skill.id, { lastCanaryErrorType: firstCanaryFailure });
       }

       // 7. Record metrics (skip for infra failures — they don't reflect skill health)
@@ -2512,7 +3137,9 @@ export class Engine {
       // WS-4: Canary re-arm — only for non-direct-recommended tier_3 skills on non-direct success
       if (result.success && result.tier !== ExecutionTier.DIRECT
         && skill.currentTier === TierState.TIER_3_DEFAULT
-        && !isDirectRecommended) {
+        && !isDirectRecommended
+        && !anyChallengeSeen
+        && policy.browserRequired !== true) {
         this.skillRepo.incrementValidationsSinceLastCanary(skill.id);

         const fresh = this.skillRepo.getById(skill.id);
@@ -2529,16 +3156,47 @@ export class Engine {
       // A2: Handle structural failures — tier lock
       const structuralCauses: ReadonlySet<PermanentTierLock['reason']> = new Set(['js_computed_field', 'protocol_sensitivity', 'signed_payload']);
-      if (result.failureCause && structuralCauses.has(result.failureCause as PermanentTierLock['reason'])) {
+      if (directChallengeSeen) {
+        const failResult = handleFailure(skill, FailureCause.CLOUDFLARE_CHALLENGE);
+        this.skillRepo.updateTier(skill.id, failResult.newTier, failResult.tierLock);
+        this.skillRepo.update(skill.id, {
+          directCanaryEligible: false,
+          directCanaryAttempts: 0,
+          validationsSinceLastCanary: 0,
+          ...(isCanaryProbe ? { lastCanaryErrorType: FailureCause.CLOUDFLARE_CHALLENGE } : {}),
+        });
+
+        if (!policy.browserRequired) {
+          mergeSitePolicy(skill.siteId, { browserRequired: true }, this.config);
+        }
+
+        const nextRecommendedTier = sanitizeSiteRecommendedTier(
+          site?.recommendedTier ?? ExecutionTier.BROWSER_PROXIED,
+          true,
+        );
+        this.siteRepo.update(skill.siteId, { recommendedTier: nextRecommendedTier });
+      } else if (result.failureCause && structuralCauses.has(result.failureCause as PermanentTierLock['reason'])) {
         const failResult = handleFailure(skill, result.failureCause);
         this.skillRepo.updateTier(skill.id, failResult.newTier, failResult.tierLock);
       }

+      if (!result.success && executionChallengeDetected) {
+        const handoff = await this.maybeStartExecutionRecovery(
+          skill,
+          browserProvider,
+          result.latencyMs,
+          result.failureDetail,
+        );
+        if (handoff) {
+          return handoff;
+        }
+      }
+
       // A3: Tier promotion check — only when direct execution succeeded on a tier_3 skill
       if (result.success && result.tier === ExecutionTier.DIRECT && skill.currentTier === TierState.TIER_3_DEFAULT) {
         const updatedSkill = this.skillRepo.getById(skill.id);
         if (updatedSkill) {
-          const promoCheck = checkPromotion(updatedSkill, [], { match: true, hasDynamicRequiredFields: false }, this.config, site?.recommendedTier);
+          const promoCheck = checkPromotion(updatedSkill, [], { match: true, hasDynamicRequiredFields: false }, this.config, siteRecommendedTier);
           if (promoCheck.promote) {
             this.skillRepo.updateTier(updatedSkill.id, TierState.TIER_1_PROMOTED, null);
             this.skillRepo.update(updatedSkill.id, { directCanaryEligible: false, directCanaryAttempts: 0, validationsSinceLastCanary: 0 });
@@ -2706,9 +3364,14 @@ export class Engine {
       });
     }

+    const transformed = options?.skipTransform
+      ?
+        { data: result.data, transformApplied: false as const }
+      : await applyTransform(result.data, skill.outputTransform);
+
     return {
       success: result.success,
-      data: result.data,
+      data: transformed.data,
+      ...(transformed.transformApplied ? { transformApplied: true, transformLabel: transformed.label } : {}),
       error: result.failureCause
         ? formatExecutionError(result.failureCause, result.failureDetail ?? 'unknown')
         : undefined,
@@ -3009,6 +3672,12 @@ export class Engine {
         continue;
       }

+      const policy = getSitePolicy(skill.siteId, this.config);
+      if (policy.browserRequired) {
+        this.autoValidationStats.skippedBrowserRequired++;
+        continue;
+      }
+
       const rateCheck = this.rateLimiter.checkRate(skill.siteId, '__auto_validation__');
       if (!rateCheck.allowed) {
         this.autoValidationStats.skippedRateLimited++;
@@ -3016,6 +3685,10 @@ export class Engine {
       }

       const result = await this.executeSkill(skill.id, samples, '__auto_validation__', { forceDirectTier: true });
+      if (result.probeSuppressed) {
+        this.autoValidationStats.skippedBrowserRequired++;
+        continue;
+      }
       this.autoValidationStats.validated++;
       cycleProcessed++;
@@ -3167,6 +3840,16 @@ export class Engine {
     } catch (err) {
       this.log.debug({ err }, 'Managed Chrome cleanup failed during engine close');
     }
+    try {
+      await cleanupOwnedBrowserLaunches(this.config);
+    } catch (err) {
+      this.log.debug({ err }, 'Owned browser launch cleanup failed during engine close');
+    }
+    try {
+      await cleanupAgentBrowserSessions(this.config);
+    } catch (err) {
+      this.log.debug({ err }, 'Agent-browser cleanup failed during engine close');
+    }

     this.mode = 'idle';
     this.providerCache = new WeakMap();
diff --git a/src/core/policy.ts b/src/core/policy.ts
index 4e95211..ede9305 100644
--- a/src/core/policy.ts
+++ b/src/core/policy.ts
@@ -52,11 +52,13 @@ const DEFAULT_SITE_POLICY: Omit<SitePolicy, 'siteId'> = {
   allowedMethods: ['GET', 'HEAD'] as HttpMethod[],
   maxQps: 10,
   maxConcurrent: 3,
+  minGapMs: 100,
   readOnlyDefault: true,
   requireConfirmation: [],
   domainAllowlist: [],
   redactionRules: [],
   capabilities: [...V01_DEFAULT_CAPABILITIES],
+  browserRequired: false,
 };

 const POLICY_CACHE_TTL_MS = 300_000; // 5 minutes
@@ -79,11 +81,13 @@ function loadPolicyFromDb(siteId: string, config?: SchruteConfig): SitePolicy |
     allowed_methods: string;
     max_qps: number;
     max_concurrent: number;
+    min_gap_ms: number | null;
     read_only_default: number;
     require_confirmation: string;
     domain_allowlist: string | null;
     redaction_rules: string;
     capabilities: string;
+    browser_required: number | null;
     execution_backend: string | null;
     execution_session_name: string | null;
   }>('SELECT * FROM policies WHERE site_id = ?', siteId);
@@ -98,19 +102,21 @@ function loadPolicyFromDb(siteId: string, config?: SchruteConfig): SitePolicy |
     log.warn({ siteId, invalid: rawMethods.filter(m => !VALID_HTTP_METHODS.has(m)) }, 'Filtered invalid HTTP methods from persisted policy');
   }

-  return {
+  return normalizeSitePolicy({
     siteId: row.site_id,
     allowedMethods,
     maxQps: row.max_qps,
     maxConcurrent: row.max_concurrent,
+    minGapMs: row.min_gap_ms ?? DEFAULT_SITE_POLICY.minGapMs,
     readOnlyDefault: row.read_only_default === 1,
     requireConfirmation: JSON.parse(row.require_confirmation),
     domainAllowlist: row.domain_allowlist ? JSON.parse(row.domain_allowlist) : [],
     redactionRules: JSON.parse(row.redaction_rules),
     capabilities: JSON.parse(row.capabilities),
+    browserRequired: row.browser_required === 1,
     executionBackend: (row.execution_backend as SitePolicy['executionBackend']) ?? undefined,
     executionSessionName: row.execution_session_name ?? undefined,
-  };
+  });
 } catch (err) {
   const policyLog = getLogger();
   policyLog.error(
@@ -121,6 +127,46 @@ function loadPolicyFromDb(siteId: string, config?: SchruteConfig): SitePolicy |
   }
 }

+const VALID_EXECUTION_BACKENDS = new Set<NonNullable<SitePolicy['executionBackend']>>([
+  'playwright',
+  'agent-browser',
+  'live-chrome',
+]);
+
+function normalizeSitePolicy(policy: SitePolicy): SitePolicy {
+  return {
+    ...DEFAULT_SITE_POLICY,
+    ...policy,
+    browserRequired: policy.browserRequired === true,
+  };
+}
+
+function validateSitePolicy(policy: SitePolicy): void {
+  if (policy.minGapMs !== undefined && (!Number.isFinite(policy.minGapMs) || policy.minGapMs < 0)) {
+    throw new Error(`minGapMs must be a finite number >= 0. Got '${policy.minGapMs}'.`);
+  }
+
+  if (policy.browserRequired !== undefined && typeof policy.browserRequired !== 'boolean') {
+    throw new Error(`browserRequired must be boolean when provided. Got '${typeof policy.browserRequired}'.`);
+  }
+
+  if (policy.executionBackend !== undefined && !VALID_EXECUTION_BACKENDS.has(policy.executionBackend)) {
+    throw new Error(
+      `executionBackend must be one of: ${[...VALID_EXECUTION_BACKENDS].join(', ')}. ` +
+      `Got executionBackend='${policy.executionBackend}'.`,
+    );
+  }
+
+  if (policy.executionSessionName
+    && policy.executionBackend !== 'playwright'
+    && policy.executionBackend !== 'live-chrome') {
+    throw new Error(
+      `executionSessionName requires executionBackend='playwright' or 'live-chrome'. ` +
+      `Got executionBackend='${policy.executionBackend ?? 'undefined'}'.`,
+    );
+  }
+}
+
 export function getSitePolicy(siteId: string, config?: SchruteConfig): SitePolicy {
   const key = policyCacheKey(siteId, config);
   const cached = sitePolicies.get(key);
@@ -135,22 +181,15 @@ export function getSitePolic
     return dbPolicy;
   }

-  return { siteId, ...DEFAULT_SITE_POLICY };
+  return normalizeSitePolicy({ siteId, ...DEFAULT_SITE_POLICY });
 }

 export function setSitePolicy(policy: SitePolicy, config?: SchruteConfig): { persisted: boolean } {
-  // Validate: executionSessionName requires executionBackend='playwright'
-  if (policy.executionSessionName
-    && policy.executionBackend !== 'playwright'
-    && policy.executionBackend !== 'live-chrome') {
-    throw new Error(
-      `executionSessionName requires executionBackend='playwright' or 'live-chrome'. ` +
-      `Got executionBackend='${policy.executionBackend ?? 'undefined'}'.`,
-    );
-  }
+  validateSitePolicy(policy);
+  const normalized = normalizeSitePolicy(policy);

-  const key = policyCacheKey(policy.siteId, config);
-  sitePolicies.set(key, policy);
+  const key = policyCacheKey(normalized.siteId, config);
+  sitePolicies.set(key, normalized);

   try {
     const db = getDatabase(config);
@@ -161,23 +200,23 @@ export function setSitePolicy(policy: SitePolicy, config?: SchruteConfig): { per
     // we only need the row to exist, not to update it.
     db.run(
       `INSERT OR IGNORE INTO sites (id, first_seen, last_visited) VALUES (?, ?, ?)`,
-      policy.siteId, Date.now(), Date.now(),
+      normalized.siteId, Date.now(), Date.now(),
     );

     db.run(
-      `INSERT OR REPLACE INTO policies (site_id, allowed_methods, max_qps, max_concurrent, read_only_default,
-                                        require_confirmation, domain_allowlist, redaction_rules, capabilities,
-                                        execution_backend, execution_session_name)
-       VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
-      policy.siteId, JSON.stringify(policy.allowedMethods), policy.maxQps, policy.maxConcurrent,
-      policy.readOnlyDefault ? 1 : 0, JSON.stringify(policy.requireConfirmation),
-      JSON.stringify(policy.domainAllowlist), JSON.stringify(policy.redactionRules),
-      JSON.stringify(policy.capabilities),
-      policy.executionBackend ?? null, policy.executionSessionName ?? null,
+      `INSERT OR REPLACE INTO policies (site_id, allowed_methods, max_qps, max_concurrent, min_gap_ms, read_only_default,
+                                        require_confirmation, domain_allowlist, redaction_rules, capabilities,
+                                        browser_required, execution_backend, execution_session_name)
+       VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
+      normalized.siteId, JSON.stringify(normalized.allowedMethods), normalized.maxQps, normalized.maxConcurrent,
+      normalized.minGapMs ?? DEFAULT_SITE_POLICY.minGapMs, normalized.readOnlyDefault ? 1 : 0, JSON.stringify(normalized.requireConfirmation),
+      JSON.stringify(normalized.domainAllowlist), JSON.stringify(normalized.redactionRules),
+      JSON.stringify(normalized.capabilities), normalized.browserRequired ? 1 : 0,
+      normalized.executionBackend ?? null, normalized.executionSessionName ?? null,
     );

     return { persisted: true };
   } catch (err) {
     const log = getLogger();
-    log.warn({ siteId: policy.siteId, err }, 'Failed to persist site policy to database');
+    log.warn({ siteId: normalized.siteId, err }, 'Failed to persist site policy to database');
     return { persisted: false };
   }
 }
diff --git a/src/core/tiering.ts b/src/core/tiering.ts
index 1a01390..9b16782 100644
--- a/src/core/tiering.ts
+++ b/src/core/tiering.ts
@@ -34,6 +34,14 @@ interface SemanticResult {
   hasDynamicRequiredFields: boolean;
 }

+const PERMANENT_LOCK_REASON_LABELS: Record<PermanentTierLock['reason'], string> = {
+  js_computed_field: 'JS-computed field',
+  protocol_sensitivity: 'Protocol sensitivity',
+  signed_payload: 'Signed payload',
+  webmcp_requires_browser: 'WebMCP requires browser',
+  browser_required: 'Browser required',
+};
+
 // ─── Tier State Machine ─────────────────────────────────────────

 /**
@@ -82,7 +90,7 @@ export function checkPromotion(
     return {
       promote: false,
       lock: skill.tierLock,
-      reason: `Permanently locked: ${skill.tierLock.reason}`,
+      reason: `Permanently locked: ${formatPermanentTierLockReason(skill.tierLock.reason)}`,
     };
   }

@@ -169,12 +177,16 @@ export function handleFailure(
     FailureCause.JS_COMPUTED_FIELD,
     FailureCause.PROTOCOL_SENSITIVITY,
     FailureCause.SIGNED_PAYLOAD,
+    FailureCause.CLOUDFLARE_CHALLENGE,
   ];

   if (permanentCauses.includes(failureCause)) {
+    const reason: PermanentTierLock['reason'] = failureCause === FailureCause.CLOUDFLARE_CHALLENGE
+      ?
'browser_required' + : failureCause as Exclude; const lock: PermanentTierLock = { type: 'permanent', - reason: failureCause as PermanentTierLock['reason'], + reason, evidence: `Failure cause: ${failureCause}`, }; @@ -186,7 +198,7 @@ export function handleFailure( return { newTier: TierState.TIER_3_DEFAULT as TierStateName, tierLock: lock, - reason: `Permanent lock: ${failureCause}`, + reason: `Permanent lock: ${formatPermanentTierLockReason(lock.reason)}`, }; } @@ -226,3 +238,24 @@ export function getEffectiveTier(skill: SkillSpec): TierStateName { return skill.currentTier; } + +export function formatPermanentTierLockReason(reason: PermanentTierLock['reason']): string { + return PERMANENT_LOCK_REASON_LABELS[reason] ?? reason; +} + +export function sanitizeSiteRecommendedTier( + recommendedTier: ExecutionTierName, + browserRequired: boolean, +): ExecutionTierName { + if (browserRequired) { + return recommendedTier === ExecutionTier.FULL_BROWSER + ? ExecutionTier.FULL_BROWSER + : ExecutionTier.BROWSER_PROXIED; + } + + if (recommendedTier === ExecutionTier.COOKIE_REFRESH) { + return ExecutionTier.BROWSER_PROXIED; + } + + return recommendedTier; +} diff --git a/src/discovery/openapi-scanner.ts b/src/discovery/openapi-scanner.ts index 455b65b..5a4a76b 100644 --- a/src/discovery/openapi-scanner.ts +++ b/src/discovery/openapi-scanner.ts @@ -17,6 +17,11 @@ export const PROBE_PATHS = [ '/v3/api-docs', '/.well-known/openapi.json', '/api/openapi.json', + '/spec.json', + '/docs/openapi.json', + '/swagger/v1/swagger.json', + '/api/v1/openapi.json', + '/api/swagger.json', ]; // ─── Public API ────────────────────────────────────────────────────── diff --git a/src/doctor.ts b/src/doctor.ts index 6afbb33..4c2bce0 100644 --- a/src/doctor.ts +++ b/src/doctor.ts @@ -348,7 +348,7 @@ function checkAuditHashChain(config: SchruteConfig): CheckResult { if (!verification.valid) { return { name: 'audit_hash_chain', - status: 'fail', + status: 'warning', message: `Audit hash chain broken at 
entry ${verification.brokenAt} (${verification.totalEntries} entries)`, details: `${verification.message ?? ''} (expected when database is shared across dev sessions or keychain key was rotated)`.trim(), }; diff --git a/src/index.ts b/src/index.ts index 280b5b0..c41cd74 100644 --- a/src/index.ts +++ b/src/index.ts @@ -1,5 +1,6 @@ #!/usr/bin/env node import * as fs from 'node:fs'; +import * as readline from 'node:readline'; import { Command } from 'commander'; import { getConfig, setConfigValue, ensureDirectories } from './core/config.js'; import { createLogger } from './core/logger.js'; @@ -18,6 +19,7 @@ import { validateSkill } from './skill/validator.js'; import { validateImportableSkill, validateImportableSite } from './storage/import-validator.js'; import type { SkillSpec, SiteManifest, SitePolicy } from './skill/types.js'; import { getSitePolicy } from './core/policy.js'; +import { formatPermanentTierLockReason } from './core/tiering.js'; import { VERSION } from './version.js'; import { ConfigError } from './core/config.js'; @@ -33,10 +35,12 @@ program // ─── Helper: get remote client from global opts ───────────────── -function getRemoteClient(): RemoteClient | null { - const opts = program.opts<{ url?: string; token?: string; json?: boolean }>(); - if (!opts.url) return null; - return new RemoteClient(opts.url, opts.token); +function getRemoteClient(cmdOpts?: { url?: string; token?: string }): RemoteClient | null { + const globalOpts = program.opts<{ url?: string; token?: string; json?: boolean }>(); + const url = cmdOpts?.url ?? globalOpts.url; + const token = cmdOpts?.token ?? globalOpts.token; + if (!url) return null; + return new RemoteClient(url, token); } function outputResult(data: unknown): void { @@ -50,21 +54,86 @@ function outputResult(data: unknown): void { } } +function parseJsonInput(label: string, value: string): T { + try { + return JSON.parse(value) as T; + } catch (err) { + throw new Error(`Invalid ${label} JSON: ${err instanceof Error ? 
err.message : String(err)}`); + } +} + +// ─── Helper: progress indicator ───────────────────────────────── + +function startProgress(msg: string): () => void { + if (!process.stderr.isTTY) return () => {}; + process.stderr.write(msg); + const id = setInterval(() => process.stderr.write('.'), 1000); + return () => { clearInterval(id); process.stderr.write('\n'); }; +} + +// ─── Helper: retry on 429 ─────────────────────────────────────── + +async function executeWithRetry( + fn: () => Promise, + retries = 2, + delayMs = 2000, +): Promise { + for (let attempt = 0; ; attempt++) { + const result = await fn(); + const data = result as Record; + // Daemon returns { status: 'executed', result: { failureCause: 'rate_limited', failureDetail: '...NNNms...' } } + const inner = (data?.result ?? data) as Record | undefined; + if (inner?.failureCause === 'rate_limited' && attempt < retries) { + const detail = typeof inner.failureDetail === 'string' ? inner.failureDetail : ''; + const wait = parseInt(detail.match(/(\d+)ms/)?.[1] ?? String(delayMs)); + console.error(`Rate limited — retrying in ${wait}ms...`); + await new Promise(r => setTimeout(r, wait + 100)); + continue; + } + return result; + } +} + +async function executeWithRetryRemote( + fn: () => Promise, + retries = 2, + delayMs = 2000, +): Promise { + for (let attempt = 0; ; attempt++) { + try { + return await fn(); + } catch (err) { + const msg = err instanceof Error ? 
err.message : String(err); + if (/429|rate.limit/i.test(msg) && attempt < retries) { + console.error(`Rate limited — retrying in ${delayMs}ms...`); + await new Promise(r => setTimeout(r, delayMs)); + continue; + } + throw err; + } + } +} + // ─── explore ──────────────────────────────────────────────────── program .command('explore ') .description('Open a browser session to explore a website') .addHelpText('after', '\n(requires local daemon or --url)') - .action(async (url: string) => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (url: string, cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { + const stop = startProgress('Exploring'); try { const result = await remote.explore(url); outputResult(result); } catch (err) { console.error('Error:', err instanceof Error ? err.message : String(err)); process.exit(1); + } finally { + stop(); } return; } @@ -74,6 +143,7 @@ program ensureDirectories(config); const client = createDaemonClient(config); + const stop = startProgress('Exploring'); try { const result = await client.request('POST', '/ctl/explore', { url }); console.log('Explore session started:'); @@ -81,6 +151,8 @@ program } catch (err) { console.error('Error:', err instanceof Error ? 
err.message : String(err)); process.exit(1); + } finally { + stop(); } }); @@ -90,8 +162,10 @@ program .command('status') .description('Show server status') .addHelpText('after', '\n(requires local daemon or --url)') - .action(async () => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { try { const result = await remote.getStatus(); @@ -122,8 +196,10 @@ program .command('sessions') .description('List active sessions') .addHelpText('after', '\n(requires local daemon or --url)') - .action(async () => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { try { const result = await remote.listSessions(); @@ -156,7 +232,9 @@ program .addHelpText('after', '\n(requires local daemon or --url)') .requiredOption('--name ', 'Name for the action frame') .option('--input ', 'Input key=value pairs') - .action(async (options: { name: string; input?: string[] }) => { + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (options: { name: string; input?: string[]; url?: string; token?: string }) => { // Parse input pairs let inputs: Record | undefined; if (options.input) { @@ -169,7 +247,7 @@ program } } - const remote = getRemoteClient(); + const remote = getRemoteClient(options); if (remote) { try { const result = await remote.startRecording(options.name, inputs); @@ -202,15 +280,20 @@ program .command('stop') .description('Stop recording and process the action frame') .addHelpText('after', '\n(requires local daemon or --url)') - .action(async () => { - const remote = getRemoteClient(); + 
.option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { + const stop = startProgress('Stopping'); try { const result = await remote.stopRecording(); outputResult(result); } catch (err) { console.error('Error:', err instanceof Error ? err.message : String(err)); process.exit(1); + } finally { + stop(); } return; } @@ -219,6 +302,7 @@ program createLogger(config.logLevel); const client = createDaemonClient(config); + const stop = startProgress('Stopping'); try { const result = await client.request('POST', '/ctl/stop'); console.log('Recording stopped:'); @@ -226,6 +310,42 @@ program } catch (err) { console.error('Error:', err instanceof Error ? err.message : String(err)); process.exit(1); + } finally { + stop(); + } + }); + +// ─── pipeline ─────────────────────────────────────────────────── + +program + .command('pipeline ') + .description('Show background pipeline job status') + .addHelpText('after', '\n(requires local daemon or --url)') + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (jobId: string, cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); + if (remote) { + try { + const result = await remote.getPipelineStatus(jobId); + outputResult(result); + } catch (err) { + console.error('Error:', err instanceof Error ? err.message : String(err)); + process.exit(1); + } + return; + } + + const config = getConfig(); + createLogger(config.logLevel); + + const client = createDaemonClient(config); + try { + const result = await client.request('GET', `/ctl/pipeline/${encodeURIComponent(jobId)}`); + outputResult(result); + } catch (err) { + console.error('Error:', err instanceof Error ? 
err.message : String(err)); + process.exit(1); } }); @@ -242,11 +362,21 @@ const skillsCmd = program skillsCmd .command('list [site]') .description('List skills, optionally filtered by site') - .action(async (site?: string) => { - const remote = getRemoteClient(); + .option('--status ', 'Filter by status (draft, active, stale, broken)') + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (site: string | undefined, cmdOpts: { status?: string; url?: string; token?: string }) => { + const { SkillStatus } = await import('./skill/types.js'); + const validStatuses = Object.values(SkillStatus) as string[]; + if (cmdOpts.status && !validStatuses.includes(cmdOpts.status)) { + console.error(`Invalid status '${cmdOpts.status}'. Valid: ${validStatuses.join(', ')}`); + process.exit(1); + } + + const remote = getRemoteClient(cmdOpts); if (remote) { try { - const result = await remote.listSkills(site ?? undefined); + const result = await remote.listSkills(site ?? undefined, cmdOpts.status); outputResult(result); } catch (err) { console.error('Error:', err instanceof Error ? err.message : String(err)); @@ -269,6 +399,10 @@ skillsCmd skills = skillRepo.getAll(); } + if (cmdOpts.status) { + skills = skills.filter(s => s.status === cmdOpts.status); + } + if (program.opts().json) { outputResult(skills); closeDatabase(); @@ -297,8 +431,10 @@ skillsCmd .description('Search skills by query') .option('--limit ', 'Max results', '20') .option('--site ', 'Filter to a specific site') - .action(async (query?: string, options?: { limit?: string; site?: string }) => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (query?: string, options?: { limit?: string; site?: string; url?: string; token?: string }) => { + const remote = getRemoteClient(options); if (remote) { try { const limit = options?.limit ? 
parseInt(options.limit, 10) : undefined; @@ -410,7 +546,7 @@ skillsCmd if (skill.currentTier === 'tier_1') { // skip — already direct } else if (skill.tierLock?.type === 'permanent') { - whyNotDirect = `Permanently locked: ${skill.tierLock.reason}`; + whyNotDirect = `Permanently locked: ${formatPermanentTierLockReason(skill.tierLock.reason)}`; } else if ((skill.directCanaryAttempts ?? 0) > 0 && !skill.directCanaryEligible) { whyNotDirect = `Direct canary failed (${skill.lastCanaryErrorType ?? 'unknown'}). ${skill.directCanaryAttempts} attempts.`; } else if (skill.directCanaryEligible) { @@ -430,6 +566,121 @@ skillsCmd closeDatabase(); }); +skillsCmd + .command('set-transform ') + .description('Set or clear a skill output transform') + .option('--transform ', 'Transform JSON definition') + .option('--response-content-type ', 'Override response content type, e.g. text/html') + .option('--clear', 'Clear the current transform') + .action((skillId: string, options: { transform?: string; responseContentType?: string; clear?: boolean }) => { + if (options.clear && options.transform) { + console.error('Error: --clear cannot be combined with --transform.'); + process.exit(1); + } + if (!options.clear && !options.transform && !options.responseContentType) { + console.error('Error: provide --transform, --response-content-type, or --clear.'); + process.exit(1); + } + + const config = getConfig(); + createLogger(config.logLevel); + ensureDirectories(config); + + const db = getDatabase(config); + const skillRepo = new SkillRepository(db); + const skill = skillRepo.getById(skillId); + if (!skill) { + console.error(`Skill '${skillId}' not found.`); + closeDatabase(); + process.exit(1); + } + + let transform = undefined; + if (options.transform) { + try { + transform = parseJsonInput('transform', options.transform); + } catch (err) { + console.error(err instanceof Error ? 
err.message : String(err)); + closeDatabase(); + process.exit(1); + } + } + + const nextSkill: SkillSpec = { + ...skill, + ...(options.clear ? { outputTransform: undefined } : {}), + ...(transform !== undefined ? { outputTransform: transform } : {}), + ...(options.responseContentType !== undefined ? { responseContentType: options.responseContentType } : {}), + }; + const validation = validateImportableSkill(nextSkill); + if (!validation.valid) { + console.error(`Error: ${validation.errors.join('; ')}`); + closeDatabase(); + process.exit(1); + } + + skillRepo.update(skill.id, { + ...(options.clear ? { outputTransform: null as unknown as SkillSpec['outputTransform'] } : {}), + ...(transform !== undefined ? { outputTransform: transform } : {}), + ...(options.responseContentType !== undefined ? { responseContentType: options.responseContentType } : {}), + }); + + const updated = skillRepo.getById(skill.id); + console.log(JSON.stringify({ + updated: true, + skillId: skill.id, + outputTransform: updated?.outputTransform, + responseContentType: updated?.responseContentType, + }, null, 2)); + closeDatabase(); + }); + +skillsCmd + .command('export ') + .description('Export a skill as standalone code') + .requiredOption('--format ', 'Export format: curl, fetch.ts, requests.py, playwright.ts') + .option('--params ', 'Parameter JSON used to resolve the request') + .action(async (skillId: string, options: { format: string; params?: string }) => { + const format = options.format as 'curl' | 'fetch.ts' | 'requests.py' | 'playwright.ts'; + if (!['curl', 'fetch.ts', 'requests.py', 'playwright.ts'].includes(format)) { + console.error(`Invalid format '${options.format}'. Valid: curl, fetch.ts, requests.py, playwright.ts`); + process.exit(1); + } + + let params: Record | undefined; + if (options.params) { + try { + params = parseJsonInput>('params', options.params); + } catch (err) { + console.error(err instanceof Error ? 
err.message : String(err)); + process.exit(1); + } + } + + const config = getConfig(); + createLogger(config.logLevel); + ensureDirectories(config); + + const db = getDatabase(config); + const skillRepo = new SkillRepository(db); + const skill = skillRepo.getById(skillId); + if (!skill) { + console.error(`Skill '${skillId}' not found.`); + closeDatabase(); + process.exit(1); + } + + const { generateExport } = await import('./skill/generator.js'); + const code = generateExport(skill, format, params); + + if (program.opts<{ json?: boolean }>().json) { + outputResult({ skillId, format, code }); + } else { + console.log(code); + } + closeDatabase(); + }); + skillsCmd .command('validate ') .description('Trigger validation for a skill') @@ -505,7 +756,8 @@ skillsCmd skillsCmd .command('delete ') .description('Permanently delete a skill') - .action(async (skillId: string) => { + .option('--yes', 'Skip confirmation') + .action(async (skillId: string, options: { yes?: boolean }) => { const config = getConfig(); createLogger(config.logLevel); ensureDirectories(config); @@ -518,6 +770,27 @@ skillsCmd closeDatabase(); process.exit(1); } + + if (!options.yes) { + console.log(`About to delete skill '${skillId}':`); + console.log(` Name: ${skill.name}`); + console.log(` Method: ${skill.method} ${skill.pathTemplate}`); + console.log(` Status: ${skill.status}`); + if (!process.stdin.isTTY) { + console.error('Non-interactive terminal: use --yes to confirm deletion.'); + closeDatabase(); + process.exit(1); + } + const rl = readline.createInterface({ input: process.stdin, output: process.stdout }); + const answer = await new Promise(resolve => rl.question('Delete this skill? 
[y/N] ', resolve)); + rl.close(); + if (answer.toLowerCase() !== 'y') { + console.log('Aborted.'); + closeDatabase(); + return; + } + } + skillRepo.delete(skillId); console.log(`Deleted skill '${skillId}' (${skill.name}).`); closeDatabase(); @@ -526,8 +799,10 @@ skillsCmd skillsCmd .command('revoke ') .description('Revoke permanent approval for a skill') - .action(async (skillId: string) => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (skillId: string, cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { try { const result = await remote.revokeApproval(skillId); @@ -576,8 +851,10 @@ skillsCmd skillsCmd .command('amendments ') .description('List amendments for a skill') - .action(async (skillId: string) => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (skillId: string, cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { try { const result = await remote.request('GET', `/skills/${encodeURIComponent(skillId)}/amendments`); @@ -603,8 +880,10 @@ skillsCmd skillsCmd .command('optimize ') .description('Run GEPA offline optimization on a skill') - .action(async (skillId: string) => { - const remote = getRemoteClient(); + .option('--url ', 'Remote Schrute server URL') + .option('--token ', 'Auth token for remote server') + .action(async (skillId: string, cmdOpts: { url?: string; token?: string }) => { + const remote = getRemoteClient(cmdOpts); if (remote) { try { const result = await remote.request('POST', `/skills/${encodeURIComponent(skillId)}/optimize`); @@ -745,6 +1024,116 @@ skillsCmd closeDatabase(); }); +const workflowCmd = program + .command('workflow') + .description('Manage linear workflow skills'); + +workflowCmd + .command('create') + 
.description('Create a workflow skill from existing active read-only skills')
+  .requiredOption('--site <siteId>', 'Site ID that owns the workflow')
+  .requiredOption('--name <name>', 'Workflow name')
+  .requiredOption('--spec <json>', 'WorkflowSpec JSON')
+  .option('--description <text>', 'Workflow description')
+  .option('--output-transform <json>', 'Optional final output transform JSON')
+  .action(async (options: { site: string; name: string; spec: string; description?: string; outputTransform?: string }) => {
+    const config = getConfig();
+    createLogger(config.logLevel);
+    ensureDirectories(config);
+
+    const db = getDatabase(config);
+    const siteRepo = new SiteRepository(db);
+    if (!siteRepo.getById(options.site)) {
+      console.error(`Site '${options.site}' not found.`);
+      closeDatabase();
+      process.exit(1);
+    }
+
+    let workflowSpec: SkillSpec['workflowSpec'];
+    let outputTransform: SkillSpec['outputTransform'];
+    try {
+      workflowSpec = parseJsonInput<SkillSpec['workflowSpec']>('workflow spec', options.spec);
+      outputTransform = options.outputTransform
+        ? parseJsonInput<SkillSpec['outputTransform']>('output transform', options.outputTransform)
+        : undefined;
+    } catch (err) {
+      console.error(err instanceof Error ? err.message : String(err));
+      closeDatabase();
+      process.exit(1);
+    }
+
+    const { generateWorkflowSkill } = await import('./skill/generator.js');
+    const workflowSkill = generateWorkflowSkill(options.site, options.name, workflowSpec!, {
+      description: options.description,
+      outputTransform,
+    });
+
+    const skillRepo = new SkillRepository(db);
+    if (skillRepo.getById(workflowSkill.id)) {
+      console.error(`Skill '${workflowSkill.id}' already exists.`);
+      closeDatabase();
+      process.exit(1);
+    }
+
+    const validation = validateImportableSkill(workflowSkill);
+    if (!validation.valid) {
+      console.error(`Error: ${validation.errors.join('; ')}`);
+      closeDatabase();
+      process.exit(1);
+    }
+
+    skillRepo.create(workflowSkill);
+    outputResult({
+      created: true,
+      skillId: workflowSkill.id,
+      workflowSpec: workflowSkill.workflowSpec,
+    });
+    closeDatabase();
+  });
+
+workflowCmd
+  .command('run <skillId>')
+  .description('Run a workflow skill locally')
+  .option('--params <json>', 'Initial params as JSON')
+  .action(async (skillId: string, options: { params?: string }) => {
+    let params: Record<string, unknown> = {};
+    if (options.params) {
+      try {
+        params = parseJsonInput<Record<string, unknown>>('params', options.params);
+      } catch (err) {
+        console.error(err instanceof Error ? err.message : String(err));
+        process.exit(1);
+      }
+    }
+
+    const config = getConfig();
+    createLogger(config.logLevel);
+    ensureDirectories(config);
+
+    const db = getDatabase(config);
+    const skillRepo = new SkillRepository(db);
+    const skill = skillRepo.getById(skillId);
+    if (!skill) {
+      console.error(`Skill '${skillId}' not found.`);
+      closeDatabase();
+      process.exit(1);
+    }
+    if (!skill.workflowSpec) {
+      console.error(`Skill '${skillId}' is not a workflow.`);
+      closeDatabase();
+      process.exit(1);
+    }
+
+    const engine = new Engine(config);
+    try {
+      const result = await engine.executeSkill(skill.id, params);
+      outputResult(result);
+    } finally {
+      await engine.close();
+      closeDatabase();
+    }
+  });
+
// ─── execute ────────────────────────────────────────────────────

program
@@ -753,7 +1142,9 @@ program
  .addHelpText('after', '\n(requires local daemon or --url)')
  .option('--yes', 'Auto-confirm and permanently approve the skill')
  .option('--json', 'Output as JSON')
-  .action(async (skillId: string, paramPairs: string[], options: { yes?: boolean; json?: boolean }) => {
+  .option('--url <url>', 'Remote Schrute server URL')
+  .option('--token <token>', 'Auth token for remote server')
+  .action(async (skillId: string, paramPairs: string[], options: { yes?: boolean; json?: boolean; url?: string; token?: string }) => {
    // Parse key=value params
    const params: Record<string, string> = {};
    for (const pair of paramPairs) {
@@ -769,11 +1160,12 @@ program
      params[key] = value;
    }
-    const remote = getRemoteClient();
+    const remote = getRemoteClient(options);
    if (remote) {
      // Remote mode
+      const stop = startProgress('Executing');
      try {
-        let result = await remote.executeSkill(skillId, params);
+        let result = await executeWithRetryRemote(() => remote.executeSkill(skillId, params));
        // Handle confirmation flow
        const data = result as Record<string, unknown>;
        if (data.status === 'confirmation_required') {
@@ -787,20 +1179,23 @@ program
          // Auto-confirm
          await remote.confirm(data.confirmationToken as string, true);
          console.log(`Skill '${skillId}' permanently approved for execution.`);
-          result = await remote.executeSkill(skillId, params);
+          result = await executeWithRetryRemote(() => remote.executeSkill(skillId, params));
        }
        outputResult(result);
      } catch (err) {
        console.error('Error:', err instanceof Error ? err.message : String(err));
        process.exit(1);
+      } finally {
+        stop();
      }
      return;
    }

    // Local mode — go through daemon
+    const stop = startProgress('Executing');
    try {
      const daemonClient = createDaemonClient(getConfig());
-      let result = await daemonClient.request('POST', '/ctl/execute', { skillId, params });
+      let result = await executeWithRetry(() => daemonClient.request('POST', '/ctl/execute', { skillId, params }));
      const data = result as Record<string, unknown>;
      if (data.status === 'confirmation_required') {
        if (!options.yes) {
@@ -812,12 +1207,14 @@ program
        }
        await daemonClient.request('POST', '/ctl/confirm', { token: data.confirmationToken, approve: true });
        console.log(`Skill '${skillId}' permanently approved for execution.`);
-        result = await daemonClient.request('POST', '/ctl/execute', { skillId, params });
+        result = await executeWithRetry(() => daemonClient.request('POST', '/ctl/execute', { skillId, params }));
      }
      outputResult(result);
    } catch (err) {
      console.error('Error:', err instanceof Error ?
err.message : String(err));
      process.exit(1);
+    } finally {
+      stop();
+    }
  });
@@ -834,8 +1231,10 @@ const sitesCmd = program
sitesCmd
  .command('list')
  .description('List all known sites')
-  .action(async () => {
-    const remote = getRemoteClient();
+  .option('--url <url>', 'Remote Schrute server URL')
+  .option('--token <token>', 'Auth token for remote server')
+  .action(async (cmdOpts: { url?: string; token?: string }) => {
+    const remote = getRemoteClient(cmdOpts);
    if (remote) {
      try {
        const result = await remote.listSites();
@@ -1098,7 +1497,8 @@ configCmd
configCmd
  .command('get <key>')
  .description('Get a configuration value')
-  .action((key: string) => {
+  .option('--reveal', 'Show sensitive values without masking')
+  .action((key: string, options: { reveal?: boolean }) => {
    const config = getConfig();
    const keys = key.split('.');
    let current: unknown = config;
@@ -1112,9 +1512,11 @@ configCmd
      }
    }
-    // Mask sensitive leaf values
-    const lastKey = keys[keys.length - 1];
-    current = maskConfigValue(lastKey, current);
+    // Mask sensitive leaf values unless --reveal
+    if (!options.reveal) {
+      const lastKey = keys[keys.length - 1];
+      current = maskConfigValue(lastKey, current);
+    }
    console.log(JSON.stringify(current, null, 2));
  });
@@ -1139,6 +1541,7 @@ program
  .option('--http', 'Enable HTTP transport (REST + MCP HTTP)')
  .option('--port <port>', 'Port number for HTTP server', '3000')
  .option('--no-daemon', 'Skip starting the daemon control socket')
+  .addHelpText('after', '\nTo run multiple instances, set SCHRUTE_DATA_DIR to different directories.')
  .action(async (options: { http?: boolean; port?: string; daemon?: boolean }) => {
    const config = getConfig();
    createLogger(config.logLevel);
@@ -1205,6 +1608,14 @@ program
    const mcpHttpDeps = { ...deps, config: { ...config, server: { ...config.server, network: true } } };
    const mcpHttp = await startMcpHttpServer(mcpHttpDeps, { host, port: port + 1 });
    console.log(`  MCP HTTP: http://${host}:${port + 1}/mcp`);
+    const masked = config.server.authToken
+      ? `${config.server.authToken.slice(0, 4)}***`
+      : '(not set)';
+    console.log(`  Auth token: ${masked} (full value: schrute config get server.authToken --reveal)`);
+    if (!config.server.network) {
+      console.log(`  REST API: no auth (local mode)`);
+      console.log(`  MCP HTTP: requires Bearer token`);
+    }
    if (daemon) {
      const transport = daemon.transport;
      if (transport.mode === 'uds') {
@@ -1321,164 +1732,41 @@ program
program
  .command('import <file>')
  .description('Import skills + manifest + policy from JSON bundle')
-  .action((file: string) => {
+  .option('--yes', 'Skip confirmation')
+  .action(async (file: string, options: { yes?: boolean }) => {
    const config = getConfig();
    createLogger(config.logLevel);
    ensureDirectories(config);
-    if (!fs.existsSync(file)) {
-      console.error(`File '${file}' not found.`);
-      process.exit(1);
-    }
-
-    let bundle: {
-      version: string;
-      site: SiteManifest;
-      skills: SkillSpec[];
-      policy?: SitePolicy;
-    };
-
-    try {
-      const raw = fs.readFileSync(file, 'utf-8');
-      bundle = JSON.parse(raw);
-    } catch (err) {
-      console.error('Failed to parse bundle:', err instanceof Error ? err.message : String(err));
-      process.exit(1);
-    }
-
-    // Validate basic structure
-    if (!bundle.site || !bundle.skills || !Array.isArray(bundle.skills)) {
-      console.error('Invalid bundle format: missing site or skills.');
-      process.exit(1);
-    }
-
-    // ── Validate site before touching DB ──────────────────────────
-    const siteResult = validateImportableSite(bundle.site);
-    if (!siteResult.valid) {
-      console.error(`Site validation failed:\n  ${siteResult.errors.join('\n  ')}`);
-      process.exit(1);
-    }
-
-    // ── Validate each skill; warn + skip invalid ones ─────────────
-    const validSkills: typeof bundle.skills = [];
-    const expectedSiteId = bundle.site.id;
-
-    for (const skill of bundle.skills) {
-      const skillResult = validateImportableSkill(skill);
-      if (!skillResult.valid) {
-        const label = (skill as unknown as Record<string, unknown>).id ??
'(unknown)'; - console.warn( - `Warning: skill '${label}' failed validation — skipping.\n ${skillResult.errors.join('\n ')}`, - ); - continue; - } - - // Preflight: empty allowedDomains - if (Array.isArray(skill.allowedDomains) && skill.allowedDomains.length === 0) { - console.warn( - `Warning: skill '${skill.id}' has no allowedDomains — may not execute without a domain policy.`, - ); - } - - // Preflight: siteId consistency - if (skill.siteId !== expectedSiteId) { - console.warn( - `Warning: skill '${skill.id}' has siteId '${skill.siteId}', expected '${expectedSiteId}'. Skipping.`, - ); - continue; - } - - validSkills.push(skill); - } - - // ── Open DB ─────────────────────────────────────────────────── const db = getDatabase(config); const skillRepo = new SkillRepository(db); const siteRepo = new SiteRepository(db); - // Import site manifest (wrap getById in try-catch for corrupt rows) - let existingSite: SiteManifest | undefined; try { - existingSite = siteRepo.getById(bundle.site.id); - } catch (err) { - console.warn( - `Warning: existing site '${bundle.site.id}' has corrupt data — will overwrite.`, - ); - existingSite = undefined; - } + const { performImport } = await import('./app/import-service.js'); + const result = await performImport(file, { db, skillRepo, siteRepo, config }, options); - if (existingSite) { - siteRepo.update(bundle.site.id, bundle.site); - console.log(`Updated existing site '${bundle.site.id}'.`); - } else { - // If the row exists but was corrupt, delete it first to avoid INSERT OR IGNORE keeping the old row - try { siteRepo.delete(bundle.site.id); } catch (_err) { /* row may not exist */ } - siteRepo.create(bundle.site); - console.log(`Created site '${bundle.site.id}'.`); - } - - // Import valid skills — ensure required NOT NULL DB fields are populated with defaults. - // SkillRepository.create() passes all fields explicitly (no DB DEFAULT fallback), - // so every NOT NULL column needs a value. 
- const now = Date.now(); - for (const skill of validSkills) { - // name is NOT NULL — derive from id (format: "site_id.skill_name.vN") - if (!skill.name) { - const parts = skill.id.split('.'); - skill.name = parts.length >= 2 ? parts[parts.length - 2] : skill.id; - } - if (skill.inputSchema === undefined) skill.inputSchema = {}; - if (skill.sideEffectClass === undefined) skill.sideEffectClass = 'read-only'; - if (skill.currentTier === undefined) skill.currentTier = 'tier_3'; - if (skill.status === undefined) skill.status = 'draft'; - if (skill.confidence === undefined) skill.confidence = 0; - if (skill.consecutiveValidations === undefined) skill.consecutiveValidations = 0; - if (skill.sampleCount === undefined) skill.sampleCount = 0; - if (skill.successRate === undefined) skill.successRate = 0; - if (skill.version === undefined) skill.version = 1; - if (skill.allowedDomains === undefined) skill.allowedDomains = []; - if (skill.isComposite === undefined) skill.isComposite = false; - if (skill.directCanaryEligible === undefined) skill.directCanaryEligible = false; - if (skill.directCanaryAttempts === undefined) skill.directCanaryAttempts = 0; - if (skill.validationsSinceLastCanary === undefined) skill.validationsSinceLastCanary = 0; - if (skill.createdAt === undefined) skill.createdAt = now; - if (skill.updatedAt === undefined) skill.updatedAt = now; - } - let created = 0; - let updated = 0; - for (const skill of validSkills) { - let existingSkill: SkillSpec | undefined; - try { - existingSkill = skillRepo.getById(skill.id); - } catch (err) { - console.warn( - `Warning: existing skill '${skill.id}' has corrupt data — will overwrite.`, - ); - existingSkill = undefined; + if (result.cancelled) { + console.log('Cancelled.'); + return; } - if (existingSkill) { - skillRepo.update(skill.id, skill); - updated++; - } else { - // If the row exists but was corrupt, delete it first - try { skillRepo.delete(skill.id); } catch (_err) { /* row may not exist */ } - 
skillRepo.create(skill); - created++; + if (result.siteAction) { + console.log(`${result.siteAction === 'created' ? 'Created' : 'Updated'} site.`); } + if (result.updated > 0) { + console.log(`Will overwrite ${result.updated} existing skill(s).`); + } + console.log(`Imported ${result.created} new skill(s), updated ${result.updated} existing skill(s).`); + if (result.hasAuthSkills) { + console.log('NOTE: Re-authentication may be required -- credentials are never exported.'); + } + } catch (err) { + console.error('Error:', err instanceof Error ? err.message : String(err)); + process.exit(1); + } finally { + closeDatabase(); } - - if (updated > 0) { - console.log(`Will overwrite ${updated} existing skill(s).`); - } - - console.log(`Imported ${created} new skill(s), updated ${updated} existing skill(s).`); - const hasAuthSkills = validSkills.some((s: SkillSpec) => s.authType != null); - if (hasAuthSkills) { - console.log('NOTE: Re-authentication may be required — credentials are never exported.'); - } - - closeDatabase(); }); // ─── discover ─────────────────────────────────────────────────── @@ -1491,8 +1779,7 @@ program createLogger(config.logLevel); ensureDirectories(config); - console.log(`Discovering APIs at ${url}...`); - + const stop = startProgress('Discovering'); try { const { discoverSite } = await import('./discovery/cold-start.js'); const result = await discoverSite(url, config); @@ -1520,6 +1807,8 @@ program } catch (err) { console.error('Discovery failed:', err instanceof Error ? 
err.message : String(err)); process.exit(1); + } finally { + stop(); } }); diff --git a/src/native/noise-filter.ts b/src/native/noise-filter.ts index d8294f4..9c1cdf0 100644 --- a/src/native/noise-filter.ts +++ b/src/native/noise-filter.ts @@ -39,6 +39,7 @@ export function filterRequestsNative( // Map indices back to entries return { signal: parsed.signalIndices.map(i => entries[i]), + htmlDocument: [], noise: parsed.noiseIndices.map(i => entries[i]), ambiguous: parsed.ambiguousIndices.map(i => entries[i]), }; diff --git a/src/replay/dry-run.ts b/src/replay/dry-run.ts index df5424a..1701b43 100644 --- a/src/replay/dry-run.ts +++ b/src/replay/dry-run.ts @@ -7,6 +7,7 @@ import type { FieldVolatility, } from '../skill/types.js'; import { buildRequest } from './request-builder.js'; +import { formatPermanentTierLockReason } from '../core/tiering.js'; import { redactHeaders as canonicalRedactHeaders, redactBody as canonicalRedactBody, @@ -26,6 +27,8 @@ interface DryRunResult { tier: ExecutionTierName; volatilityReport?: FieldVolatility[]; tierDecision?: string; + outputTransform?: SkillSpec['outputTransform']; + responseContentType?: string; } // ─── Dry Run ──────────────────────────────────────────────────── @@ -81,6 +84,8 @@ export async function dryRun( redactionsApplied, }, tier, + outputTransform: skill.outputTransform, + responseContentType: skill.responseContentType, }; // developer-debug mode includes extra info @@ -126,7 +131,7 @@ function describeTierDecision(skill: SkillSpec): string { if (skill.tierLock) { parts.push(`tierLock.type=${skill.tierLock.type}`); if (skill.tierLock.type === 'permanent') { - parts.push(`tierLock.reason=${skill.tierLock.reason}`); + parts.push(`tierLock.reason=${formatPermanentTierLockReason(skill.tierLock.reason)}`); } } else { parts.push('tierLock=none'); diff --git a/src/replay/executor.ts b/src/replay/executor.ts index 5526d98..c4e1920 100644 --- a/src/replay/executor.ts +++ b/src/replay/executor.ts @@ -27,6 +27,7 @@ import { 
parseResponse } from './response-parser.js';
import { checkSemanticNative as checkSemantic } from '../native/semantic-diff.js';
import { AuditLog } from './audit-log.js';
import { ToolBudgetTracker } from './tool-budget.js';
+import { isCloudflareChallengeSignal } from '../shared/cloudflare-challenge.js';
import {
  checkCapability,
  enforceDomainAllowlist,
@@ -333,6 +334,7 @@ async function executeTier(
      schemaMatch: false,
      semanticPass: false,
      failureCause: FailureCause.FETCH_ERROR,
+      failureDetail: err instanceof Error ? err.message : String(err),
    };
  }
@@ -507,6 +509,10 @@ async function classifyFailure(
    // Without metrics data or insufficient history, fall through to UNKNOWN
  }
+  if (isCloudflareChallengeResponse(response)) {
+    return FailureCause.CLOUDFLARE_CHALLENGE;
+  }
+
  // 2.5: Server errors (5xx) — map to UNKNOWN with better logging
  if (response.status >= 500 && response.status < 600) {
    log.info({ skillId: skill.id, status: response.status }, 'Server error (5xx) — classified as UNKNOWN');
@@ -574,6 +580,13 @@ async function classifyFailure(
  return FailureCause.UNKNOWN;
}
+function isCloudflareChallengeResponse(response: SealedFetchResponse): boolean {
+  return isCloudflareChallengeSignal({
+    headers: response.headers,
+    content: response.body,
+  });
+}
+
// ─── Fetch Implementations ──────────────────────────────────────

async function directFetch(
@@ -808,6 +821,9 @@ async function fullBrowserExecution(
// ─── Helpers ────────────────────────────────────────────────────

function determineTier(skill: SkillSpec): ExecutionTierName {
+  if (skill.tierLock?.type === 'permanent' && skill.tierLock.reason === 'browser_required') {
+    return ExecutionTier.BROWSER_PROXIED;
+  }
  const tierState = getEffectiveTier(skill);
  return tierState === 'tier_1' ? ExecutionTier.DIRECT : ExecutionTier.BROWSER_PROXIED;
}
@@ -866,4 +882,3 @@ async function redactPathTemplate(pathTemplate: string): Promise<string> {
  const nativeResult = salt ?
redactNative(pathTemplate, salt) : null; return nativeResult != null ? String(nativeResult) : redactString(pathTemplate); } - diff --git a/src/replay/regex-transform-worker.ts b/src/replay/regex-transform-worker.ts new file mode 100644 index 0000000..2b62c85 --- /dev/null +++ b/src/replay/regex-transform-worker.ts @@ -0,0 +1,50 @@ +import { parentPort, workerData } from 'node:worker_threads'; + +interface RegexWorkerData { + input: string; + expression: string; + flags?: string; +} + +function projectRegexMatch(match: RegExpExecArray): unknown { + const namedCaptures = match.groups ? { ...match.groups } : undefined; + if (namedCaptures && Object.keys(namedCaptures).length > 0) { + const positionalCaptures = match.slice(1); + if (positionalCaptures.length > Object.keys(namedCaptures).length) { + return { + ...namedCaptures, + $captures: positionalCaptures, + }; + } + return namedCaptures; + } + + if (match.length <= 1) { + return match[0]; + } + const captures = match.slice(1); + return captures.length === 1 ? captures[0] : captures; +} + +function runRegexTransform({ input, expression, flags }: RegexWorkerData): unknown { + const regex = new RegExp(expression, flags); + if (regex.global) { + return Array.from(input.matchAll(regex)).map(projectRegexMatch); + } + + const match = regex.exec(input); + if (!match) { + return undefined; + } + return projectRegexMatch(match); +} + +try { + const result = runRegexTransform(workerData as RegexWorkerData); + parentPort?.postMessage({ ok: true, result }); +} catch (error) { + parentPort?.postMessage({ + ok: false, + error: error instanceof Error ? 
error.message : String(error), + }); +} diff --git a/src/replay/response-parser.ts b/src/replay/response-parser.ts index 6879542..700f1e6 100644 --- a/src/replay/response-parser.ts +++ b/src/replay/response-parser.ts @@ -26,14 +26,30 @@ export function parseResponse( skill: SkillSpec, ): ParsedResponse { const errors: ResponseError[] = []; + const contentType = (response.headers['content-type'] ?? skill.responseContentType ?? '').toLowerCase(); + const isHtmlResponse = contentType.includes('text/html') || contentType.includes('application/xhtml+xml'); // Try parsing the body let data: unknown; - try { - data = JSON.parse(response.body); - } catch { + if (isHtmlResponse) { data = response.body; - if (skill.outputSchema && Object.keys(skill.outputSchema).length > 0) { + } else { + try { + data = JSON.parse(response.body); + } catch { + data = response.body; + if (skill.outputSchema && Object.keys(skill.outputSchema).length > 0) { + errors.push({ + type: 'parse_error', + message: 'Response body is not valid JSON', + }); + } + } + } + + if (!isHtmlResponse && typeof data === 'string' && skill.outputSchema && Object.keys(skill.outputSchema).length > 0) { + const hasParseError = errors.some(error => error.type === 'parse_error'); + if (!hasParseError) { errors.push({ type: 'parse_error', message: 'Response body is not valid JSON', @@ -76,4 +92,3 @@ export function parseResponse( return { data, schemaMatch, errors }; } - diff --git a/src/replay/retry.ts b/src/replay/retry.ts index 4636baf..993fb64 100644 --- a/src/replay/retry.ts +++ b/src/replay/retry.ts @@ -20,6 +20,8 @@ export interface RetryOptions extends ExecutorOptions { maxRetries?: number; /** Site-recommended tier passed from engine (which owns siteRepo) */ siteRecommendedTier?: ExecutionTierName; + /** Site policy can suppress all automatic direct probes */ + directAllowed?: boolean; /** WS-4: Force execution to start at this tier (canary probe support) */ forceStartTier?: ExecutionTierName; /** WS-4: Marks 
this execution as a canary probe */ @@ -62,12 +64,12 @@ export async function retryWithEscalation( ); const result = await executeSkill(skill, params, options); - const startTier = options?.forceTier ?? (getEffectiveTier(skill) === 'tier_1' ? ExecutionTier.DIRECT : ExecutionTier.BROWSER_PROXIED); + const startTier = options?.forceTier ?? determineStartingTier(skill); return { ...result, retryDecisions: [], startingTier: startTier, stepResults: [] }; } // Determine tier cascade based on skill - const tierCascade = buildTierCascade(skill, options?.siteRecommendedTier); + const tierCascade = buildTierCascade(skill, options?.siteRecommendedTier, options?.directAllowed ?? true); let currentTierIndex = 0; // WS-4: Honor forceStartTier — start from its position in cascade, or prepend it @@ -212,6 +214,13 @@ function decideRetry( return decision('abort', currentTier, 'Fetch error — all tiers exhausted'); } + if (cause === FailureCause.CLOUDFLARE_CHALLENGE) { + if (canEscalate) { + return decision('escalate', tierCascade[nextTierIndex], 'Cloudflare challenge — escalating tier'); + } + return decision('abort', currentTier, 'Cloudflare challenge — all tiers exhausted'); + } + // JS computed field, protocol sensitivity, signed payload: escalate tier if ( cause === FailureCause.JS_COMPUTED_FIELD || @@ -252,21 +261,44 @@ function decideRetry( // ─── Tier Cascade ─────────────────────────────────────────────── -function buildTierCascade(skill: SkillSpec, siteRecommendedTier?: ExecutionTierName): ExecutionTierName[] { +function buildTierCascade( + skill: SkillSpec, + siteRecommendedTier?: ExecutionTierName, + directAllowed: boolean = true, +): ExecutionTierName[] { + if (isBrowserRequiredSkill(skill)) { + return [ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; + } if (skill.tierLock?.type === 'permanent' || skill.tierLock?.type === 'temporary_demotion') { return [ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; } const effectiveTier = 
getEffectiveTier(skill); - if (effectiveTier === 'tier_1') { + if (effectiveTier === 'tier_1' && directAllowed) { return [ExecutionTier.DIRECT, ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; } + if (effectiveTier === 'tier_1') { + return [ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; + } // tier_3 but site recommends direct — try direct first, fall back to browser - if (siteRecommendedTier === ExecutionTier.DIRECT) { + if (siteRecommendedTier === ExecutionTier.DIRECT && directAllowed) { return [ExecutionTier.DIRECT, ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; } return [ExecutionTier.BROWSER_PROXIED, ExecutionTier.FULL_BROWSER]; } +function determineStartingTier(skill: SkillSpec): ExecutionTierName { + if (isBrowserRequiredSkill(skill)) { + return ExecutionTier.BROWSER_PROXIED; + } + return getEffectiveTier(skill) === 'tier_1' + ? ExecutionTier.DIRECT + : ExecutionTier.BROWSER_PROXIED; +} + +function isBrowserRequiredSkill(skill: SkillSpec): boolean { + return skill.tierLock?.type === 'permanent' && skill.tierLock.reason === 'browser_required'; +} + // ─── Helpers ──────────────────────────────────────────────────── function sleep(ms: number): Promise { diff --git a/src/replay/transform.ts b/src/replay/transform.ts new file mode 100644 index 0000000..d018b9d --- /dev/null +++ b/src/replay/transform.ts @@ -0,0 +1,326 @@ +import { Worker } from 'node:worker_threads'; +import { JSONPath } from 'jsonpath-plus'; +import { load } from 'cheerio'; +import type { OutputTransform } from '../skill/types.js'; + +export interface AppliedTransformResult { + data: unknown; + rawData?: unknown; + transformApplied: boolean; + label?: string; +} + +const VALID_REGEX_FLAGS = new Set(['d', 'g', 'i', 'm', 's', 'u', 'v', 'y']); +const MAX_REGEX_EXPRESSION_LENGTH = 512; +const MAX_REGEX_INPUT_LENGTH = 100_000; +const REGEX_WORKER_TIMEOUT_MS = 2_000; +const REGEX_WORKER_SOURCE = String.raw` +const { parentPort, workerData } = 
require('node:worker_threads');
+
+function projectRegexMatch(match) {
+  const namedCaptures = match.groups ? { ...match.groups } : undefined;
+  if (namedCaptures && Object.keys(namedCaptures).length > 0) {
+    const positionalCaptures = match.slice(1);
+    if (positionalCaptures.length > Object.keys(namedCaptures).length) {
+      return {
+        ...namedCaptures,
+        $captures: positionalCaptures,
+      };
+    }
+    return namedCaptures;
+  }
+
+  if (match.length <= 1) {
+    return match[0];
+  }
+
+  const captures = match.slice(1);
+  return captures.length === 1 ? captures[0] : captures;
+}
+
+function runRegexTransform({ input, expression, flags }) {
+  const regex = new RegExp(expression, flags);
+  if (regex.global) {
+    return Array.from(input.matchAll(regex)).map(projectRegexMatch);
+  }
+
+  const match = regex.exec(input);
+  return match ? projectRegexMatch(match) : undefined;
+}
+
+try {
+  parentPort.postMessage({ ok: true, result: runRegexTransform(workerData) });
+} catch (error) {
+  parentPort.postMessage({
+    ok: false,
+    error: error instanceof Error ? error.message : String(error),
+  });
+}
+`;
+
+interface RegexWorkerMessage {
+  ok: boolean;
+  result?: unknown;
+  error?: string;
+}
+
+export async function applyTransform(data: unknown, transform?: OutputTransform): Promise<AppliedTransformResult> {
+  if (!transform) {
+    return {
+      data,
+      transformApplied: false,
+    };
+  }
+
+  let transformed: unknown;
+  switch (transform.type) {
+    case 'jsonpath':
+      transformed = applyJsonPathTransform(data, transform.expression);
+      break;
+    case 'regex':
+      transformed = await applyRegexTransform(data, transform.expression, transform.flags);
+      break;
+    case 'css':
+      transformed = applyCssTransform(data, transform);
+      break;
+    default:
+      transformed = data;
+      break;
+  }
+
+  return {
+    data: transformed,
+    transformApplied: true,
+    label: transform.label ?? defaultTransformLabel(transform),
+  };
+}
+
+function applyJsonPathTransform(data: unknown, expression: string): unknown {
+  const json = typeof data === 'string' ?
tryParseJson(data) : data;
+  return JSONPath({
+    path: expression,
+    json: (json ?? null) as string | number | boolean | object | unknown[] | null,
+    wrap: false,
+  });
+}
+
+async function applyRegexTransform(data: unknown, expression: string, flags?: string): Promise<unknown> {
+  const input = stringifyForTextTransform(data);
+  validateRegexTransform(expression, flags, input);
+  return runRegexTransformWorker(input, expression, flags);
+}
+
+function validateRegexTransform(expression: string, flags: string | undefined, input: string): void {
+  if (expression.length > MAX_REGEX_EXPRESSION_LENGTH) {
+    throw new Error(`Regex transform expression exceeds ${MAX_REGEX_EXPRESSION_LENGTH} characters`);
+  }
+  if (input.length > MAX_REGEX_INPUT_LENGTH) {
+    throw new Error(`Regex transform input exceeds ${MAX_REGEX_INPUT_LENGTH} characters`);
+  }
+  if (!flags) {
+    return;
+  }
+
+  const seenFlags = new Set<string>();
+  for (const flag of flags) {
+    if (!VALID_REGEX_FLAGS.has(flag)) {
+      throw new Error(`Invalid regex flag '${flag}'`);
+    }
+    if (seenFlags.has(flag)) {
+      throw new Error(`Duplicate regex flag '${flag}'`);
+    }
+    seenFlags.add(flag);
+  }
+}
+
+async function runRegexTransformWorker(input: string, expression: string, flags?: string): Promise<unknown> {
+  try {
+    return await runRegexInWorker(input, expression, flags);
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    // Worker bootstrap should be reliable, but keep a narrow fallback for runtime init failures.
+    if (msg.includes('ERR_WORKER_PATH') || msg.includes('ERR_WORKER_INIT_FAILED')) {
+      return runRegexInlineWithTimeout(input, expression, flags);
+    }
+    throw err;
+  }
+}
+
+function runRegexInWorker(input: string, expression: string, flags?: string): Promise<unknown> {
+  return new Promise((resolve, reject) => {
+    const worker = new Worker(REGEX_WORKER_SOURCE, {
+      eval: true,
+      execArgv: [],
+      workerData: { input, expression, flags },
+    });
+    let settled = false;
+
+    const cleanup = (): void => {
+      clearTimeout(timer);
+      worker.removeListener('message', onMessage);
+      worker.removeListener('error', onError);
+      worker.removeListener('exit', onExit);
+    };
+
+    const finish = (handler: (value: unknown) => void, value: unknown): void => {
+      if (settled) return;
+      settled = true;
+      cleanup();
+      handler(value);
+    };
+
+    const onMessage = (message: RegexWorkerMessage): void => {
+      void worker.terminate().catch(() => undefined);
+      if (message.ok) {
+        finish(resolve, message.result);
+      } else {
+        finish(reject, new Error(message.error ??
'Regex transform failed'));
+      }
+    };
+
+    const onError = (error: Error): void => {
+      void worker.terminate().catch(() => undefined);
+      finish(reject, error);
+    };
+
+    const onExit = (code: number): void => {
+      if (!settled && code !== 0) {
+        finish(reject, new Error(`Regex transform worker exited with code ${code}`));
+      }
+    };
+
+    const timer = setTimeout(() => {
+      void worker.terminate().catch(() => undefined);
+      finish(reject, new Error(`Regex transform timed out after ${REGEX_WORKER_TIMEOUT_MS}ms`));
+    }, REGEX_WORKER_TIMEOUT_MS);
+
+    worker.on('message', onMessage);
+    worker.once('error', onError);
+    worker.once('exit', onExit);
+  });
+}
+
+function runRegexInlineWithTimeout(input: string, expression: string, flags?: string): Promise<unknown> {
+  return new Promise((resolve, reject) => {
+    const timer = setTimeout(() => {
+      reject(new Error(`Regex transform timed out after ${REGEX_WORKER_TIMEOUT_MS}ms`));
+    }, REGEX_WORKER_TIMEOUT_MS);
+
+    try {
+      const regex = new RegExp(expression, flags);
+      let result: unknown;
+      if (regex.global) {
+        result = Array.from(input.matchAll(regex)).map(projectRegexMatch);
+      } else {
+        const match = regex.exec(input);
+        result = match ? projectRegexMatch(match) : undefined;
+      }
+      clearTimeout(timer);
+      resolve(result);
+    } catch (err) {
+      clearTimeout(timer);
+      reject(err);
+    }
+  });
+}
+
+function projectRegexMatch(match: RegExpExecArray): unknown {
+  const namedCaptures = match.groups ? { ...match.groups } : undefined;
+  if (namedCaptures && Object.keys(namedCaptures).length > 0) {
+    const positionalCaptures = match.slice(1);
+    if (positionalCaptures.length > Object.keys(namedCaptures).length) {
+      return {
+        ...namedCaptures,
+        $captures: positionalCaptures,
+      };
+    }
+    return namedCaptures;
+  }
+
+  if (match.length <= 1) {
+    return match[0];
+  }
+  const captures = match.slice(1);
+  return captures.length === 1 ?
captures[0] : captures;
+}
+
+function applyCssTransform(
+  data: unknown,
+  transform: Extract<OutputTransform, { type: 'css' }>,
+): unknown {
+  const input = stringifyForTextTransform(data);
+  const $ = load(input);
+  const nodes = $(transform.selector);
+
+  switch (transform.mode ?? 'text') {
+    case 'html': {
+      const first = nodes.first();
+      return first.length > 0 ? first.html() ?? undefined : undefined;
+    }
+    case 'attr': {
+      const first = nodes.first();
+      return first.length > 0 ? first.attr(transform.attr ?? '') ?? undefined : undefined;
+    }
+    case 'list':
+      if (transform.fields) {
+        return nodes.toArray().map((node) => {
+          const scope = $(node);
+          const item: Record<string, unknown> = {};
+          for (const [fieldName, field] of Object.entries(transform.fields ?? {})) {
+            const target = scope.find(field.selector).first();
+            item[fieldName] = extractCssValue(target, field.mode ?? 'text', field.attr);
+          }
+          return item;
+        });
+      }
+      return nodes.toArray().map((node) => $(node).text().trim());
+    case 'text':
+    default: {
+      const first = nodes.first();
+      return first.length > 0 ? first.text().trim() : undefined;
+    }
+  }
+}
+
+function extractCssValue(
+  node: ReturnType<ReturnType<typeof load>>,
+  mode: 'text' | 'attr' | 'html',
+  attr?: string,
+): unknown {
+  if (mode === 'attr') {
+    return node.attr(attr ?? '') ?? undefined;
+  }
+  if (mode === 'html') {
+    return node.html() ?? 
undefined; + } + return node.text().trim(); +} + +function defaultTransformLabel(transform: OutputTransform): string { + switch (transform.type) { + case 'jsonpath': + return transform.expression; + case 'regex': + return transform.expression; + case 'css': + return transform.selector; + } +} + +function tryParseJson(input: string): unknown { + try { + return JSON.parse(input); + } catch { + return input; + } +} + +function stringifyForTextTransform(data: unknown): string { + if (typeof data === 'string') { + return data; + } + if (data === undefined || data === null) { + return ''; + } + return JSON.stringify(data); +} diff --git a/src/replay/workflow-executor.ts b/src/replay/workflow-executor.ts new file mode 100644 index 0000000..e8e4409 --- /dev/null +++ b/src/replay/workflow-executor.ts @@ -0,0 +1,382 @@ +import { stableStringify } from '../browser/manager.js'; +import type { SkillExecutionResult } from '../core/engine.js'; +import { SideEffectClass, SkillStatus, type WorkflowSpec, type SkillSpec } from '../skill/types.js'; +import type { SkillRepository } from '../storage/skill-repository.js'; +import { applyTransform } from './transform.js'; + +export interface WorkflowStepResult { + skillId: string; + name?: string; + success: boolean; + data?: unknown; + error?: string; + failureCause?: string; + latencyMs: number; +} + +export interface WorkflowSuccessResult { + success: true; + data: unknown; + stepResults: WorkflowStepResult[]; + totalLatencyMs: number; +} + +export interface WorkflowFailureResult { + success: false; + data: { steps: WorkflowStepResult[] }; + error: string; + failureCause?: string; + failedAtStep?: string; + stepResults: WorkflowStepResult[]; + totalLatencyMs: number; +} + +export type WorkflowResult = + | WorkflowSuccessResult + | WorkflowFailureResult + | (SkillExecutionResult & { status: 'browser_handoff_required' }); + +export interface WorkflowCacheEntry { + createdAt: number; + result: SkillExecutionResult; +} + +export interface 
WorkflowStepCacheStore {
+  get(key: string): WorkflowCacheEntry | undefined;
+  set(key: string, value: WorkflowCacheEntry): unknown;
+  delete(key: string): boolean;
+  entries(): IterableIterator<[string, WorkflowCacheEntry]>;
+}
+
+const WORKFLOW_CACHE_MAX_ENTRY_AGE_MS = 24 * 60 * 60 * 1000;
+
+export async function executeWorkflow(
+  workflow: WorkflowSpec,
+  initialParams: Record<string, unknown>,
+  executeStep: (skillId: string, params: Record<string, unknown>) => Promise<SkillExecutionResult>,
+  skillRepo: SkillRepository,
+  cacheStore: WorkflowStepCacheStore = new Map(),
+): Promise<WorkflowResult> {
+  const startedAt = Date.now();
+  pruneExpiredWorkflowCache(cacheStore);
+  const preflight = validateWorkflowPreflight(workflow, skillRepo);
+  if (!preflight.valid) {
+    return {
+      success: false,
+      data: { steps: [] },
+      error: preflight.error,
+      failedAtStep: preflight.failedAtStep,
+      stepResults: [],
+      totalLatencyMs: Date.now() - startedAt,
+    };
+  }
+
+  const namedSteps = new Map<string, SkillExecutionResult>();
+  const stepResults: WorkflowStepResult[] = [];
+  let previousResult: SkillExecutionResult | undefined;
+
+  for (const step of workflow.steps) {
+    const stepName = step.name ?? step.skillId;
+    let stepParams: Record<string, unknown>;
+    try {
+      stepParams = resolveStepParams(step.paramMapping, initialParams, previousResult, namedSteps);
+    } catch (err) {
+      return {
+        success: false,
+        data: { steps: stepResults },
+        error: err instanceof Error ? err.message : String(err),
+        failedAtStep: stepName,
+        stepResults,
+        totalLatencyMs: Date.now() - startedAt,
+      };
+    }
+
+    const cacheKey = step.cache ? buildWorkflowCacheKey(step.skillId, stepParams) : undefined;
+    let rawStepResult = cacheKey && step.cache
+      ? 
getCachedWorkflowStepResult(cacheStore, cacheKey, step.cache.ttlMs)
+      : undefined;
+
+    if (!rawStepResult) {
+      rawStepResult = await executeWorkflowStepWithRetry(step.skillId, stepParams, executeStep);
+      if (rawStepResult.status === 'browser_handoff_required') {
+        return rawStepResult as SkillExecutionResult & { status: 'browser_handoff_required' };
+      }
+      if (cacheKey && step.cache && rawStepResult.success) {
+        cacheStore.set(cacheKey, {
+          createdAt: Date.now(),
+          result: rawStepResult,
+        });
+      }
+    }
+
+    const stepResult = await applyWorkflowStepTransform(rawStepResult, step.transform);
+
+    stepResults.push({
+      skillId: step.skillId,
+      name: step.name,
+      success: stepResult.success,
+      data: stepResult.data,
+      error: stepResult.error,
+      failureCause: stepResult.failureCause,
+      latencyMs: stepResult.latencyMs,
+    });
+
+    if (!stepResult.success) {
+      return {
+        success: false,
+        data: { steps: stepResults },
+        error: stepResult.error ?? `Workflow step '${stepName}' failed`,
+        failureCause: stepResult.failureCause,
+        failedAtStep: stepName,
+        stepResults,
+        totalLatencyMs: Date.now() - startedAt,
+      };
+    }
+
+    previousResult = stepResult;
+    if (step.name) {
+      namedSteps.set(step.name, stepResult);
+    }
+  }
+
+  return {
+    success: true,
+    data: previousResult?.data,
+    stepResults,
+    totalLatencyMs: Date.now() - startedAt,
+  };
+}
+
+function buildWorkflowCacheKey(skillId: string, params: Record<string, unknown>): string {
+  return `${skillId}|${stableStringify(params)}`;
+}
+
+async function executeWorkflowStepWithRetry(
+  skillId: string,
+  params: Record<string, unknown>,
+  executeStep: (skillId: string, params: Record<string, unknown>) => Promise<SkillExecutionResult>,
+): Promise<SkillExecutionResult> {
+  let result = await executeStep(skillId, params);
+  if (result.status === 'browser_handoff_required' || result.success || result.failureCause !== 'rate_limited') {
+    return result;
+  }
+
+  const waitMs = parseRateLimitRetryAfterMs(result.failureDetail);
+  await new Promise(resolve => setTimeout(resolve, waitMs + 50));
+  result = await executeStep(skillId, 
params);
+  return result;
+}
+
+function parseRateLimitRetryAfterMs(failureDetail?: string): number {
+  const parsed = Number.parseInt(failureDetail?.match(/(\d+)ms/)?.[1] ?? '1000', 10);
+  const retryAfterMs = Number.isFinite(parsed) ? parsed : 1000;
+  return Math.min(Math.max(retryAfterMs, 100), 30_000);
+}
+
+function getCachedWorkflowStepResult(
+  cacheStore: WorkflowStepCacheStore,
+  cacheKey: string,
+  ttlMs: number,
+): SkillExecutionResult | undefined {
+  const cached = cacheStore.get(cacheKey);
+  if (!cached) {
+    return undefined;
+  }
+  const ageMs = Date.now() - cached.createdAt;
+  if (ageMs > WORKFLOW_CACHE_MAX_ENTRY_AGE_MS) {
+    cacheStore.delete(cacheKey);
+    return undefined;
+  }
+  return ageMs <= ttlMs ? cached.result : undefined;
+}
+
+function pruneExpiredWorkflowCache(cacheStore: WorkflowStepCacheStore): void {
+  const now = Date.now();
+  for (const [key, entry] of cacheStore.entries()) {
+    if (now - entry.createdAt > WORKFLOW_CACHE_MAX_ENTRY_AGE_MS) {
+      cacheStore.delete(key);
+    }
+  }
+}
+
+async function applyWorkflowStepTransform(
+  stepResult: SkillExecutionResult,
+  transform: WorkflowSpec['steps'][number]['transform'],
+): Promise<SkillExecutionResult> {
+  if (!stepResult.success || !transform) {
+    return stepResult;
+  }
+
+  const transformed = await applyTransform(stepResult.data, transform);
+  return {
+    ...stepResult,
+    data: transformed.data,
+    ...(transformed.transformApplied ? { transformApplied: true, transformLabel: transformed.label } : {}),
+  };
+}
+
+function validateWorkflowPreflight(
+  workflow: WorkflowSpec,
+  skillRepo: SkillRepository,
+): { valid: true } | { valid: false; error: string; failedAtStep?: string } {
+  const seenNames = new Set<string>();
+  let hasPreviousStep = false;
+
+  for (const step of workflow.steps) {
+    const stepName = step.name ?? step.skillId;
+    for (const ref of Object.values(step.paramMapping ?? 
{})) {
+      const referenceError = validateWorkflowReference(ref, seenNames, hasPreviousStep);
+      if (referenceError) {
+        return { valid: false, error: referenceError, failedAtStep: stepName };
+      }
+    }
+
+    if (step.name) {
+      if (seenNames.has(step.name)) {
+        return { valid: false, error: `Duplicate workflow step name '${step.name}'`, failedAtStep: step.name };
+      }
+    }
+
+    const skill = skillRepo.getById(step.skillId);
+    if (!skill) {
+      return { valid: false, error: `Workflow step skill '${step.skillId}' not found`, failedAtStep: stepName };
+    }
+    if (skill.status !== SkillStatus.ACTIVE) {
+      return { valid: false, error: `Workflow step '${stepName}' is not active (status: ${skill.status})`, failedAtStep: stepName };
+    }
+    if (skill.sideEffectClass !== SideEffectClass.READ_ONLY) {
+      return { valid: false, error: `Workflow step '${stepName}' is not read-only`, failedAtStep: stepName };
+    }
+    if (skill.workflowSpec) {
+      return { valid: false, error: `Workflow step '${stepName}' cannot reference another workflow`, failedAtStep: stepName };
+    }
+    const upperMethod = skill.method.toUpperCase();
+    if (upperMethod !== 'GET' && upperMethod !== 'HEAD') {
+      return { valid: false, error: `Workflow step '${stepName}' must use GET or HEAD`, failedAtStep: stepName };
+    }
+
+    if (step.name) {
+      seenNames.add(step.name);
+    }
+    hasPreviousStep = true;
+  }
+
+  return { valid: true };
+}
+
+function validateWorkflowReference(
+  ref: string,
+  availableStepNames: Set<string>,
+  hasPreviousStep: boolean,
+): string | undefined {
+  if (ref === '$initial' || ref.startsWith('$initial.')) {
+    return undefined;
+  }
+
+  if (ref === '$prev' || ref.startsWith('$prev.')) {
+    return hasPreviousStep
+      ? 
undefined
+      : `Workflow reference '${ref}' is invalid because there is no previous step result`;
+  }
+
+  if (ref.startsWith('$steps.')) {
+    const remainder = ref.slice('$steps.'.length);
+    const stepName = remainder.split('.', 1)[0];
+    if (!stepName) {
+      return `Workflow reference '${ref}' is missing a step name`;
+    }
+    if (!availableStepNames.has(stepName)) {
+      return `Workflow reference '${ref}' points to unknown step '${stepName}'`;
+    }
+    return undefined;
+  }
+
+  return `Unsupported workflow reference '${ref}'`;
+}
+
+function resolveStepParams(
+  paramMapping: Record<string, string> | undefined,
+  initialParams: Record<string, unknown>,
+  previousResult: SkillExecutionResult | undefined,
+  namedSteps: Map<string, SkillExecutionResult>,
+): Record<string, unknown> {
+  const resolved: Record<string, unknown> = {};
+  if (!paramMapping) {
+    return resolved;
+  }
+
+  for (const [key, ref] of Object.entries(paramMapping)) {
+    resolved[key] = resolveReference(ref, initialParams, previousResult, namedSteps);
+  }
+
+  return resolved;
+}
+
+function resolveReference(
+  ref: string,
+  initialParams: Record<string, unknown>,
+  previousResult: SkillExecutionResult | undefined,
+  namedSteps: Map<string, SkillExecutionResult>,
+): unknown {
+  if (ref === '$initial' || ref.startsWith('$initial.')) {
+    return getPathValue(initialParams, ref.slice('$initial'.length), ref);
+  }
+
+  if (ref === '$prev' || ref.startsWith('$prev.')) {
+    if (!previousResult) {
+      throw new Error(`Workflow reference '${ref}' is invalid because there is no previous step result`);
+    }
+    return getPathValue(previousResult, ref.slice('$prev'.length), ref);
+  }
+
+  if (ref.startsWith('$steps.')) {
+    const remainder = ref.slice('$steps.'.length);
+    const stepName = remainder.split('.', 1)[0];
+    if (!stepName) {
+      throw new Error(`Workflow reference '${ref}' is missing a step name`);
+    }
+    const stepResult = namedSteps.get(stepName);
+    if (!stepResult) {
+      throw new Error(`Workflow reference '${ref}' points to unknown step '${stepName}'`);
+    }
+    return getPathValue(stepResult, remainder.slice(stepName.length), ref);
+  }
+
+  throw new 
Error(`Unsupported workflow reference '${ref}'`);
+}
+
+function getPathValue(source: unknown, rawPath: string, ref: string): unknown {
+  if (!rawPath) {
+    return source;
+  }
+
+  const normalized = rawPath.replace(/^\./, '').replace(/\[(\d+)\]/g, '.$1');
+  const segments = normalized.split('.').filter(Boolean);
+  let current: unknown = source;
+
+  for (const segment of segments) {
+    if (current === null || current === undefined) {
+      throw new Error(`Workflow reference '${ref}' resolved to undefined at '${segment}'`);
+    }
+
+    if (Array.isArray(current)) {
+      const index = Number(segment);
+      if (!Number.isInteger(index) || index < 0 || index >= current.length) {
+        throw new Error(`Workflow reference '${ref}' has invalid array index '${segment}'`);
+      }
+      current = current[index];
+      continue;
+    }
+
+    if (typeof current !== 'object' || !(segment in (current as Record<string, unknown>))) {
+      throw new Error(`Workflow reference '${ref}' resolved to undefined at '${segment}'`);
+    }
+
+    current = (current as Record<string, unknown>)[segment];
+  }
+
+  if (current === undefined) {
+    throw new Error(`Workflow reference '${ref}' resolved to undefined`);
+  }
+  return current;
+}
diff --git a/src/server/daemon.ts b/src/server/daemon.ts
index 1ba0b0d..4a5ab88 100644
--- a/src/server/daemon.ts
+++ b/src/server/daemon.ts
@@ -307,7 +307,7 @@ async function handleRequest(
   if (method === 'GET' && url === '/ctl/sessions') {
     const msm = engine.getMultiSessionManager();
     const activeName = msm.getActive();
-    const sessions = msm.list().map(s => ({
+    const sessions = msm.list(undefined, config, { includeInternal: false }).map(s => ({
       name: s.name,
       siteId: s.siteId,
       isCdp: s.isCdp,
diff --git a/src/server/mcp-http.ts b/src/server/mcp-http.ts
index cda8364..71dbab4 100644
--- a/src/server/mcp-http.ts
+++ b/src/server/mcp-http.ts
@@ -74,14 +74,14 @@ export async function startMcpHttpServer(
   registerPromptHandlers(server, depsWithRouter);
   server.setRequestHandler(ListToolsRequestSchema, async (_request, extra) => {
-    const 
listCallerId = (extra as { sessionId?: string })?.sessionId ?? 'mcp-http-unknown'; + const listCallerId = `mcp-http:${(extra as { sessionId?: string })?.sessionId ?? 'unknown'}`; const tools = buildToolList(depsWithRouter, listCallerId); return { tools }; }); server.setRequestHandler(CallToolRequestSchema, async (request, extra) => { const { name, arguments: args } = request.params; - const toolCallerId = (extra as { sessionId?: string })?.sessionId ?? 'mcp-http-unknown'; + const toolCallerId = `mcp-http:${(extra as { sessionId?: string })?.sessionId ?? 'unknown'}`; return dispatchToolCall(name, args as Record | undefined, depsWithRouter, toolCallerId); }); } diff --git a/src/server/rest-server.ts b/src/server/rest-server.ts index ea79d17..2fe7bb8 100644 --- a/src/server/rest-server.ts +++ b/src/server/rest-server.ts @@ -174,6 +174,28 @@ export async function createRestServer(options?: { logger: false, }); + // ─── Empty Body Parser ────────────────────────────────── + app.removeContentTypeParser('application/json'); + app.addContentTypeParser('application/json', { parseAs: 'string' }, (req, body, done) => { + if (!body || (typeof body === 'string' && body.trim() === '')) { + done(null, {}); + } else { + try { done(null, JSON.parse(body as string)); } + catch (e) { done(e as Error, undefined); } + } + }); + + // ─── v0 Deprecation Header ───────────────────────────── + const DEPRECATION_EXEMPT = new Set(['/api/health', '/api/docs', '/api/openapi.json']); + + app.addHook('onSend', async (request, reply) => { + const url = request.url.split('?')[0]; + if (url.startsWith('/api/') && !url.startsWith('/api/v1/') && !DEPRECATION_EXEMPT.has(url)) { + reply.header('Deprecation', 'true'); + reply.header('Link', '; rel="successor-version"'); + } + }); + // ─── Bearer Token Auth ──────────────────────────────────── if (config.server.network && config.server.authToken) { app.addHook('onRequest', async (request, reply) => { @@ -365,7 +387,7 @@ export async function 
createRestServer(options?: { app.get('/api/sessions', async (_request, reply) => { if (!requireAdmin(config, reply)) return; const multiSession = engine.getMultiSessionManager(); - const sessions = multiSession.list().map(s => ({ + const sessions = multiSession.list(undefined, config, { includeInternal: false }).map(s => ({ name: s.name, siteId: s.siteId, isCdp: s.isCdp, @@ -606,6 +628,10 @@ export async function createRestServer(options?: { reply.code(202).send(outcome); return; } + if (outcome.status === 'browser_handoff_required') { + reply.code(202).send(outcome.result); + return; + } if (outcome.result.success) { reply.code(200).send(outcome.result); } else { @@ -795,6 +821,10 @@ export async function createRestServer(options?: { }, reqId)); return; } + if (outcome.status === 'browser_handoff_required') { + reply.code(202).send(apiResponse(outcome.result, reqId)); + return; + } // outcome.status === 'executed' if (outcome.result.success) { reply.code(200).send(apiResponse(outcome.result, reqId)); diff --git a/src/server/router.ts b/src/server/router.ts index 16d5eb1..2ccedc6 100644 --- a/src/server/router.ts +++ b/src/server/router.ts @@ -156,6 +156,14 @@ export function createRouter(deps: RouterDeps) { } const result = await engine.executeSkill(skill.id, params, callerId); + if (result.status === 'browser_handoff_required') { + return { + success: false, + error: result.hint ?? 
'Browser handoff required', + statusCode: 202, + data: result, + }; + } if (result.success) { return { success: true, data: result }; } diff --git a/src/server/skill-helpers.ts b/src/server/skill-helpers.ts index 5b8434c..dbfe7d1 100644 --- a/src/server/skill-helpers.ts +++ b/src/server/skill-helpers.ts @@ -1,7 +1,7 @@ import type { SkillSpec } from '../skill/types.js'; import type { BrowserManager } from '../browser/manager.js'; import type { SkillRepository } from '../storage/skill-repository.js'; -import { getEffectiveTier } from '../core/tiering.js'; +import { getEffectiveTier, formatPermanentTierLockReason } from '../core/tiering.js'; import { TierState, SkillStatus } from '../skill/types.js'; import { rankToolsByIntent, skillToToolDefinition } from './tool-registry.js'; @@ -51,7 +51,7 @@ export function getSkillExecutability( function buildPromotionProgress(skill: SkillSpec): string | undefined { if (skill.currentTier === 'tier_1') return 'Promoted to direct'; - if (skill.tierLock?.type === 'permanent') return `Locked: ${skill.tierLock.reason}`; + if (skill.tierLock?.type === 'permanent') return `Locked: ${formatPermanentTierLockReason(skill.tierLock.reason)}`; if (skill.directCanaryEligible) return 'Ready for direct canary on next execution'; if ((skill.directCanaryAttempts ?? 0) > 0 && !skill.directCanaryEligible) { return `Canary failed (${skill.lastCanaryErrorType ?? 
'unknown'}), ${skill.directCanaryAttempts} attempts`; @@ -145,7 +145,7 @@ export function searchAndProjectSkills( } } - const ranked = rankToolsByIntent(skills, query, limit); + const ranked = rankToolsByIntent(skills, query, limit, { preFiltered: !!matchType }); const results: SkillSearchResult[] = ranked.map(s => { const toolDef = skillToToolDefinition(s); const execInfo = getSkillExecutability(s, browserManager); diff --git a/src/server/tool-dispatch.ts b/src/server/tool-dispatch.ts index 4b6927a..06909a6 100644 --- a/src/server/tool-dispatch.ts +++ b/src/server/tool-dispatch.ts @@ -28,6 +28,7 @@ import { sanitizeSiteId } from '../core/utils.js'; import { setupCdpSitePolicy, validateProxyConfig, validateGeoConfig } from './shared-validation.js'; import { isAdminCaller } from '../shared/admin-auth.js'; import { getShapedStatus } from './status-response.js'; +import { validateImportableSkill } from '../storage/import-validator.js'; const log = getLogger(); @@ -36,7 +37,7 @@ const ADMIN_ONLY_TOOL_NAMES = new Set([ 'schrute_explore', 'schrute_record', 'schrute_stop', 'schrute_pipeline_status', 'schrute_import_cookies', 'schrute_export_cookies', 'schrute_connect_cdp', 'schrute_recover_explore', 'schrute_webmcp_call', - 'schrute_delete_skill', + 'schrute_delete_skill', 'schrute_set_transform', 'schrute_export_skill', 'schrute_create_workflow', ]); // ─── Shared Dependencies ──────────────────────────────────────── @@ -73,12 +74,11 @@ export interface ToolDefinition { // ─── Build Tool List ──────────────────────────────────────────── /** - * Build the list of available MCP tools, including meta tools, browser tools, - * and active skill tools (ranked and shortlisted). + * Build the list of available MCP tools, including active learned skills. * - * In multi-user mode (server.network=true), non-admin callers see only - * non-admin meta tools. They discover skills via schrute_search_skills - * and execute via schrute_execute. 
+ * In multi-user mode, dynamic skill tools are still admin-scoped so non-admin + * callers discover skills via schrute_search_skills and invoke them via + * schrute_execute. */ export function buildToolList(deps: ToolDispatchDeps, callerId?: string): ToolDefinition[] { const { engine, skillRepo, config } = deps; @@ -104,9 +104,6 @@ export function buildToolList(deps: ToolDispatchDeps, callerId?: string): ToolDe } // 3. Dynamic skill tools — admin only in multi-user mode. - // Non-admin callers use schrute_search_skills (explicit siteId) - // + schrute_execute (skillId). Including dynamic skill tools for - // non-admin would couple their tool list to the admin's active session. if (isAdmin) { const multiSession = engine.getMultiSessionManager(); const activeName = multiSession.getActive(); @@ -189,7 +186,7 @@ async function executeSkillWithGating( type: 'text', text: JSON.stringify(result, null, 2), }], - ...(result.success === false ? { isError: true } : {}), + ...(result.success === false && result.status !== 'browser_handoff_required' ? 
{ isError: true } : {}), }; } @@ -372,6 +369,124 @@ export async function dispatchToolCall( }; } + case 'schrute_set_transform': { + const skillId = args?.skillId as string; + if (!skillId) { + return { content: [{ type: 'text', text: 'Error: skillId is required' }], isError: true }; + } + const skill = skillRepo.getById(skillId); + if (!skill) { + return { content: [{ type: 'text', text: `Error: skill '${skillId}' not found` }], isError: true }; + } + + const clear = args?.clear === true; + const transform = args?.transform; + const responseContentType = args?.responseContentType as string | undefined; + + if (clear && transform !== undefined) { + return { content: [{ type: 'text', text: 'Error: clear cannot be combined with transform' }], isError: true }; + } + if (!clear && transform === undefined && responseContentType === undefined) { + return { content: [{ type: 'text', text: 'Error: provide transform, responseContentType, or clear=true' }], isError: true }; + } + + const candidate: SkillSpec = { + ...skill, + ...(clear ? { outputTransform: undefined } : {}), + ...(transform !== undefined ? { outputTransform: transform as SkillSpec['outputTransform'] } : {}), + ...(responseContentType !== undefined ? { responseContentType } : {}), + }; + const validation = validateImportableSkill(candidate); + if (!validation.valid) { + return { content: [{ type: 'text', text: `Error: ${validation.errors.join('; ')}` }], isError: true }; + } + + skillRepo.update(skill.id, { + ...(clear ? { outputTransform: null as unknown as SkillSpec['outputTransform'] } : {}), + ...(transform !== undefined ? { outputTransform: transform as SkillSpec['outputTransform'] } : {}), + ...(responseContentType !== undefined ? 
{ responseContentType } : {}), + }); + + const updated = skillRepo.getById(skill.id); + return { + content: [{ + type: 'text', + text: JSON.stringify({ + updated: true, + skillId: skill.id, + outputTransform: updated?.outputTransform, + responseContentType: updated?.responseContentType, + }, null, 2), + }], + }; + } + + case 'schrute_export_skill': { + const skillId = args?.skillId as string; + const format = args?.format as 'curl' | 'fetch.ts' | 'requests.py' | 'playwright.ts'; + if (!skillId) { + return { content: [{ type: 'text', text: 'Error: skillId is required' }], isError: true }; + } + if (!format || !['curl', 'fetch.ts', 'requests.py', 'playwright.ts'].includes(format)) { + return { content: [{ type: 'text', text: 'Error: format must be one of curl, fetch.ts, requests.py, playwright.ts' }], isError: true }; + } + const skill = skillRepo.getById(skillId); + if (!skill) { + return { content: [{ type: 'text', text: `Error: skill '${skillId}' not found` }], isError: true }; + } + const { generateExport } = await import('../skill/generator.js'); + const code = generateExport(skill, format, (args?.params ?? 
undefined) as Record<string, unknown> | undefined);
+      return {
+        content: [{
+          type: 'text',
+          text: JSON.stringify({ skillId, format, code }, null, 2),
+        }],
+      };
+    }
+
+    case 'schrute_create_workflow': {
+      const siteId = args?.siteId as string;
+      const name = args?.name as string;
+      const workflowSpec = args?.workflowSpec as SkillSpec['workflowSpec'];
+      const description = args?.description as string | undefined;
+      const outputTransform = args?.outputTransform as SkillSpec['outputTransform'];
+
+      if (!siteId || !name || !workflowSpec) {
+        return { content: [{ type: 'text', text: 'Error: siteId, name, and workflowSpec are required' }], isError: true };
+      }
+      if (!siteRepo.getById(siteId)) {
+        return { content: [{ type: 'text', text: `Error: site '${siteId}' not found` }], isError: true };
+      }
+
+      const { generateWorkflowSkill } = await import('../skill/generator.js');
+      const workflowSkill = generateWorkflowSkill(siteId, name, workflowSpec, {
+        description,
+        outputTransform,
+      });
+
+      if (skillRepo.getById(workflowSkill.id)) {
+        return { content: [{ type: 'text', text: `Error: skill '${workflowSkill.id}' already exists` }], isError: true };
+      }
+
+      const validation = validateImportableSkill(workflowSkill);
+      if (!validation.valid) {
+        return { content: [{ type: 'text', text: `Error: ${validation.errors.join('; ')}` }], isError: true };
+      }
+
+      skillRepo.create(workflowSkill);
+      return {
+        content: [{
+          type: 'text',
+          text: JSON.stringify({
+            created: true,
+            skillId: workflowSkill.id,
+            status: workflowSkill.status,
+            workflowSpec: workflowSkill.workflowSpec,
+          }, null, 2),
+        }],
+      };
+    }
+
     case 'schrute_confirm': {
       const confirmationToken = args?.confirmationToken as string;
       const approve = args?.approve as boolean;
@@ -759,7 +874,7 @@ export async function dispatchToolCall(

     case 'schrute_sessions': {
       const multiSession = engine.getMultiSessionManager();
-      const sessions = multiSession.list(callerId, config).map(s => ({
+      const sessions = multiSession.list(callerId, config, { 
includeInternal: false }).map(s => ({ name: s.name, siteId: s.siteId, isCdp: s.isCdp, @@ -1064,20 +1179,7 @@ export async function dispatchToolCall( if (actions.length > 50) { return { content: [{ type: 'text', text: 'Error: max 50 actions per batch' }], isError: true }; } - const results: Array<{ skillId: string; success: boolean; data?: unknown; error?: string }> = []; - for (const action of actions) { - const skill = skillRepo.getById(action.skillId); - if (!skill) { - results.push({ skillId: action.skillId, success: false, error: 'Skill not found' }); - continue; - } - try { - const r = await engine.executeSkill(skill.id, action.params ?? {}, callerId); - results.push({ skillId: action.skillId, success: r.success, data: r.data, error: r.error }); - } catch (err) { - results.push({ skillId: action.skillId, success: false, error: err instanceof Error ? err.message : String(err) }); - } - } + const results = await engine.executeBatch(actions, callerId); return { content: [{ type: 'text', text: JSON.stringify({ batch: true, count: results.length, results }, null, 2) }] }; } } diff --git a/src/server/tool-registry.ts b/src/server/tool-registry.ts index 993a704..035708f 100644 --- a/src/server/tool-registry.ts +++ b/src/server/tool-registry.ts @@ -8,6 +8,7 @@ export function rankToolsByIntent( skills: SkillSpec[], intent: string | undefined, k: number, + opts?: { preFiltered?: boolean }, ): SkillSpec[] { if (!intent) { return skills.slice(0, k); @@ -22,7 +23,8 @@ export function rankToolsByIntent( const queryPathWords = words.filter(w => !HTTP_METHODS.has(w)); const scored = skills.map((skill) => { - let score = 0; + let relevance = 0; // lexical matches only + let quality = 0; // non-textual boosts (tie-breakers) const nameLower = (skill.name ?? '').toLowerCase(); const descLower = (skill.description ?? 
'').toLowerCase(); const idLower = skill.id.toLowerCase(); @@ -32,45 +34,49 @@ export function rankToolsByIntent( const pathSegments = pathLower.split('/').filter(Boolean); for (const word of words) { - if (nameLower.includes(word)) score += 3; - if (descLower.includes(word)) score += 2; - if (idLower.includes(word)) score += 1; - if (pathLower.includes(word)) score += 3; - if (siteIdLower.includes(word)) score += 1; + if (nameLower.includes(word)) relevance += 3; + if (descLower.includes(word)) relevance += 2; + if (idLower.includes(word)) relevance += 1; + if (pathLower.includes(word)) relevance += 3; + if (siteIdLower.includes(word)) relevance += 1; // Exact method match gets +2, substring match gets +1 if (methodLower === word) { - score += 2; + relevance += 2; } else if (methodLower.includes(word)) { - score += 1; + relevance += 1; } // Whole-path-segment bonus - if (pathSegments.includes(word)) score += 2; + if (pathSegments.includes(word)) relevance += 2; } // Method+path combo bonus: if query has e.g. "GET users", boost skills matching both if (queryMethod && queryPathWords.length > 0 && methodLower === queryMethod) { const pathMatch = queryPathWords.some(pw => pathSegments.includes(pw)); - if (pathMatch) score += 3; + if (pathMatch) relevance += 3; } // Boost by success rate and recency - score += skill.successRate * 2; + quality += skill.successRate * 2; if (skill.lastUsed) { const ageHours = (Date.now() - skill.lastUsed) / (1000 * 60 * 60); - if (ageHours < 24) score += 1; + if (ageHours < 24) quality += 1; } // Boost direct-proven skills - if (skill.currentTier === 'tier_1') score += 1; + if (skill.currentTier === 'tier_1') quality += 1; // Boost lower latency const avgLatencyMs = 'avgLatencyMs' in skill ? 
(skill as unknown as Record<string, unknown>).avgLatencyMs : undefined;
-    if (typeof avgLatencyMs === 'number' && avgLatencyMs < 500) score += 1;
+    if (typeof avgLatencyMs === 'number' && avgLatencyMs < 500) quality += 1;

-    return { skill, score };
+    return { skill, relevance, quality, score: relevance + quality };
   });

-  scored.sort((a, b) => b.score - a.score);
-  return scored.slice(0, k).map((s) => s.skill);
+  // When intent is provided and skills are NOT pre-filtered (e.g., by FTS),
+  // require at least one lexical match. Pre-filtered skills (from FTS with
+  // porter stemming) are already relevance-validated.
+  const candidates = opts?.preFiltered ? scored : scored.filter(s => s.relevance > 0);
+  candidates.sort((a, b) => b.score - a.score);
+  return candidates.slice(0, k).map((s) => s.skill);
 }

 // ─── Parameter Key Sanitization ─────────────────────────────────
@@ -303,6 +309,74 @@ export const META_TOOLS = [
       required: ['skillId'],
     },
   },
+  {
+    name: 'schrute_set_transform',
+    description: 'Set or clear an output transform for a skill',
+    inputSchema: {
+      type: 'object' as const,
+      properties: {
+        skillId: { type: 'string', description: 'Skill ID to update' },
+        transform: {
+          type: 'object',
+          description: 'Output transform definition',
+          additionalProperties: true,
+        },
+        responseContentType: {
+          type: 'string',
+          description: 'Optional response content type override, e.g. 
text/html', + }, + clear: { + type: 'boolean', + description: 'Clear the current transform', + }, + }, + required: ['skillId'], + }, + }, + { + name: 'schrute_export_skill', + description: 'Export a skill as standalone curl, fetch, Python, or Playwright code', + inputSchema: { + type: 'object' as const, + properties: { + skillId: { type: 'string', description: 'Skill ID to export' }, + format: { + type: 'string', + enum: ['curl', 'fetch.ts', 'requests.py', 'playwright.ts'], + description: 'Output format', + }, + params: { + type: 'object', + description: 'Optional params used to resolve the request URL, headers, and body', + additionalProperties: true, + }, + }, + required: ['skillId', 'format'], + }, + }, + { + name: 'schrute_create_workflow', + description: 'Create a read-only linear workflow skill from existing active GET/HEAD skills', + inputSchema: { + type: 'object' as const, + properties: { + siteId: { type: 'string', description: 'Site ID that owns the workflow' }, + name: { type: 'string', description: 'Workflow name' }, + description: { type: 'string', description: 'Optional workflow description' }, + workflowSpec: { + type: 'object', + description: 'Workflow specification with ordered steps', + additionalProperties: true, + }, + outputTransform: { + type: 'object', + description: 'Optional transform applied to the final workflow output', + additionalProperties: true, + }, + }, + required: ['siteId', 'name', 'workflowSpec'], + }, + }, { name: 'schrute_confirm', description: 'Confirm or deny first-run of a newly-active skill', @@ -541,7 +615,7 @@ export const META_TOOLS = [ }, { name: 'schrute_capture_recent', - description: 'Capture recent network activity and generate skills from it (no pre-recording needed)', + description: 'Capture recent network activity and generate skills from it. Requires a CDP-connected session (schrute_connect_cdp). 
Does not work with schrute_explore sessions.', inputSchema: { type: 'object' as const, properties: { @@ -723,6 +797,25 @@ export function getBrowserToolDefinitions() { }; } + // Special-case: explicit schema for browser_fill_form + if (name === 'browser_fill_form') { + return { + name, + description: 'Fill multiple form fields at once. Keys are field labels, input name attributes, or @e refs from browser_snapshot.', + inputSchema: { + type: 'object' as const, + properties: { + values: { + type: 'object' as const, + description: 'Map of field identifiers to values. Keys: label text, input name, or @eN ref.', + additionalProperties: { type: 'string' as const }, + }, + }, + required: ['values'] as const, + }, + }; + } + + // Default: generic schema for all other browser tools return { name, diff --git a/src/shared/admin-auth.ts b/src/shared/admin-auth.ts index e009f7b..eaf26cc 100644 --- a/src/shared/admin-auth.ts +++ b/src/shared/admin-auth.ts @@ -14,5 +14,7 @@ import type { SchruteConfig } from '../skill/types.js'; export function isAdminCaller(callerId: string | undefined, config: SchruteConfig): boolean { if (!config.server.network) return true; // localhost-only: everyone is admin if (!callerId) return true; // no callerId = legacy/CLI = trusted - return callerId === 'stdio' || callerId === 'daemon'; + if (callerId === 'stdio' || callerId === 'daemon') return true; + if (config.server.mcpHttpAdmin && callerId.startsWith('mcp-http:')) return true; + return false; } diff --git a/src/shared/cloudflare-challenge.ts b/src/shared/cloudflare-challenge.ts new file mode 100644 index 0000000..3786955 --- /dev/null +++ b/src/shared/cloudflare-challenge.ts @@ -0,0 +1,44 @@ +export interface CloudflareChallengeSignals { + url?: string; + headers?: Record<string, string>; + content?: string; +} + +function getHeader(headers: Record<string, string> | undefined, name: string): string | undefined { + if (!headers) return undefined; + const target = name.toLowerCase(); + for (const [key, value] of
Object.entries(headers)) { + if (key.toLowerCase() === target) return value; + } + return undefined; +} + +export function isCloudflareChallengeSignal(signals: CloudflareChallengeSignals): boolean { + const url = signals.url ?? ''; + const content = signals.content ?? ''; + const cfMitigated = getHeader(signals.headers, 'cf-mitigated')?.toLowerCase(); + if (cfMitigated === 'challenge') { + return true; + } + + const location = getHeader(signals.headers, 'location') ?? ''; + const hasCdnCgiPath = /\/cdn-cgi\/challenge-platform|\/cdn-cgi\//i.test(url) + || /\/cdn-cgi\/challenge-platform|\/cdn-cgi\//i.test(location) + || /\/cdn-cgi\/challenge-platform/i.test(content); + if (hasCdnCgiPath) { + return true; + } + + if (/__cf_chl_/i.test(content)) { + return true; + } + + const hasGenericChallengeText = /Just a moment|Verify(?:ing)? you are human|Checking your browser/i.test(content); + if (!hasGenericChallengeText) { + return false; + } + + const server = getHeader(signals.headers, 'server') ?? 
''; + const hasCloudflareSupport = /cloudflare/i.test(server) || typeof getHeader(signals.headers, 'cf-ray') === 'string'; + return hasCloudflareSupport; +} diff --git a/src/skill/generator.ts b/src/skill/generator.ts index 2edd908..ac6e489 100644 --- a/src/skill/generator.ts +++ b/src/skill/generator.ts @@ -1,6 +1,7 @@ import { extractPathParams } from '../core/utils.js'; import { sanitizeParamKey } from '../server/tool-registry.js'; import { buildExecutionSchema } from '../replay/param-validator.js'; +import { buildRequest } from '../replay/request-builder.js'; import type { SkillSpec, SkillParameter, @@ -11,11 +12,14 @@ import type { TierStateName, CapabilityName, EvidenceReport, + WorkflowSpec, } from './types.js'; import { SkillStatus, TierState, Capability, + ExecutionTier, + SideEffectClass, } from './types.js'; import { classifySideEffect } from './side-effects.js'; import { scanSkill } from './security-scanner.js'; @@ -31,6 +35,7 @@ export interface ClusterInfo { description?: string; inputSchema: Record<string, unknown>; outputSchema?: Record<string, unknown>; + responseContentType?: string; requiredHeaders?: Record<string, string>; dynamicHeaders?: Record<string, string>; sampleCount: number; @@ -52,12 +57,14 @@ export function generateSkill( const version = 1; const id = buildSkillId(siteId, actionName, version, cluster.isGraphQL, cluster.graphqlOperationName); - const sideEffectClass = classifySideEffect( - cluster.method, - cluster.pathTemplate, - undefined, - cluster.requestBody, - ); + const sideEffectClass = cluster.responseContentType?.toLowerCase().includes('text/html') + ?
SideEffectClass.READ_ONLY + : classifySideEffect( + cluster.method, + cluster.pathTemplate, + undefined, + cluster.requestBody, + ); const parameters = buildParameters(paramEvidence); const allowedDomains = cluster.canonicalHost && cluster.canonicalHost !== siteId @@ -94,6 +101,7 @@ export function generateSkill( pathTemplate: cluster.pathTemplate, inputSchema: cluster.inputSchema, outputSchema: cluster.outputSchema, + responseContentType: cluster.responseContentType, authType: authRecipe?.type, requiredHeaders: cluster.requiredHeaders, dynamicHeaders: cluster.dynamicHeaders, @@ -130,6 +138,60 @@ export function generateSkill( return spec; } +export function generateWorkflowSkill( + siteId: string, + name: string, + workflowSpec: WorkflowSpec, + options?: { + description?: string; + outputTransform?: SkillSpec['outputTransform']; + }, +): SkillSpec { + const version = 1; + const id = buildSkillId(siteId, name, version); + const now = Date.now(); + const parameters = extractWorkflowInitialParameters(workflowSpec); + + return { + id, + version, + status: SkillStatus.ACTIVE as SkillStatusName, + currentTier: TierState.TIER_3_DEFAULT as TierStateName, + tierLock: null, + allowedDomains: [siteId], + requiredCapabilities: [], + parameters, + validation: { + semanticChecks: [], + customInvariants: [], + }, + redaction: { + piiClassesFound: [], + fieldsRedacted: 0, + }, + replayStrategy: 'prefer_tier_3', + sideEffectClass: SideEffectClass.READ_ONLY, + sampleCount: 0, + consecutiveValidations: 0, + confidence: 0.5, + method: 'GET', + pathTemplate: `/__workflow/${sanitizeWorkflowName(name)}`, + inputSchema: {}, + outputTransform: options?.outputTransform, + isComposite: true, + workflowSpec, + siteId, + name, + description: options?.description, + successRate: 0, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + createdAt: now, + updatedAt: now, + }; +} + // ─── SKILL.md Generation ──────────────────────────────────────── export 
function generateSkillMd(spec: SkillSpec): string { @@ -528,48 +590,335 @@ export function generateSkillTemplates(spec: SkillSpec): Map<string, string> { const templates = new Map<string, string>(); // request.json + const requestTemplate = buildRequestTemplate(spec); + const exportTemplateParams = buildExportTemplateParams(spec, requestTemplate); + templates.set('request.json', JSON.stringify(requestTemplate, null, 2)); + templates.set('curl.sh', generateExport(spec, 'curl', exportTemplateParams)); + templates.set('fetch.ts', generateExport(spec, 'fetch.ts', exportTemplateParams)); + templates.set('requests.py', generateExport(spec, 'requests.py', exportTemplateParams)); + templates.set('playwright.ts', generateExport(spec, 'playwright.ts', exportTemplateParams)); + + return templates; +} + +export type SkillExportFormat = 'curl' | 'fetch.ts' | 'requests.py' | 'playwright.ts'; + +export function generateExport( + skill: SkillSpec, + format: SkillExportFormat, + params?: Record<string, unknown>, +): string { + const request = buildRequest( + skill, + params ??
buildRequestTemplate(skill), + resolveExportTier(skill), + ); + const headers = augmentHeadersForExport(skill, request.headers); + + switch (format) { + case 'curl': + return renderCurlExport(skill, request.method, request.url, headers, request.body); + case 'fetch.ts': + return renderFetchExport(skill, request.method, request.url, headers, request.body); + case 'requests.py': + return renderPythonExport(skill, request.method, request.url, headers, request.body); + case 'playwright.ts': + return renderPlaywrightExport(skill, request.method, request.url, headers, request.body); + } +} + +function buildRequestTemplate(spec: SkillSpec): Record<string, unknown> { const requestTemplate: Record<string, unknown> = {}; - if (spec.inputSchema) { - const props = (spec.inputSchema as Record<string, unknown>).properties as Record<string, Record<string, unknown>> | undefined; - if (props) { - for (const [name, schema] of Object.entries(props)) { - const type = schema.type as string; - switch (type) { - case 'string': requestTemplate[name] = ''; break; - case 'number': requestTemplate[name] = 0; break; - case 'boolean': requestTemplate[name] = false; break; - case 'array': requestTemplate[name] = []; break; - case 'object': requestTemplate[name] = {}; break; - default: requestTemplate[name] = null; - } - } + if (!spec.inputSchema) { + return requestTemplate; + } + + const props = (spec.inputSchema as Record<string, unknown>).properties as Record<string, Record<string, unknown>> | undefined; + if (!props) { + return requestTemplate; + } + + for (const [name, schema] of Object.entries(props)) { + const type = schema.type as string; + switch (type) { + case 'string': + requestTemplate[name] = ''; + break; + case 'number': + case 'integer': + requestTemplate[name] = 0; + break; + case 'boolean': + requestTemplate[name] = false; + break; + case 'array': + requestTemplate[name] = []; + break; + case 'object': + requestTemplate[name] = {}; + break; + default: + requestTemplate[name] = null; } } - templates.set('request.json', JSON.stringify(requestTemplate, null, 2)); - // curl.sh - const curlLines: string[] =
['#!/bin/bash']; - const method = spec.method.toUpperCase(); - curlLines.push(`curl -X ${method} \\`); + return requestTemplate; +} + +function buildExportTemplateParams( + spec: SkillSpec, + requestTemplate: Record<string, unknown>, +): Record<string, unknown> { + const upperMethod = spec.method.toUpperCase(); + return upperMethod === 'GET' || upperMethod === 'HEAD' + ? {} + : requestTemplate; +} + +function resolveExportTier(skill: SkillSpec) { + return skill.currentTier === TierState.TIER_1_PROMOTED + ? ExecutionTier.DIRECT + : ExecutionTier.BROWSER_PROXIED; +} - if (spec.requiredHeaders) { - for (const [key, value] of Object.entries(spec.requiredHeaders)) { - curlLines.push(` -H '${key}: ${value}' \\`); +function augmentHeadersForExport(skill: SkillSpec, headers: Record<string, string>): Record<string, string> { + const merged = { ...headers }; + for (const key of Object.keys(merged)) { + const placeholder = getExportHeaderPlaceholder(key, skill.authType); + if (placeholder) { + merged[key] = placeholder; } } - if (spec.authType === 'bearer') { - curlLines.push(` -H 'Authorization: Bearer YOUR_TOKEN' \\`); + if ((skill.authType === 'bearer' || skill.authType === 'oauth2') && !hasHeader(merged, 'authorization')) { + merged.authorization = 'Bearer YOUR_TOKEN'; + } + if (skill.authType === 'cookie' && !hasHeader(merged, 'cookie')) { + merged.cookie = 'SESSION=YOUR_COOKIE'; + } + if (skill.authType === 'api_key' && !hasHeader(merged, 'x-api-key')) { + merged['x-api-key'] = 'YOUR_API_KEY'; + } + return merged; +} + +function hasHeader(headers: Record<string, string>, headerName: string): boolean { + const target = headerName.toLowerCase(); + return Object.keys(headers).some((key) => key.toLowerCase() === target); +} + +function getExportHeaderPlaceholder(headerName: string, authType?: SkillSpec['authType']): string | undefined { + const normalized = headerName.toLowerCase(); + if (normalized === 'authorization') { + return authType === 'bearer' || authType === 'oauth2' + ?
'Bearer YOUR_TOKEN' + : ''; } + if (normalized === 'cookie' || normalized === 'set-cookie') { + return 'SESSION=YOUR_COOKIE'; + } + if (normalized === 'x-api-key' || normalized === 'x-apikey' || normalized === 'api-key' || normalized === 'apikey') { + return 'YOUR_API_KEY'; + } + if ( + normalized.startsWith('x-auth') || + normalized.startsWith('x-session') || + normalized.startsWith('x-csrf') || + normalized.includes('token') || + normalized.includes('secret') || + normalized.includes('credential') + ) { + return ''; + } + return undefined; +} - if (['POST', 'PUT', 'PATCH'].includes(method)) { - curlLines.push(` -H 'Content-Type: application/json' \\`); - curlLines.push(` -d '${JSON.stringify(requestTemplate)}' \\`); +function renderCurlExport( + skill: SkillSpec, + method: string, + url: string, + headers: Record<string, string>, + body?: string, +): string { + const lines = ['#!/bin/bash', ...buildTransformComments(skill, '#')]; + lines.push(`curl -X ${method.toUpperCase()} \\`); + for (const [key, value] of Object.entries(toDisplayHeaders(headers))) { + lines.push(` -H ${quoteForShell(`${key}: ${value}`)} \\`); } + if (body !== undefined) { + lines.push(` -d ${quoteForShell(body)} \\`); + } + lines.push(` ${quoteForShell(url)}`); + return lines.join('\n'); +} - curlLines.push(` 'https://${spec.siteId}${spec.pathTemplate}'`); - templates.set('curl.sh', curlLines.join('\n')); +function renderFetchExport( + skill: SkillSpec, + method: string, + url: string, + headers: Record<string, string>, + body?: string, +): string { + const headerLiteral = JSON.stringify(toDisplayHeaders(headers), null, 2); + const lines = [ + ...buildTransformComments(skill, '//'), + 'const response = await fetch(', + ` ${JSON.stringify(url)},`, + ' {', + ` method: ${JSON.stringify(method.toUpperCase())},`, + ` headers: ${indentMultiline(headerLiteral, 4)},`, + ...(body !== undefined ?
[` body: ${JSON.stringify(body)},`] : []), + ' },', + ');', + '', + "const data = await response.text();", + 'console.log(data);', + ]; + return lines.join('\n'); +} - return templates; +function renderPythonExport( + skill: SkillSpec, + method: string, + url: string, + headers: Record<string, string>, + body?: string, +): string { + const lines = [ + ...buildTransformComments(skill, '#'), + 'import requests', + '', + `url = ${JSON.stringify(url)}`, + `headers = ${toPythonLiteral(toDisplayHeaders(headers))}`, + ...(body !== undefined ? [`data = ${JSON.stringify(body)}`] : []), + '', + `response = requests.request(${JSON.stringify(method.toUpperCase())}, url, headers=headers${body !== undefined ? ', data=data' : ''})`, + 'print(response.text)', + ]; + return lines.join('\n'); +} + +function renderPlaywrightExport( + skill: SkillSpec, + method: string, + url: string, + headers: Record<string, string>, + body?: string, +): string { + const displayHeaders = toDisplayHeaders(headers); + const lines: string[] = []; + if (needsPlaywrightWarning(skill)) { + lines.push('// Warning: this skill is marked browser_required; exported code uses Playwright request context.'); + } + lines.push(...buildTransformComments(skill, '//')); + lines.push("import { chromium } from 'playwright';"); + lines.push(''); + lines.push('const browser = await chromium.launch();'); + lines.push('const page = await browser.newPage();'); + lines.push('try {'); + lines.push(` const response = await page.request.fetch(${JSON.stringify(url)}, {`); + lines.push(` method: ${JSON.stringify(method.toUpperCase())},`); + lines.push(` headers: ${indentMultiline(JSON.stringify(displayHeaders, null, 2), 4)},`); + if (body !== undefined) { + lines.push(` data: ${JSON.stringify(body)},`); + } + lines.push(' });'); + lines.push(' console.log(await response.text());'); + lines.push('} finally {'); + lines.push(' await browser.close();'); + lines.push('}'); + return lines.join('\n'); +} + +function buildTransformComments(skill: SkillSpec, prefix:
string): string[] { + if (!skill.outputTransform) { + return []; + } + return [`${prefix} Transform: ${describeTransform(skill.outputTransform)}`, '']; +} + +function describeTransform(transform: NonNullable<SkillSpec['outputTransform']>): string { + switch (transform.type) { + case 'jsonpath': + return `jsonpath ${transform.expression}${transform.label ? ` -> ${transform.label}` : ''}`; + case 'regex': + return `regex /${transform.expression}/${transform.flags ?? ''}${transform.label ? ` -> ${transform.label}` : ''}`; + case 'css': + return `css ${transform.selector}${transform.label ? ` -> ${transform.label}` : ''}`; + } +} + +function needsPlaywrightWarning(skill: SkillSpec): boolean { + return skill.tierLock?.type === 'permanent' && skill.tierLock.reason === 'browser_required'; +} + +function quoteForShell(value: string): string { + return `'${value.replace(/'/g, `'\"'\"'`)}'`; +} + +function toDisplayHeaders(headers: Record<string, string>): Record<string, string> { + return Object.fromEntries( + Object.entries(headers).map(([key, value]) => [formatHeaderName(key), value]), + ); +} + +function formatHeaderName(headerName: string): string { + return headerName + .split('-') + .map((segment) => { + const lowerSegment = segment.toLowerCase(); + switch (lowerSegment) { + case 'dnt': + case 'etag': + case 'te': + case 'www': + return lowerSegment.toUpperCase(); + default: + return lowerSegment.charAt(0).toUpperCase() + lowerSegment.slice(1); + } + }) + .join('-'); +} + +function indentMultiline(value: string, indent: number): string { + const padding = ' '.repeat(indent); + return value.split('\n').join(`\n${padding}`); +} + +function toPythonLiteral(value: unknown): string { + return serializePythonLiteral(value, 0); +} + +function serializePythonLiteral(value: unknown, indent: number): string { + if (value === null || value === undefined) { + return 'None'; + } + if (typeof value === 'boolean') { + return value ? 'True' : 'False'; + } + if (typeof value === 'number') { + return Number.isFinite(value) ?
String(value) : 'None'; + } + if (typeof value === 'string') { + return JSON.stringify(value); + } + if (Array.isArray(value)) { + if (value.length === 0) { + return '[]'; + } + const innerIndent = ' '.repeat(indent + 2); + const closingIndent = ' '.repeat(indent); + return `[\n${value.map((item) => `${innerIndent}${serializePythonLiteral(item, indent + 2)}`).join(',\n')}\n${closingIndent}]`; + } + if (typeof value === 'object') { + const entries = Object.entries(value); + if (entries.length === 0) { + return '{}'; + } + const innerIndent = ' '.repeat(indent + 2); + const closingIndent = ' '.repeat(indent); + return `{\n${entries.map(([key, entryValue]) => `${innerIndent}${JSON.stringify(key)}: ${serializePythonLiteral(entryValue, indent + 2)}`).join(',\n')}\n${closingIndent}}`; + } + return JSON.stringify(String(value)); } function generateEvidenceReport( @@ -608,3 +957,32 @@ validationHistory, }; } + +function extractWorkflowInitialParameters(workflowSpec: WorkflowSpec): SkillSpec['parameters'] { + const names = new Set<string>(); + + for (const step of workflowSpec.steps) { + for (const ref of Object.values(step.paramMapping ??
{})) { + const match = ref.match(/^\$initial\.([A-Za-z0-9_.-]+)/); + if (match) { + names.add(match[1]); + } + } + } + + return [...names].sort().map((name) => ({ + name, + type: 'any', + source: 'user_input' as const, + evidence: ['workflow.initial'], + required: true, + })); +} + +function sanitizeWorkflowName(name: string): string { + return name + .trim() + .toLowerCase() + .replace(/[^a-z0-9]+/g, '-') + .replace(/^-+|-+$/g, '') || 'workflow'; +} diff --git a/src/skill/types.ts b/src/skill/types.ts index d0e20a8..ddf83c1 100644 --- a/src/skill/types.ts +++ b/src/skill/types.ts @@ -38,6 +38,7 @@ export const FailureCause = { SCHEMA_DRIFT: 'schema_drift', AUTH_EXPIRED: 'auth_expired', COOKIE_REFRESH: 'cookie_refresh', + CLOUDFLARE_CHALLENGE: 'cloudflare_challenge', FETCH_ERROR: 'fetch_error', UNKNOWN: 'unknown', } as const; @@ -55,6 +56,7 @@ const FAILURE_CAUSE_PRECEDENCE: FailureCauseName[] = [ FailureCause.SCHEMA_DRIFT, FailureCause.AUTH_EXPIRED, FailureCause.COOKIE_REFRESH, + FailureCause.CLOUDFLARE_CHALLENGE, FailureCause.FETCH_ERROR, FailureCause.UNKNOWN, ]; @@ -63,6 +65,7 @@ export const INFRA_FAILURE_CAUSES = new Set([ FailureCause.POLICY_DENIED, FailureCause.RATE_LIMITED, FailureCause.BUDGET_DENIED, + FailureCause.CLOUDFLARE_CHALLENGE, FailureCause.FETCH_ERROR, ]); @@ -76,7 +79,7 @@ export type TierStateName = (typeof TierState)[keyof typeof TierState]; export interface PermanentTierLock { type: 'permanent'; - reason: 'js_computed_field' | 'protocol_sensitivity' | 'signed_payload' | 'webmcp_requires_browser'; + reason: 'js_computed_field' | 'protocol_sensitivity' | 'signed_payload' | 'webmcp_requires_browser' | 'browser_required'; evidence: string; } @@ -141,6 +144,7 @@ export const RequestClassification = { NOISE: 'noise', SIGNAL: 'signal', AMBIGUOUS: 'ambiguous', + HTML_DOCUMENT: 'html_document', } as const; export type RequestClassificationName = (typeof RequestClassification)[keyof typeof RequestClassification]; @@ -218,6 +222,7 @@ export interface 
BrowserProvider { networkRequests(): Promise; evaluateModelContext?(req: SealedModelContextRequest): Promise; listModelContextTools?(): Promise; + detectChallengePage?(): Promise<boolean>; getCurrentUrl(): string; } @@ -326,6 +331,30 @@ export interface RequestChain { canReplayWithCookiesOnly: boolean; } +export type OutputTransform = + | { type: 'jsonpath'; expression: string; label?: string } + | { type: 'regex'; expression: string; flags?: string; label?: string } + | { + type: 'css'; + selector: string; + mode?: 'text' | 'html' | 'attr' | 'list'; + attr?: string; + fields?: Record<string, { selector: string; mode?: 'text' | 'attr' | 'html'; attr?: string }>; + label?: string; + }; + +export interface WorkflowStep { + skillId: string; + name?: string; + paramMapping?: Record<string, string>; + transform?: OutputTransform; + cache?: { ttlMs: number }; +} + +export interface WorkflowSpec { + steps: WorkflowStep[]; +} + // ─── Auth Recipe ─────────────────────────────────────────────────── export type AuthType = 'bearer' | 'cookie' | 'api_key' | 'oauth2'; export type RefreshTrigger = '401' | '403' | 'redirect_to_login' | 'token_expired_field'; @@ -399,11 +428,14 @@ export interface SkillSpec { pathTemplate: string; inputSchema: Record<string, unknown>; // JSON Schema outputSchema?: Record<string, unknown>; + outputTransform?: OutputTransform; + responseContentType?: string; authType?: AuthType; requiredHeaders?: Record<string, string>; dynamicHeaders?: Record<string, string>; isComposite: boolean; chainSpec?: RequestChain; + workflowSpec?: WorkflowSpec; parameterEvidence?: ParameterEvidence[]; // Metadata @@ -457,11 +489,13 @@ export interface SitePolicy { allowedMethods: HttpMethod[]; maxQps: number; maxConcurrent: number; + minGapMs?: number; readOnlyDefault: boolean; requireConfirmation: string[]; domainAllowlist: string[]; redactionRules: string[]; capabilities: CapabilityName[]; + browserRequired?: boolean; // sticky gate for challenge-protected sites executionBackend?: 'playwright' | 'agent-browser' | 'live-chrome'; // override global default for this site executionSessionName?: string; // for hard-site shared Playwright }
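The new `OutputTransform` and `WorkflowStep` types in the hunk above are easiest to read with a concrete value. The sketch below is illustrative only: the type shapes mirror this diff, but `applyRegexTransform` is a hypothetical evaluator (the diff does not show how transforms are actually executed), and the skill ID is made up. The `$initial.query` reference uses the form that `extractWorkflowInitialParameters` scans for when deriving a workflow's user-facing parameters.

```typescript
// Minimal sketch. The type shapes mirror this diff; applyRegexTransform
// is a hypothetical evaluator, not Schrute code.
type OutputTransform =
  | { type: 'jsonpath'; expression: string; label?: string }
  | { type: 'regex'; expression: string; flags?: string; label?: string };

interface WorkflowStep {
  skillId: string;
  // Values like "$initial.query" are what extractWorkflowInitialParameters
  // matches when it derives the workflow's initial parameters.
  paramMapping?: Record<string, string>;
  transform?: OutputTransform;
}

function applyRegexTransform(
  t: Extract<OutputTransform, { type: 'regex' }>,
  raw: string,
): string[] {
  // Force the global flag so every match is collected.
  const flags = (t.flags ?? '').includes('g') ? (t.flags ?? '') : `${t.flags ?? ''}g`;
  const re = new RegExp(t.expression, flags);
  // Prefer the first capture group when present, else the whole match.
  return [...raw.matchAll(re)].map((m) => m[1] ?? m[0]);
}

const transform: Extract<OutputTransform, { type: 'regex' }> = {
  type: 'regex',
  expression: 'id="(\\d+)"',
  label: 'ids',
};

const step: WorkflowStep = {
  skillId: 'example.com:list-items:v1', // hypothetical skill id
  paramMapping: { query: '$initial.query' },
  transform,
};

const ids = applyRegexTransform(transform, '<li id="12"></li><li id="34"></li>');
console.log(ids); // → ['12', '34']
```

Note how the regex variant prefers capture group 1, which is what makes `label`-style extraction of a single field out of raw HTML practical.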
@@ -649,6 +683,7 @@ export interface SchruteConfig { network: boolean; // default: false (v0.2+) authToken?: string; // Required when network=true httpPort?: number; // REST server port (default 3000, MCP HTTP = httpPort + 1) + mcpHttpAdmin?: boolean; // default: false — grant admin to authenticated MCP HTTP clients }; daemon: { port: number; diff --git a/src/storage/database.ts b/src/storage/database.ts index a9de3cd..16f09ee 100644 --- a/src/storage/database.ts +++ b/src/storage/database.ts @@ -299,6 +299,31 @@ ALTER TABLE sites ADD COLUMN lighthouse_accessibility REAL; filename: '012_sample_params.sql', sql: ` ALTER TABLE skills ADD COLUMN sample_params TEXT; +`, + }, + { + filename: '013_browser_required_policy.sql', + sql: ` +ALTER TABLE policies ADD COLUMN browser_required INTEGER NOT NULL DEFAULT 0; +`, + }, + { + filename: '014_output_transforms.sql', + sql: ` +ALTER TABLE skills ADD COLUMN output_transform TEXT; +ALTER TABLE skills ADD COLUMN response_content_type TEXT; +`, + }, + { + filename: '015_workflow_spec.sql', + sql: ` +ALTER TABLE skills ADD COLUMN workflow_spec TEXT; +`, + }, + { + filename: '016_site_policy_min_gap.sql', + sql: ` +ALTER TABLE policies ADD COLUMN min_gap_ms INTEGER NOT NULL DEFAULT 100; `, }, ]; diff --git a/src/storage/import-validator.ts b/src/storage/import-validator.ts index 7515f89..5cbab8e 100644 --- a/src/storage/import-validator.ts +++ b/src/storage/import-validator.ts @@ -7,12 +7,15 @@ */ import { + Capability, SkillStatus, TierState, SideEffectClass, MasteryLevel, ExecutionTier, + V01_DEFAULT_CAPABILITIES, } from '../skill/types.js'; +import type { CapabilityName, HttpMethod, SitePolicy } from '../skill/types.js'; // ─── Cached value sets ────────────────────────────────────────────── const VALID_SKILL_STATUSES: Set = new Set(Object.values(SkillStatus)); @@ -22,6 +25,20 @@ const VALID_AUTH_TYPES: Set = new Set(['bearer', 'cookie', 'api_key', 'o const VALID_REPLAY_STRATEGIES: Set = new Set(['prefer_tier_1', 
'prefer_tier_3', 'tier_3_only']); const VALID_MASTERY_LEVELS: Set = new Set(Object.values(MasteryLevel)); const VALID_EXECUTION_TIERS: Set = new Set(Object.values(ExecutionTier)); +const VALID_HTTP_METHODS: Set = new Set(['GET', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE', 'OPTIONS']); +const VALID_CAPABILITIES: Set = new Set(Object.values(Capability)); +const POLICY_DEFAULTS: Omit = { + allowedMethods: ['GET', 'HEAD'], + maxQps: 10, + maxConcurrent: 3, + minGapMs: 100, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [...V01_DEFAULT_CAPABILITIES], + browserRequired: false, +}; // ─── Helpers ──────────────────────────────────────────────────────── @@ -33,6 +50,111 @@ function isRecord(v: unknown): v is Record { return typeof v === 'object' && v !== null && !Array.isArray(v); } +function validateOutputTransform(value: unknown, fieldName: string, errors: string[]): void { + if (!isRecord(value)) { + errors.push(`${fieldName} must be an object`); + return; + } + + if (typeof value.type !== 'string') { + errors.push(`${fieldName}.type must be a string`); + return; + } + + switch (value.type) { + case 'jsonpath': + if (typeof value.expression !== 'string') { + errors.push(`${fieldName}.expression must be a string`); + } + break; + case 'regex': + if (typeof value.expression !== 'string') { + errors.push(`${fieldName}.expression must be a string`); + } + if (value.flags !== undefined && typeof value.flags !== 'string') { + errors.push(`${fieldName}.flags must be a string`); + } + break; + case 'css': + if (typeof value.selector !== 'string') { + errors.push(`${fieldName}.selector must be a string`); + } + if (value.mode !== undefined && !['text', 'html', 'attr', 'list'].includes(String(value.mode))) { + errors.push(`${fieldName}.mode must be one of: text, html, attr, list`); + } + if (value.attr !== undefined && typeof value.attr !== 'string') { + errors.push(`${fieldName}.attr must be a string`); + } + if 
(value.fields !== undefined) { + if (!isRecord(value.fields)) { + errors.push(`${fieldName}.fields must be an object`); + } else { + for (const [key, field] of Object.entries(value.fields)) { + if (!isRecord(field)) { + errors.push(`${fieldName}.fields.${key} must be an object`); + continue; + } + if (typeof field.selector !== 'string') { + errors.push(`${fieldName}.fields.${key}.selector must be a string`); + } + if (field.mode !== undefined && !['text', 'attr', 'html'].includes(String(field.mode))) { + errors.push(`${fieldName}.fields.${key}.mode must be one of: text, attr, html`); + } + if (field.attr !== undefined && typeof field.attr !== 'string') { + errors.push(`${fieldName}.fields.${key}.attr must be a string`); + } + } + } + } + break; + default: + errors.push(`${fieldName}.type must be one of: jsonpath, regex, css`); + } +} + +function validateWorkflowSpec(value: unknown, errors: string[]): void { + if (!isRecord(value)) { + errors.push('workflowSpec must be an object'); + return; + } + if (!Array.isArray(value.steps)) { + errors.push('workflowSpec: missing required field "steps" (array)'); + return; + } + for (let index = 0; index < value.steps.length; index++) { + const step = value.steps[index]; + if (!isRecord(step)) { + errors.push(`workflowSpec.steps[${index}] must be an object`); + continue; + } + if (typeof step.skillId !== 'string') { + errors.push(`workflowSpec.steps[${index}].skillId must be a string`); + } + if (step.name !== undefined && typeof step.name !== 'string') { + errors.push(`workflowSpec.steps[${index}].name must be a string`); + } + if (step.paramMapping !== undefined) { + if (!isRecord(step.paramMapping)) { + errors.push(`workflowSpec.steps[${index}].paramMapping must be an object`); + } else { + for (const [param, source] of Object.entries(step.paramMapping)) { + if (typeof source !== 'string') { + errors.push(`workflowSpec.steps[${index}].paramMapping.${param} must be a string`); + } + } + } + } + if (step.transform !== 
undefined) { + validateOutputTransform(step.transform, `workflowSpec.steps[${index}].transform`, errors); + } + if (step.cache !== undefined) { + if (!isRecord(step.cache) || typeof step.cache.ttlMs !== 'number') { + errors.push(`workflowSpec.steps[${index}].cache.ttlMs must be a number`); + } + } + } +} + // ─── Skill validator ──────────────────────────────────────────────── export function validateImportableSkill(skill: unknown): { valid: boolean; errors: string[] } { @@ -170,6 +292,18 @@ export function validateImportableSkill(skill: unknown): { valid: boolean; error } } + if (skill.outputTransform !== undefined && skill.outputTransform !== null) { + validateOutputTransform(skill.outputTransform, 'outputTransform', errors); + } + + if (skill.responseContentType !== undefined && skill.responseContentType !== null && typeof skill.responseContentType !== 'string') { + errors.push('responseContentType must be a string'); + } + + if (skill.workflowSpec !== undefined && skill.workflowSpec !== null) { + validateWorkflowSpec(skill.workflowSpec, errors); + } + return { valid: errors.length === 0, errors }; } @@ -225,3 +359,176 @@ export function validateImportableSite(site: unknown): { valid: boolean; errors: return { valid: errors.length === 0, errors }; } + +const VALID_EXECUTION_BACKENDS: Set = new Set(['playwright', 'agent-browser', 'live-chrome']); + +function validateStringArrayField( + value: unknown, + fieldName: string, + errors: string[], +): string[] | undefined { + if (value === undefined || value === null) return undefined; + if (!Array.isArray(value)) { + errors.push(`${fieldName} must be an array`); + return undefined; + } + + const normalized: string[] = []; + for (let i = 0; i < value.length; i++) { + if (typeof value[i] !== 'string') { + errors.push(`${fieldName}[${i}] must be a string`); + return undefined; + } + normalized.push(value[i]); + } + return normalized; +} + +export function validateAndNormalizeImportablePolicy( + policy: unknown, + siteId: 
string, +): { valid: boolean; errors: string[]; value?: SitePolicy } { + const errors: string[] = []; + + if (!isRecord(policy)) { + return { valid: false, errors: ['policy is not an object'] }; + } + + let allowedMethods: HttpMethod[] | undefined; + if (policy.allowedMethods !== undefined && policy.allowedMethods !== null) { + if (!Array.isArray(policy.allowedMethods)) { + errors.push('allowedMethods must be an array'); + } else { + const normalizedMethods: HttpMethod[] = []; + for (let i = 0; i < policy.allowedMethods.length; i++) { + const method = policy.allowedMethods[i]; + if (typeof method !== 'string') { + errors.push(`allowedMethods[${i}] must be a string`); + break; + } + if (!VALID_HTTP_METHODS.has(method)) { + errors.push(`allowedMethods[${i}] has invalid HTTP method "${method}"`); + break; + } + normalizedMethods.push(method as HttpMethod); + } + if (errors.length === 0 || normalizedMethods.length === policy.allowedMethods.length) { + allowedMethods = normalizedMethods; + } + } + } + + let maxQps: number | undefined; + if (policy.maxQps !== undefined && policy.maxQps !== null) { + if (typeof policy.maxQps !== 'number' || !Number.isFinite(policy.maxQps)) { + errors.push('maxQps must be a finite number when provided'); + } else { + maxQps = policy.maxQps; + } + } + + let maxConcurrent: number | undefined; + if (policy.maxConcurrent !== undefined && policy.maxConcurrent !== null) { + if (typeof policy.maxConcurrent !== 'number' || !Number.isFinite(policy.maxConcurrent)) { + errors.push('maxConcurrent must be a finite number when provided'); + } else { + maxConcurrent = policy.maxConcurrent; + } + } + + let minGapMs: number | undefined; + if (policy.minGapMs !== undefined && policy.minGapMs !== null) { + if (typeof policy.minGapMs !== 'number' || !Number.isFinite(policy.minGapMs) || policy.minGapMs < 0) { + errors.push('minGapMs must be a finite number >= 0 when provided'); + } else { + minGapMs = policy.minGapMs; + } + } + + let readOnlyDefault: boolean | 
undefined; + if (policy.readOnlyDefault !== undefined && policy.readOnlyDefault !== null) { + if (typeof policy.readOnlyDefault !== 'boolean') { + errors.push('readOnlyDefault must be a boolean when provided'); + } else { + readOnlyDefault = policy.readOnlyDefault; + } + } + + const requireConfirmation = validateStringArrayField(policy.requireConfirmation, 'requireConfirmation', errors); + const domainAllowlist = validateStringArrayField(policy.domainAllowlist, 'domainAllowlist', errors); + const redactionRules = validateStringArrayField(policy.redactionRules, 'redactionRules', errors); + + let capabilities: CapabilityName[] | undefined; + if (policy.capabilities !== undefined && policy.capabilities !== null) { + if (!Array.isArray(policy.capabilities)) { + errors.push('capabilities must be an array'); + } else { + const normalizedCapabilities: CapabilityName[] = []; + for (let i = 0; i < policy.capabilities.length; i++) { + const capability = policy.capabilities[i]; + if (typeof capability !== 'string') { + errors.push(`capabilities[${i}] must be a string`); + break; + } + if (!VALID_CAPABILITIES.has(capability)) { + errors.push(`capabilities[${i}] has invalid capability "${capability}"`); + break; + } + normalizedCapabilities.push(capability as CapabilityName); + } + if (errors.length === 0 || normalizedCapabilities.length === policy.capabilities.length) { + capabilities = normalizedCapabilities; + } + } + } + + if (policy.browserRequired !== undefined && typeof policy.browserRequired !== 'boolean') { + errors.push('browserRequired must be a boolean when provided'); + } + + if (policy.executionBackend !== undefined && policy.executionBackend !== null) { + if (typeof policy.executionBackend !== 'string' || !VALID_EXECUTION_BACKENDS.has(policy.executionBackend)) { + errors.push( + `invalid executionBackend "${String(policy.executionBackend)}". 
Expected one of: ${[...VALID_EXECUTION_BACKENDS].join(', ')}`, + ); + } + } + + if (policy.executionSessionName !== undefined && policy.executionSessionName !== null) { + if (typeof policy.executionSessionName !== 'string') { + errors.push('executionSessionName must be a string when provided'); + } else if (policy.executionBackend !== 'playwright' && policy.executionBackend !== 'live-chrome') { + errors.push(`executionSessionName requires executionBackend='playwright' or 'live-chrome'`); + } + } + + if (errors.length > 0) { + return { valid: false, errors }; + } + + const normalized: SitePolicy = { + siteId, + allowedMethods: allowedMethods ?? [...POLICY_DEFAULTS.allowedMethods], + maxQps: maxQps ?? POLICY_DEFAULTS.maxQps, + maxConcurrent: maxConcurrent ?? POLICY_DEFAULTS.maxConcurrent, + minGapMs: minGapMs ?? POLICY_DEFAULTS.minGapMs, + readOnlyDefault: readOnlyDefault ?? POLICY_DEFAULTS.readOnlyDefault, + requireConfirmation: requireConfirmation ?? [...POLICY_DEFAULTS.requireConfirmation], + domainAllowlist: domainAllowlist ?? [...POLICY_DEFAULTS.domainAllowlist], + redactionRules: redactionRules ?? [...POLICY_DEFAULTS.redactionRules], + capabilities: capabilities ?? [...POLICY_DEFAULTS.capabilities], + browserRequired: policy.browserRequired === true, + ...(typeof policy.executionBackend === 'string' + ? { executionBackend: policy.executionBackend as SitePolicy['executionBackend'] } + : {}), + ...(typeof policy.executionSessionName === 'string' + ? 
{ executionSessionName: policy.executionSessionName } + : {}), + }; + + return { + valid: true, + errors: [], + value: normalized, + }; +} diff --git a/src/storage/migrations/001_initial.sql b/src/storage/migrations/001_initial.sql index 42a0cbe..5299757 100644 --- a/src/storage/migrations/001_initial.sql +++ b/src/storage/migrations/001_initial.sql @@ -123,6 +123,7 @@ CREATE TABLE IF NOT EXISTS policies ( allowed_methods TEXT NOT NULL DEFAULT '["GET","HEAD","POST:read-only"]', max_qps REAL NOT NULL DEFAULT 1.0, max_concurrent INTEGER NOT NULL DEFAULT 1, + min_gap_ms INTEGER NOT NULL DEFAULT 100, read_only_default INTEGER NOT NULL DEFAULT 1, require_confirmation TEXT NOT NULL DEFAULT '[]', domain_allowlist TEXT, -- JSON diff --git a/src/storage/migrations/013_browser_required_policy.sql b/src/storage/migrations/013_browser_required_policy.sql new file mode 100644 index 0000000..49778dd --- /dev/null +++ b/src/storage/migrations/013_browser_required_policy.sql @@ -0,0 +1 @@ +ALTER TABLE policies ADD COLUMN browser_required INTEGER NOT NULL DEFAULT 0; diff --git a/src/storage/migrations/014_output_transforms.sql b/src/storage/migrations/014_output_transforms.sql new file mode 100644 index 0000000..9fc41dc --- /dev/null +++ b/src/storage/migrations/014_output_transforms.sql @@ -0,0 +1,2 @@ +ALTER TABLE skills ADD COLUMN output_transform TEXT; +ALTER TABLE skills ADD COLUMN response_content_type TEXT; diff --git a/src/storage/migrations/015_workflow_spec.sql b/src/storage/migrations/015_workflow_spec.sql new file mode 100644 index 0000000..4be6674 --- /dev/null +++ b/src/storage/migrations/015_workflow_spec.sql @@ -0,0 +1 @@ +ALTER TABLE skills ADD COLUMN workflow_spec TEXT; diff --git a/src/storage/migrations/016_site_policy_min_gap.sql b/src/storage/migrations/016_site_policy_min_gap.sql new file mode 100644 index 0000000..308a3be --- /dev/null +++ b/src/storage/migrations/016_site_policy_min_gap.sql @@ -0,0 +1 @@ +ALTER TABLE policies ADD COLUMN min_gap_ms INTEGER 
NOT NULL DEFAULT 100;
diff --git a/src/storage/skill-repository.ts b/src/storage/skill-repository.ts
index c6ad17a..a33929c 100644
--- a/src/storage/skill-repository.ts
+++ b/src/storage/skill-repository.ts
@@ -8,12 +8,14 @@ import type {
   SideEffectClassName,
   AuthType,
   RequestChain,
+  OutputTransform,
   ParameterEvidence,
   CapabilityName,
   SkillParameter,
   SkillValidation,
   SkillRedactionInfo,
   ReplayStrategy,
+  WorkflowSpec,
 } from '../skill/types.js';
 import {
   SkillStatus,
@@ -108,6 +110,79 @@ function assertRequestChainShape(value: unknown): RequestChain {
   return value as RequestChain;
 }
 
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return typeof value === 'object' && value !== null && !Array.isArray(value);
+}
+
+function assertOutputTransformShape(value: unknown): OutputTransform {
+  if (!isRecord(value) || typeof value.type !== 'string') {
+    throw new Error('Invalid OutputTransform shape: expected object with string "type"');
+  }
+  if (value.type === 'jsonpath') {
+    if (typeof value.expression !== 'string') {
+      throw new Error('Invalid OutputTransform.jsonpath: missing "expression"');
+    }
+    return value as OutputTransform;
+  }
+  if (value.type === 'regex') {
+    if (typeof value.expression !== 'string') {
+      throw new Error('Invalid OutputTransform.regex: missing "expression"');
+    }
+    if (value.flags !== undefined && typeof value.flags !== 'string') {
+      throw new Error('Invalid OutputTransform.regex: "flags" must be a string');
+    }
+    return value as OutputTransform;
+  }
+  if (value.type === 'css') {
+    if (typeof value.selector !== 'string') {
+      throw new Error('Invalid OutputTransform.css: missing "selector"');
+    }
+    if (value.fields !== undefined) {
+      if (!isRecord(value.fields)) {
+        throw new Error('Invalid OutputTransform.css: "fields" must be an object');
+      }
+      for (const [key, field] of Object.entries(value.fields)) {
+        if (!isRecord(field) || typeof field.selector !== 'string') {
+          throw new Error(`Invalid OutputTransform.css field "${key}": missing
"selector"`); + } + } + } + return value as OutputTransform; + } + throw new Error(`Invalid OutputTransform shape: unknown type "${String(value.type)}"`); +} + +function assertWorkflowSpecShape(value: unknown): WorkflowSpec { + if (!isRecord(value) || !Array.isArray(value.steps)) { + throw new Error('Invalid WorkflowSpec shape: missing "steps" array'); + } + for (let index = 0; index < value.steps.length; index++) { + const step = value.steps[index]; + if (!isRecord(step) || typeof step.skillId !== 'string') { + throw new Error(`Invalid WorkflowSpec step at index ${index}: missing "skillId"`); + } + if (step.paramMapping !== undefined) { + if (!isRecord(step.paramMapping)) { + throw new Error(`Invalid WorkflowSpec step at index ${index}: "paramMapping" must be an object`); + } + for (const [param, source] of Object.entries(step.paramMapping)) { + if (typeof source !== 'string') { + throw new Error(`Invalid WorkflowSpec step at index ${index}: paramMapping["${param}"] must be a string`); + } + } + } + if (step.transform !== undefined) { + assertOutputTransformShape(step.transform); + } + if (step.cache !== undefined) { + if (!isRecord(step.cache) || typeof step.cache.ttlMs !== 'number') { + throw new Error(`Invalid WorkflowSpec step at index ${index}: cache.ttlMs must be a number`); + } + } + } + return value as unknown as WorkflowSpec; +} + interface SkillRow { id: string; site_id: string; @@ -119,12 +194,15 @@ interface SkillRow { path_template: string; input_schema: string | null; output_schema: string | null; + output_transform: string | null; + response_content_type: string | null; auth_type: string | null; required_headers: string | null; dynamic_headers: string | null; side_effect_class: string; is_composite: number; chain_spec: string | null; + workflow_spec: string | null; current_tier: string; tier_lock: string | null; confidence: number; @@ -182,12 +260,19 @@ function rowToSkill(row: SkillRow): SkillSpec { pathTemplate: row.path_template, inputSchema: 
parseJson<Record<string, unknown>>(row.input_schema, {}),
     outputSchema: row.output_schema ? parseJson<Record<string, unknown>>(row.output_schema, {}) : undefined,
+    outputTransform: row.output_transform
+      ? parseJson(row.output_transform, undefined as unknown as OutputTransform, assertOutputTransformShape)
+      : undefined,
+    responseContentType: row.response_content_type ?? undefined,
     authType: row.auth_type ? validateAuthType(row.auth_type) : undefined,
     requiredHeaders: row.required_headers ? parseJson<Record<string, string>>(row.required_headers, {}) : undefined,
     dynamicHeaders: row.dynamic_headers ? parseJson<Record<string, string>>(row.dynamic_headers, {}) : undefined,
     sideEffectClass: validateSideEffectClass(row.side_effect_class),
     isComposite: row.is_composite === 1,
     chainSpec: row.chain_spec ? parseJson(row.chain_spec, undefined as unknown as RequestChain, assertRequestChainShape) : undefined,
+    workflowSpec: row.workflow_spec
+      ? parseJson(row.workflow_spec, undefined as unknown as WorkflowSpec, assertWorkflowSpecShape)
+      : undefined,
     currentTier: validateTierState(row.current_tier),
     tierLock: parseJson(row.tier_lock, null, assertTierLockShape),
     confidence: row.confidence,
@@ -230,8 +315,9 @@ export class SkillRepository {
     this.db.run(
       `INSERT INTO skills (
         id, site_id, name, version, status, description, method, path_template,
-        input_schema, output_schema, auth_type, required_headers, dynamic_headers,
-        side_effect_class, is_composite, chain_spec, current_tier, tier_lock,
+        input_schema, output_schema, output_transform, response_content_type,
+        auth_type, required_headers, dynamic_headers,
+        side_effect_class, is_composite, chain_spec, workflow_spec, current_tier, tier_lock,
         confidence, consecutive_validations, sample_count, parameter_evidence,
         last_verified, last_used, success_rate, skill_md, openapi_fragment, created_at, updated_at,
@@ -239,7 +325,7 @@
         avg_latency_ms, last_successful_tier, direct_canary_eligible, direct_canary_attempts,
         validations_since_last_canary, last_canary_error_type, review_required,
sample_params - ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, skill.id, skill.siteId, skill.name, @@ -250,12 +336,15 @@ export class SkillRepository { skill.pathTemplate, JSON.stringify(skill.inputSchema), skill.outputSchema ? JSON.stringify(skill.outputSchema) : null, + skill.outputTransform ? JSON.stringify(skill.outputTransform) : null, + skill.responseContentType ?? null, skill.authType ?? null, skill.requiredHeaders ? JSON.stringify(skill.requiredHeaders) : null, skill.dynamicHeaders ? JSON.stringify(skill.dynamicHeaders) : null, skill.sideEffectClass, skill.isComposite ? 1 : 0, skill.chainSpec ? JSON.stringify(skill.chainSpec) : null, + skill.workflowSpec ? JSON.stringify(skill.workflowSpec) : null, skill.currentTier, skill.tierLock ? JSON.stringify(skill.tierLock) : null, skill.confidence, @@ -332,7 +421,7 @@ export class SkillRepository { const directColumns: Array<[keyof typeof updates, string]> = [ ['name', 'name'], ['version', 'version'], ['status', 'status'], ['description', 'description'], ['method', 'method'], - ['pathTemplate', 'path_template'], ['authType', 'auth_type'], + ['pathTemplate', 'path_template'], ['responseContentType', 'response_content_type'], ['authType', 'auth_type'], ['sideEffectClass', 'side_effect_class'], ['currentTier', 'current_tier'], ['confidence', 'confidence'], ['consecutiveValidations', 'consecutive_validations'], ['sampleCount', 'sample_count'], ['lastVerified', 'last_verified'], @@ -347,8 +436,9 @@ export class SkillRepository { // JSON-serialized columns const jsonColumns: Array<[keyof typeof updates, string]> = [ ['inputSchema', 'input_schema'], ['outputSchema', 'output_schema'], + ['outputTransform', 'output_transform'], ['requiredHeaders', 'required_headers'], 
['dynamicHeaders', 'dynamic_headers'], - ['chainSpec', 'chain_spec'], ['parameterEvidence', 'parameter_evidence'], + ['chainSpec', 'chain_spec'], ['workflowSpec', 'workflow_spec'], ['parameterEvidence', 'parameter_evidence'], ['allowedDomains', 'allowed_domains'], ['requiredCapabilities', 'required_capabilities'], ['parameters', 'parameters'], ['validation', 'validation'], ['redaction', 'redaction'], ['sampleParams', 'sample_params'], @@ -364,7 +454,7 @@ export class SkillRepository { for (const [prop, col] of jsonColumns) { if (updates[prop] !== undefined) { fields.push(`${col} = ?`); - values.push(JSON.stringify(updates[prop])); + values.push(updates[prop] === null ? null : JSON.stringify(updates[prop])); } } diff --git a/tests/e2e/dogfood-engine.test.ts b/tests/e2e/dogfood-engine.test.ts index db9f8b0..4db90ba 100644 --- a/tests/e2e/dogfood-engine.test.ts +++ b/tests/e2e/dogfood-engine.test.ts @@ -147,6 +147,7 @@ vi.mock('../../src/replay/tool-budget.js', () => ({ vi.mock('../../src/automation/rate-limiter.js', () => ({ RateLimiter: vi.fn().mockImplementation(() => ({ checkRate: vi.fn().mockReturnValue({ allowed: true }), + waitForPermit: vi.fn().mockResolvedValue({ allowed: true }), recordResponse: vi.fn(), setQps: vi.fn(), attachDatabase: vi.fn(), diff --git a/tests/e2e/dogfood-mcp-stdio.test.ts b/tests/e2e/dogfood-mcp-stdio.test.ts index 62c6cc4..a61c341 100644 --- a/tests/e2e/dogfood-mcp-stdio.test.ts +++ b/tests/e2e/dogfood-mcp-stdio.test.ts @@ -98,7 +98,7 @@ class McpClient { const timeout = setTimeout(() => { this.pending.delete(id); reject(new Error(`MCP timeout: ${method} (id=${id})`)); - }, 30000); + }, 60000); this.pending.set(id, { resolve: (resp) => { clearTimeout(timeout); resolve(resp); }, @@ -196,7 +196,7 @@ describe('Dogfood E2E: MCP Stdio — All Meta Tools', () => { const result = resp.result as any; expect(result.serverInfo?.name).toBe('schrute'); expect(result.capabilities?.tools).toBeDefined(); - }, 10000); + }, 30000); }); // 
═══════════════════════════════════════════════════════════════ @@ -301,7 +301,7 @@ describe('Dogfood E2E: MCP Stdio — All Meta Tools', () => { const data = parseToolResult(result); expect(data.sessionId).toBeDefined(); expect(data.siteId).toBe('127.0.0.1'); - }, 30000); + }, 60000); it('status shows exploring after explore', async () => { const result = await client.callTool('schrute_status'); diff --git a/tests/e2e/e2e-features.test.ts b/tests/e2e/e2e-features.test.ts index 87b83a8..616f96a 100644 --- a/tests/e2e/e2e-features.test.ts +++ b/tests/e2e/e2e-features.test.ts @@ -150,6 +150,7 @@ vi.mock('../../src/replay/tool-budget.js', () => ({ vi.mock('../../src/automation/rate-limiter.js', () => ({ RateLimiter: vi.fn().mockImplementation(() => ({ checkRate: vi.fn().mockReturnValue({ allowed: true }), + waitForPermit: vi.fn().mockResolvedValue({ allowed: true }), recordResponse: vi.fn(), setQps: vi.fn(), attachDatabase: vi.fn(), diff --git a/tests/e2e/mcp-e2e.test.ts b/tests/e2e/mcp-e2e.test.ts index fe3082a..fa7aba0 100644 --- a/tests/e2e/mcp-e2e.test.ts +++ b/tests/e2e/mcp-e2e.test.ts @@ -216,7 +216,7 @@ describe('MCP E2E: full lifecycle via stdio server', () => { if (client) await client.close(); if (mockServer) await mockServer.close(); try { rmSync(tempDir, { recursive: true, force: true }); } catch { /* best effort */ } - }, 10000); + }, 30000); it('initializes MCP handshake', async () => { const resp = await client.initialize(); @@ -260,7 +260,7 @@ describe('MCP E2E: full lifecycle via stdio server', () => { const data = JSON.parse(result.content[0].text); expect(data.sessionId).toBeDefined(); expect(data.siteId).toBe('127.0.0.1'); - }, 30000); + }, 60000); it('schrute_status shows exploring after explore', async () => { const result = await client.callTool('schrute_status'); diff --git a/tests/e2e/v02-mcp-http.test.ts b/tests/e2e/v02-mcp-http.test.ts index a7708ff..82c0d58 100644 --- a/tests/e2e/v02-mcp-http.test.ts +++ b/tests/e2e/v02-mcp-http.test.ts @@ 
-159,7 +159,8 @@ describe('v0.2 MCP HTTP — Skill Tool Conversion', () => {
     ] as any[];
 
     const result = rankToolsByIntent(skills, 'login', 2);
-    expect(result.length).toBe(2);
+    // Only the 'login' skill lexically matches the query
+    expect(result.length).toBe(1);
     expect(result[0].name).toBe('login');
   });
 });
diff --git a/tests/global-setup.ts b/tests/global-setup.ts
new file mode 100644
index 0000000..bdb9d55
--- /dev/null
+++ b/tests/global-setup.ts
@@ -0,0 +1,18 @@
+import fs from 'node:fs';
+import path from 'node:path';
+import os from 'node:os';
+import crypto from 'node:crypto';
+
+let testDir: string;
+
+export function setup() {
+  testDir = path.join(os.tmpdir(), `schrute-test-${crypto.randomUUID()}`);
+  fs.mkdirSync(path.join(testDir, 'data'), { recursive: true });
+  process.env.SCHRUTE_DATA_DIR = testDir;
+}
+
+export function teardown() {
+  if (testDir && testDir.includes('schrute-test-')) {
+    fs.rmSync(testDir, { recursive: true, force: true });
+  }
+}
diff --git a/tests/helpers.ts b/tests/helpers.ts
index 167e945..0530d40 100644
--- a/tests/helpers.ts
+++ b/tests/helpers.ts
@@ -219,6 +219,7 @@ export function makeSitePolicy(overrides?: Partial<SitePolicy>): SitePolicy {
     allowedMethods: ['GET', 'HEAD'],
     maxQps: 10,
     maxConcurrent: 3,
+    minGapMs: 100,
     readOnlyDefault: true,
     requireConfirmation: [],
     domainAllowlist: ['example.com'],
diff --git a/tests/integration/mcp-wiring.test.ts b/tests/integration/mcp-wiring.test.ts
index 258d9f5..989298e 100644
--- a/tests/integration/mcp-wiring.test.ts
+++ b/tests/integration/mcp-wiring.test.ts
@@ -377,7 +377,7 @@ describe('MCP wiring integration', () => {
   // ═══════════════════════════════════════════════════════════════
 
   describe('Execution wiring (engine.executeSkill -> live server)', () => {
-    const EXECUTION_TIMEOUT_MS = 30_000;
+    const EXECUTION_TIMEOUT_MS = 60_000;
     let mockServer: Awaited>;
     let testSkillId: string;
@@ -442,11 +442,11 @@ describe('MCP wiring integration', () => {
     } as SkillSpec;
 
     skillRepo.create(skill);
-
}); + }, EXECUTION_TIMEOUT_MS); afterEach(async () => { if (mockServer) await mockServer.close(); - }); + }, EXECUTION_TIMEOUT_MS); it('increments validation counters on success (A1)', async () => { const before = skillRepo.getById(testSkillId)!; diff --git a/tests/live/coingecko.test.ts b/tests/live/coingecko.test.ts new file mode 100644 index 0000000..cff9f03 --- /dev/null +++ b/tests/live/coingecko.test.ts @@ -0,0 +1,149 @@ +/** + * Live integration tests against CoinGecko (Cloudflare-protected). + * + * These tests verify: + * 1. Browser-required skills are correctly locked + * 2. CoinGecko skills exist and have correct metadata + * 3. Direct HTTP to CoinGecko fails (Cloudflare blocks it) + * 4. Export for browser_required skills includes warning + * 5. Transform on price chart data extracts latest price + * + * Run manually: npx vitest run tests/live/coingecko.test.ts --timeout 30000 + * NOTE: These tests do NOT launch a browser — they verify skill metadata + * and test that direct HTTP is correctly blocked by Cloudflare. 
+ */ + +import { describe, it, expect, beforeAll, afterAll } from 'vitest'; +import { getConfig, ensureDirectories } from '../../src/core/config.js'; +import { createLogger } from '../../src/core/logger.js'; +import { getDatabase, closeDatabase } from '../../src/storage/database.js'; +import { SkillRepository } from '../../src/storage/skill-repository.js'; +import type { SchruteConfig } from '../../src/skill/types.js'; +import type { AgentDatabase } from '../../src/storage/database.js'; +import { applyTransform } from '../../src/replay/transform.js'; +import { generateExport } from '../../src/skill/generator.js'; + +let config: SchruteConfig; +let db: AgentDatabase; +let skillRepo: SkillRepository; + +describe('coingecko.com live integration', () => { + beforeAll(() => { + config = getConfig(); + createLogger(config.logLevel); + ensureDirectories(config); + db = getDatabase(config); + skillRepo = new SkillRepository(db); + }); + + afterAll(() => { + closeDatabase(); + }); + + it('has learned CoinGecko skills from previous sessions', () => { + const skills = skillRepo.getBySiteId('www.coingecko.com'); + expect(skills.length).toBeGreaterThan(0); + + const chartSkill = skills.find(s => s.id === 'www_coingecko_com.get_24_hours_json.v1'); + expect(chartSkill).toBeDefined(); + expect(chartSkill!.status).toBe('active'); + expect(chartSkill!.method).toBe('GET'); + expect(chartSkill!.pathTemplate).toBe('/price_charts/bitcoin/usd/24_hours.json'); + }); + + it('CoinGecko skills have browser_required tier lock', () => { + const chartSkill = skillRepo.getById('www_coingecko_com.get_24_hours_json.v1'); + expect(chartSkill).toBeDefined(); + expect(chartSkill!.tierLock).toBeDefined(); + expect(chartSkill!.tierLock?.type).toBe('permanent'); + expect(chartSkill!.tierLock?.reason).toBe('browser_required'); + }); + + it('direct HTTP to CoinGecko price chart is blocked by Cloudflare', async () => { + // CoinGecko returns a Cloudflare challenge page for direct HTTP + const response = 
await fetch( + 'https://www.coingecko.com/price_charts/bitcoin/usd/24_hours.json', + { headers: { accept: 'application/json' } }, + ); + + // Cloudflare may return 403, 503, or a challenge page with 200 + // The key assertion: it does NOT return valid JSON chart data + const text = await response.text(); + const isBlocked = response.status === 403 + || response.status === 503 + || text.includes('challenge') + || text.includes('Cloudflare') + || text.includes('cf-browser-verification'); + + // Either blocked by status code or by challenge page content + if (response.status === 200) { + // If status is 200, verify it's a challenge page, not real data + try { + const data = JSON.parse(text); + // If it parses as JSON with stats array, Cloudflare let it through (rare) + if (data.stats && Array.isArray(data.stats)) { + // Cloudflare sometimes allows direct requests — this is acceptable + expect(data.stats.length).toBeGreaterThan(0); + } + } catch { + // Not JSON — it's a Cloudflare challenge HTML page + expect(isBlocked).toBe(true); + } + } else { + expect([403, 503]).toContain(response.status); + } + }); + + it('export for browser_required skill includes warning', () => { + const skill = skillRepo.getById('www_coingecko_com.get_24_hours_json.v1'); + expect(skill).toBeDefined(); + + const playwrightExport = generateExport(skill!, 'playwright.ts'); + expect(playwrightExport).toContain('browser_required'); + expect(playwrightExport).toContain('chromium'); + + // curl export should also work + const curlExport = generateExport(skill!, 'curl'); + expect(curlExport).toContain('curl'); + expect(curlExport).toContain('coingecko.com'); + }); + + it('jsonpath transform extracts latest BTC price from chart data', async () => { + // Simulate the shape of CoinGecko's chart response + const mockChartData = { + stats: [ + [1774169696954, 68708.85], + [1774169964640, 68785.73], + ], + total_volumes: [ + [1774169696954, 27377622020.50], + [1774169964640, 27774473651.30], + ], + }; + + 
const result = await applyTransform(mockChartData, { + type: 'jsonpath', + expression: '$.stats[(@.length-1)][1]', + label: 'latest_btc_price', + }); + + expect(result.transformApplied).toBe(true); + expect(result.label).toBe('latest_btc_price'); + // jsonpath may return the value directly or wrapped — accept either + const price = Array.isArray(result.data) ? result.data[0] : result.data; + expect(typeof price).toBe('number'); + expect(price).toBeCloseTo(68785.73, 1); + }); + + it('multiple CoinGecko skills exist with correct sideEffectClass', () => { + const skills = skillRepo.getBySiteId('www.coingecko.com'); + const activeSkills = skills.filter(s => s.status === 'active'); + + expect(activeSkills.length).toBeGreaterThan(1); + + // All CoinGecko skills should be read-only + for (const skill of activeSkills) { + expect(skill.sideEffectClass).toBe('read-only'); + } + }); +}); diff --git a/tests/live/hackernews.test.ts b/tests/live/hackernews.test.ts new file mode 100644 index 0000000..1ab2a9e --- /dev/null +++ b/tests/live/hackernews.test.ts @@ -0,0 +1,161 @@ +/** + * Live integration tests against Hacker News (HTML-only site). + * + * Verifies that: + * 1. HTML responses are fetchable and parseable + * 2. CSS transforms extract structured data from HN HTML + * 3. List extraction works on real HTML tables + * 4. 
The noise filter classifies HN HTML as html_document (not noise)
+ *
+ * Run manually: npx vitest run tests/live/hackernews.test.ts --timeout 30000
+ */
+
+import { describe, it, expect } from 'vitest';
+import { applyTransform } from '../../src/replay/transform.js';
+import { filterRequests, type HarEntry } from '../../src/capture/noise-filter.js';
+
+describe('news.ycombinator.com live integration', () => {
+  it('fetches HN front page as HTML', async () => {
+    const response = await fetch('https://news.ycombinator.com/', {
+      headers: { accept: 'text/html' },
+    });
+
+    expect(response.status).toBe(200);
+    const contentType = response.headers.get('content-type') ?? '';
+    expect(contentType).toContain('text/html');
+
+    const html = await response.text();
+    expect(html).toContain('<html');
+  });
+
+  it('CSS text transform extracts page title from HN', async () => {
+    const response = await fetch('https://news.ycombinator.com/');
+    const html = await response.text();
+
+    const result = await applyTransform(html, {
+      type: 'css',
+      selector: 'title',
+      mode: 'text',
+      label: 'page_title',
+    });
+
+    expect(result.transformApplied).toBe(true);
+    expect(result.label).toBe('page_title');
+    expect(typeof result.data).toBe('string');
+    expect(result.data).toContain('Hacker News');
+  });
+
+  it('CSS list transform extracts story titles from HN', async () => {
+    const response = await fetch('https://news.ycombinator.com/');
+    const html = await response.text();
+
+    const result = await applyTransform(html, {
+      type: 'css',
+      selector: 'tr.athing',
+      mode: 'list',
+      fields: {
+        title: { selector: '.titleline > a', mode: 'text' },
+        link: { selector: '.titleline > a', mode: 'attr', attr: 'href' },
+      },
+      label: 'hn_stories',
+    });
+
+    expect(result.transformApplied).toBe(true);
+    expect(result.label).toBe('hn_stories');
+    expect(Array.isArray(result.data)).toBe(true);
+
+    const stories = result.data as Array<{ title: string; link: string }>;
+    expect(stories.length).toBeGreaterThan(10);
+    expect(stories.length).toBeLessThanOrEqual(30);
+
+    // Each story should
have a non-empty title and link + for (const story of stories.slice(0, 5)) { + expect(story.title).toBeTruthy(); + expect(typeof story.title).toBe('string'); + expect(story.title.length).toBeGreaterThan(0); + // Links can be relative (/item?id=...) or absolute + expect(story.link).toBeTruthy(); + } + }); + + it('CSS attr transform extracts story links from HN', async () => { + const response = await fetch('https://news.ycombinator.com/'); + const html = await response.text(); + + const result = await applyTransform(html, { + type: 'css', + selector: '.titleline > a', + mode: 'attr', + attr: 'href', + label: 'first_story_link', + }); + + expect(result.transformApplied).toBe(true); + expect(typeof result.data).toBe('string'); + expect((result.data as string).length).toBeGreaterThan(0); + }); + + it('noise filter classifies HN GET 200 HTML as html_document', () => { + // Build a synthetic HAR entry matching HN's response pattern + const hnEntry: HarEntry = { + request: { + method: 'GET', + url: 'https://news.ycombinator.com/', + headers: [{ name: 'accept', value: 'text/html' }], + }, + response: { + status: 200, + headers: [{ name: 'content-type', value: 'text/html; charset=utf-8' }], + content: { size: 50000, mimeType: 'text/html' }, + }, + _resourceType: 'document', + } as unknown as HarEntry; + + const result = filterRequests([hnEntry], [], 'news.ycombinator.com'); + + expect(result.htmlDocument.length).toBe(1); + expect(result.signal.length).toBe(0); + expect(result.noise.length).toBe(0); + }); + + it('noise filter classifies HN POST as ambiguous (not html_document)', () => { + const hnPostEntry: HarEntry = { + request: { + method: 'POST', + url: 'https://news.ycombinator.com/vote', + headers: [{ name: 'content-type', value: 'application/x-www-form-urlencoded' }], + }, + response: { + status: 200, + headers: [{ name: 'content-type', value: 'text/html; charset=utf-8' }], + content: { size: 1000, mimeType: 'text/html' }, + }, + _resourceType: 'document', + } as 
unknown as HarEntry; + + const result = filterRequests([hnPostEntry], [], 'news.ycombinator.com'); + + // POST HTML should NOT be classified as html_document + expect(result.htmlDocument.length).toBe(0); + }); + + it('regex transform extracts score numbers from HN HTML', async () => { + const response = await fetch('https://news.ycombinator.com/'); + const html = await response.text(); + + const result = await applyTransform(html, { + type: 'regex', + expression: '(\\d+)\\s+points', + flags: 'g', + label: 'story_scores', + }); + + expect(result.transformApplied).toBe(true); + expect(Array.isArray(result.data)).toBe(true); + + const scores = result.data as string[]; + expect(scores.length).toBeGreaterThan(5); + }); +}); diff --git a/tests/live/httpbin.test.ts b/tests/live/httpbin.test.ts new file mode 100644 index 0000000..e254510 --- /dev/null +++ b/tests/live/httpbin.test.ts @@ -0,0 +1,161 @@ +/** + * Live integration tests against httpbin.org. + * + * These tests hit real HTTP endpoints — do NOT run in CI. + * Run manually: npx vitest run tests/live/httpbin.test.ts --timeout 30000 + * + * Tests: + * 1. Direct HTTP execution of a Tier 1 skill + * 2. Response transform (jsonpath) on live data + * 3. Export generates runnable curl command + * 4. 
Skill search returns real results + */ + +import { describe, it, expect, beforeAll, afterAll } from 'vitest'; +import { getConfig, ensureDirectories } from '../../src/core/config.js'; +import { createLogger } from '../../src/core/logger.js'; +import { getDatabase, closeDatabase } from '../../src/storage/database.js'; +import { SkillRepository } from '../../src/storage/skill-repository.js'; +import { SiteRepository } from '../../src/storage/site-repository.js'; +import type { SchruteConfig } from '../../src/skill/types.js'; +import type { AgentDatabase } from '../../src/storage/database.js'; +import { buildRequest } from '../../src/replay/request-builder.js'; +import { applyTransform } from '../../src/replay/transform.js'; +import { generateExport } from '../../src/skill/generator.js'; + +let config: SchruteConfig; +let db: AgentDatabase; +let skillRepo: SkillRepository; +let siteRepo: SiteRepository; + +describe('httpbin.org live integration', () => { + beforeAll(() => { + config = getConfig(); + createLogger(config.logLevel); + ensureDirectories(config); + db = getDatabase(config); + skillRepo = new SkillRepository(db); + siteRepo = new SiteRepository(db); + }); + + afterAll(() => { + closeDatabase(); + }); + + it('has learned httpbin skills from previous sessions', () => { + const skills = skillRepo.getBySiteId('httpbin.org'); + expect(skills.length).toBeGreaterThan(0); + + const ipSkill = skills.find(s => s.id === 'httpbin_org.get_ip.v1'); + expect(ipSkill).toBeDefined(); + expect(ipSkill!.status).toBe('active'); + expect(ipSkill!.method).toBe('GET'); + expect(ipSkill!.pathTemplate).toBe('/ip'); + }); + + it('buildRequest resolves a valid HTTP request for get_ip', () => { + const skill = skillRepo.getById('httpbin_org.get_ip.v1'); + expect(skill).toBeDefined(); + + const request = buildRequest(skill!, {}, 'direct'); + expect(request.url).toContain('httpbin.org/ip'); + expect(request.method).toBe('GET'); + expect(request.headers).toBeDefined(); + }); + + 
it('executes direct HTTP fetch against httpbin.org/ip', async () => { + const skill = skillRepo.getById('httpbin_org.get_ip.v1'); + expect(skill).toBeDefined(); + + const request = buildRequest(skill!, {}, 'direct'); + const response = await fetch(request.url, { + method: request.method, + headers: request.headers, + }); + + expect(response.status).toBe(200); + const data = await response.json(); + expect(data).toHaveProperty('origin'); + expect(typeof data.origin).toBe('string'); + expect(data.origin).toMatch(/^\d+\.\d+\.\d+\.\d+/); + }); + + it('applies jsonpath transform to live httpbin response', async () => { + const response = await fetch('https://httpbin.org/ip', { + headers: { accept: 'application/json' }, + }); + const data = await response.json(); + + const result = await applyTransform(data, { + type: 'jsonpath', + expression: '$.origin', + label: 'ip_address', + }); + + expect(result.transformApplied).toBe(true); + expect(result.label).toBe('ip_address'); + expect(typeof result.data).toBe('string'); + expect(result.data).toMatch(/^\d+\.\d+\.\d+\.\d+/); + }); + + it('applies regex transform to extract IP octets', async () => { + const response = await fetch('https://httpbin.org/ip', { + headers: { accept: 'application/json' }, + }); + const data = await response.json(); + + const result = await applyTransform(data.origin, { + type: 'regex', + expression: '(?<first>\\d+)\\.(?<second>\\d+)\\.(?<third>\\d+)\\.(?<fourth>\\d+)', + label: 'ip_octets', + }); + + expect(result.transformApplied).toBe(true); + const octets = result.data as Record<string, string>; + expect(octets).toHaveProperty('first'); + expect(octets).toHaveProperty('second'); + expect(octets).toHaveProperty('third'); + expect(octets).toHaveProperty('fourth'); + }); + + it('generates working curl export for get_ip skill', () => { + const skill = skillRepo.getById('httpbin_org.get_ip.v1'); + expect(skill).toBeDefined(); + + const curlOutput = generateExport(skill!, 'curl'); + expect(curlOutput).toContain('curl'); + 
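The transform tests above exercise `applyTransform` with `jsonpath` and `regex` specs against live httpbin data. As a rough illustration of what a simple dot-path expression like `$.origin` resolves to, here is a self-contained sketch — `extractPath` is a hypothetical helper, not the real implementation in `src/replay/transform.ts`, which presumably supports much more of the JSONPath grammar:

```typescript
// Hypothetical helper (NOT the real applyTransform): shows what a simple
// dot-path jsonpath expression like '$.origin' conceptually resolves to.
function extractPath(data: unknown, expression: string): unknown {
  // Strip the leading '$' / '$.' and walk each dot-separated segment.
  const segments = expression.replace(/^\$\.?/, '').split('.').filter(Boolean);
  let current: unknown = data;
  for (const segment of segments) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Shaped like an httpbin.org/ip response body.
const sample = { origin: '203.0.113.7' };
console.log(extractPath(sample, '$.origin')); // → '203.0.113.7'
```

Missing keys fall out as `undefined` rather than throwing, which matches how a transform step would want to degrade when live response shapes drift.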
expect(curlOutput).toContain('httpbin.org/ip'); + expect(curlOutput).toContain('-X GET'); + }); + + it('generates typescript export for get_ip skill', () => { + const skill = skillRepo.getById('httpbin_org.get_ip.v1'); + expect(skill).toBeDefined(); + + const tsOutput = generateExport(skill!, 'fetch.ts'); + expect(tsOutput).toContain('fetch('); + expect(tsOutput).toContain('httpbin.org/ip'); + expect(tsOutput).toContain('await'); + }); + + it('generates python export for get_ip skill', () => { + const skill = skillRepo.getById('httpbin_org.get_ip.v1'); + expect(skill).toBeDefined(); + + const pyOutput = generateExport(skill!, 'requests.py'); + expect(pyOutput).toContain('import requests'); + expect(pyOutput).toContain('httpbin.org/ip'); + }); + + it('fetches httpbin.org/headers and gets structured response', async () => { + const response = await fetch('https://httpbin.org/headers', { + headers: { accept: 'application/json', 'x-custom': 'test-value' }, + }); + + expect(response.status).toBe(200); + const data = await response.json(); + expect(data).toHaveProperty('headers'); + expect(data.headers).toHaveProperty('X-Custom'); + expect(data.headers['X-Custom']).toBe('test-value'); + }); +}); diff --git a/tests/unit/admin-auth.test.ts b/tests/unit/admin-auth.test.ts index f5ac71b..7bf1af9 100644 --- a/tests/unit/admin-auth.test.ts +++ b/tests/unit/admin-auth.test.ts @@ -2,7 +2,7 @@ import { describe, it, expect } from 'vitest'; import { isAdminCaller } from '../../src/shared/admin-auth.js'; import type { SchruteConfig } from '../../src/skill/types.js'; -function makeConfig(network: boolean): SchruteConfig { +function makeConfig(network: boolean, mcpHttpAdmin = false): SchruteConfig { return { dataDir: '/tmp/schrute-admin-auth-test', logLevel: 'silent', @@ -22,7 +22,7 @@ function makeConfig(network: boolean): SchruteConfig { }, audit: { strictMode: true, rootHashExport: true }, storage: { maxPerSiteMb: 500, maxGlobalMb: 5000, retentionDays: 90 }, - server: { network }, 
+ server: { network, mcpHttpAdmin }, daemon: { port: 19420, autoStart: false }, tempTtlMs: 3600000, gcIntervalMs: 900000, @@ -63,8 +63,41 @@ describe('isAdminCaller', () => { }); it('returns false for MCP HTTP sessions', () => { - expect(isAdminCaller('mcp-http-session-123', config)).toBe(false); - expect(isAdminCaller('mcp-http-abc', config)).toBe(false); + expect(isAdminCaller('mcp-http:session-123', config)).toBe(false); + expect(isAdminCaller('mcp-http:abc', config)).toBe(false); + }); + + it('returns false for unknown callerIds', () => { + expect(isAdminCaller('rest-api', config)).toBe(false); + expect(isAdminCaller('random', config)).toBe(false); + }); + }); + + describe('network=true with mcpHttpAdmin=true', () => { + const config = makeConfig(true, true); + + it('returns true for mcp-http: prefixed callerIds', () => { + expect(isAdminCaller('mcp-http:session-123', config)).toBe(true); + expect(isAdminCaller('mcp-http:unknown', config)).toBe(true); + }); + + it('still returns true for stdio and daemon', () => { + expect(isAdminCaller('stdio', config)).toBe(true); + expect(isAdminCaller('daemon', config)).toBe(true); + }); + + it('still returns false for non-mcp-http callerIds', () => { + expect(isAdminCaller('rest-api', config)).toBe(false); + expect(isAdminCaller('random', config)).toBe(false); + }); + }); + + describe('network=true with mcpHttpAdmin=false (default)', () => { + const config = makeConfig(true, false); + + it('returns false for mcp-http: prefixed callerIds', () => { + expect(isAdminCaller('mcp-http:session-123', config)).toBe(false); + expect(isAdminCaller('mcp-http:unknown', config)).toBe(false); }); }); }); diff --git a/tests/unit/agent-browser-provider.test.ts b/tests/unit/agent-browser-provider.test.ts index 62a3ff1..50a4327 100644 --- a/tests/unit/agent-browser-provider.test.ts +++ b/tests/unit/agent-browser-provider.test.ts @@ -124,3 +124,35 @@ describe('AgentBrowserProvider.evaluateFetch', () => { }); }); }); + 
+describe('AgentBrowserProvider.detectChallengePage', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('does not treat generic interstitial text alone as a Cloudflare challenge', async () => { + const ipc = makeIpc({ + send: vi.fn() + .mockResolvedValueOnce({ snapshot: 'Just a moment... Checking your browser' }) + .mockResolvedValueOnce({ url: 'https://example.com/interstitial' }), + }); + const provider = new AgentBrowserProvider(ipc, ['example.com']); + + const detected = await provider.detectChallengePage(); + + expect(detected).toBe(false); + }); + + it('treats Cloudflare-specific corroboration as a challenge signal', async () => { + const ipc = makeIpc({ + send: vi.fn() + .mockResolvedValueOnce({ snapshot: 'Just a moment... __cf_chl_ token present' }) + .mockResolvedValueOnce({ url: 'https://example.com/cdn-cgi/challenge-platform' }), + }); + const provider = new AgentBrowserProvider(ipc, ['example.com']); + + const detected = await provider.detectChallengePage(); + + expect(detected).toBe(true); + }); +}); diff --git a/tests/unit/browser-backend.test.ts b/tests/unit/browser-backend.test.ts index 11513ec..d81b5b7 100644 --- a/tests/unit/browser-backend.test.ts +++ b/tests/unit/browser-backend.test.ts @@ -1,4 +1,7 @@ import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'; +import * as fs from 'node:fs'; +import * as os from 'node:os'; +import * as path from 'node:path'; import type { BrowserBackend, CookieEntry } from '../../src/browser/backend.js'; // Mock logger @@ -14,6 +17,7 @@ vi.mock('../../src/core/logger.js', () => ({ // Mock child_process vi.mock('node:child_process', () => ({ execFile: vi.fn(), + execFileSync: vi.fn(), })); // Mock IPC client @@ -53,11 +57,20 @@ vi.mock('../../src/browser/agent-browser-provider.js', () => ({ import { AgentBrowserBackend } from '../../src/browser/agent-browser-backend.js'; import { execFile } from 'node:child_process'; +import { + cleanupAgentBrowserSessions, + 
getAgentBrowserSessionRoot, + writeAgentBrowserSessionMetadata, +} from '../../src/browser/agent-browser-cleanup.js'; import type { SchruteConfig } from '../../src/skill/types.js'; +const tempDirs: string[] = []; + function makeConfig(): SchruteConfig { + const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), 'schrute-test-')); + tempDirs.push(dataDir); return { - dataDir: '/tmp/schrute-test', + dataDir, logLevel: 'silent', features: { webmcp: false, @@ -149,6 +162,9 @@ describe('AgentBrowserBackend', () => { afterEach(async () => { await backend.shutdown(); + for (const dir of tempDirs.splice(0)) { + fs.rmSync(dir, { recursive: true, force: true }); + } }); describe('ensureProbed (once-promise probe)', () => { @@ -279,7 +295,12 @@ describe('AgentBrowserBackend', () => { }; // Access private sessions map - (backendWithAuth as any).sessions.set('site1', { provider: mockProv, ipc: mockIpc }); + (backendWithAuth as any).sessions.set('site1', { + provider: mockProv, + ipc: mockIpc, + sessionName: 'exec-site1', + lastUsedAt: Date.now(), + }); await backendWithAuth.closeAndPersist('site1'); @@ -314,7 +335,12 @@ describe('AgentBrowserBackend', () => { }; const mockIpc = { close: vi.fn() }; - (backendWithAuth as any).sessions.set('site1', { provider: mockProv, ipc: mockIpc }); + (backendWithAuth as any).sessions.set('site1', { + provider: mockProv, + ipc: mockIpc, + sessionName: 'exec-site1', + lastUsedAt: Date.now(), + }); await backendWithAuth.closeAndPersist('site1'); @@ -334,7 +360,12 @@ describe('AgentBrowserBackend', () => { }; const mockIpc = { close: vi.fn() }; - (backend as any).sessions.set('site1', { provider: mockProv, ipc: mockIpc }); + (backend as any).sessions.set('site1', { + provider: mockProv, + ipc: mockIpc, + sessionName: 'exec-site1', + lastUsedAt: Date.now(), + }); await backend.discardSession('site1'); @@ -351,8 +382,18 @@ describe('AgentBrowserBackend', () => { const mockProv2 = { ...mockProvider, close: vi.fn().mockResolvedValue(undefined) }; const 
mockIpc2 = { close: vi.fn() }; - (backend as any).sessions.set('site1', { provider: mockProv1, ipc: mockIpc1 }); - (backend as any).sessions.set('site2', { provider: mockProv2, ipc: mockIpc2 }); + (backend as any).sessions.set('site1', { + provider: mockProv1, + ipc: mockIpc1, + sessionName: 'exec-site1', + lastUsedAt: Date.now(), + }); + (backend as any).sessions.set('site2', { + provider: mockProv2, + ipc: mockIpc2, + sessionName: 'exec-site2', + lastUsedAt: Date.now(), + }); await backend.shutdown(); @@ -362,5 +403,77 @@ describe('AgentBrowserBackend', () => { expect(mockIpc2.close).toHaveBeenCalled(); expect((backend as any).sessions.size).toBe(0); }); + + it('tracks created sessions on disk and removes metadata on shutdown', async () => { + const config = makeConfig(); + const trackedBackend = new AgentBrowserBackend(config); + + await trackedBackend.createProvider('site1', ['example.com']); + + const root = getAgentBrowserSessionRoot(config); + expect(fs.readdirSync(root)).toHaveLength(1); + + await trackedBackend.shutdown(); + + expect(fs.existsSync(root)).toBe(true); + expect(fs.readdirSync(root)).toHaveLength(0); + }); + + it('sweeps idle sessions while leaving active ones alone', async () => { + vi.useFakeTimers(); + const now = new Date('2026-03-23T12:00:00.000Z'); + vi.setSystemTime(now); + + const idleProvider = { ...mockProvider, close: vi.fn().mockResolvedValue(undefined) }; + const idleIpc = { close: vi.fn() }; + const activeProvider = { ...mockProvider, close: vi.fn().mockResolvedValue(undefined) }; + const activeIpc = { close: vi.fn() }; + + (backend as any).sessions.set('idle-site', { + provider: idleProvider, + ipc: idleIpc, + sessionName: 'exec-idle-site', + lastUsedAt: now.getTime() - 10 * 60 * 1000, + }); + (backend as any).sessions.set('active-site', { + provider: activeProvider, + ipc: activeIpc, + sessionName: 'exec-active-site', + lastUsedAt: now.getTime() - 1_000, + }); + + await backend.sweepIdleSessions(5 * 60 * 1000); + + 
expect(idleProvider.close).toHaveBeenCalled(); + expect(idleIpc.close).toHaveBeenCalled(); + expect((backend as any).sessions.has('idle-site')).toBe(false); + + expect(activeProvider.close).not.toHaveBeenCalled(); + expect((backend as any).sessions.has('active-site')).toBe(true); + + vi.useRealTimers(); + }); + }); + + describe('cleanupAgentBrowserSessions', () => { + it('closes tracked sessions by session name and removes stale metadata', async () => { + const config = makeConfig(); + writeAgentBrowserSessionMetadata(config, { + sessionName: 'exec-site1', + siteId: 'site1', + createdAt: Date.now(), + purpose: 'exec', + }); + + await cleanupAgentBrowserSessions(config); + + expect(mockExecFile).toHaveBeenCalledWith( + 'agent-browser', + ['--session', 'exec-site1', '--json', 'close'], + expect.objectContaining({ timeout: 10000 }), + expect.any(Function), + ); + expect(fs.readdirSync(getAgentBrowserSessionRoot(config))).toHaveLength(0); + }); }); }); diff --git a/tests/unit/browser-manager-lifecycle.test.ts b/tests/unit/browser-manager-lifecycle.test.ts index f06fd0f..81bc7c3 100644 --- a/tests/unit/browser-manager-lifecycle.test.ts +++ b/tests/unit/browser-manager-lifecycle.test.ts @@ -30,6 +30,11 @@ vi.mock('../../src/browser/cdp-connector.js', () => ({ connectViaCDP: vi.fn(), })); +vi.mock('../../src/browser/real-browser-handoff.js', () => ({ + writeOwnedBrowserLaunchMetadata: vi.fn(), + removeOwnedBrowserLaunchMetadata: vi.fn(), +})); + vi.mock('node:fs', () => ({ default: { mkdirSync: vi.fn(), @@ -49,16 +54,21 @@ vi.mock('node:fs', () => ({ import { BrowserManager } from '../../src/browser/manager.js'; import { launchBrowserEngine } from '../../src/browser/engine.js'; +import { + writeOwnedBrowserLaunchMetadata, + removeOwnedBrowserLaunchMetadata, +} from '../../src/browser/real-browser-handoff.js'; // ─── Helpers ──────────────────────────────────────────────────────── -function createMockBrowser(connected = true): Browser { +function 
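The idle-sweep test above asserts that sessions unused for longer than a TTL are closed while recently used ones survive. A minimal sketch of that selection policy, assuming a per-session `lastUsedAt` timestamp — `selectIdleSessions` is a hypothetical helper, not the actual `sweepIdleSessions` on `AgentBrowserBackend`, which also closes providers and IPC channels:

```typescript
// Hypothetical helper mirroring only the selection step the test asserts.
interface TrackedSession {
  sessionName: string;
  lastUsedAt: number; // epoch ms of last use
}

function selectIdleSessions(
  sessions: Map<string, TrackedSession>,
  maxIdleMs: number,
  now: number,
): string[] {
  const idle: string[] = [];
  for (const [siteId, session] of sessions) {
    // Strictly older than the TTL -> eligible for sweeping.
    if (now - session.lastUsedAt > maxIdleMs) idle.push(siteId);
  }
  return idle;
}

// Mirrors the fixture: a 10-minute-idle session and a 1-second-idle session,
// swept with a 5-minute TTL.
const now = Date.now();
const sessions = new Map<string, TrackedSession>([
  ['idle-site', { sessionName: 'exec-idle-site', lastUsedAt: now - 10 * 60 * 1000 }],
  ['active-site', { sessionName: 'exec-active-site', lastUsedAt: now - 1_000 }],
]);
console.log(selectIdleSessions(sessions, 5 * 60 * 1000, now)); // → [ 'idle-site' ]
```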
createMockBrowser(connected = true, pid = 4321): Browser { const disconnectHandlers: Array<() => void> = []; return { isConnected: vi.fn().mockReturnValue(connected), newContext: vi.fn().mockResolvedValue(createMockContext()), contexts: vi.fn().mockReturnValue([]), close: vi.fn().mockResolvedValue(undefined), + process: vi.fn().mockReturnValue({ pid, spawnfile: '/usr/bin/chrome-headless-shell' }), on: vi.fn((event: string, handler: () => void) => { if (event === 'disconnected') disconnectHandlers.push(handler); }), @@ -230,6 +240,64 @@ describe('BrowserManager Lifecycle', () => { }); }); + describe('owned launch tracking', () => { + it('writes owned launch metadata for local launches', async () => { + const manager = new BrowserManager({ + dataDir: '/tmp/test', + browser: { idleTimeoutMs: 1000 }, + daemon: { port: 19420, autoStart: false }, + } as any); + + const mockBrowser = setupLaunchMock(createMockBrowser(true, 9876)); + await manager.launchBrowser(); + + expect(writeOwnedBrowserLaunchMetadata).toHaveBeenCalledWith( + expect.anything(), + expect.objectContaining({ + pid: 9876, + engine: 'patchright', + }), + ); + expect(mockBrowser.close).not.toHaveBeenCalled(); + }); + + it('removes owned launch metadata on closeAll', async () => { + const manager = new BrowserManager({ + dataDir: '/tmp/test', + browser: { idleTimeoutMs: 1000 }, + daemon: { port: 19420, autoStart: false }, + } as any); + + setupLaunchMock(createMockBrowser(true, 6543)); + await manager.launchBrowser(); + await manager.closeAll(); + + expect(removeOwnedBrowserLaunchMetadata).toHaveBeenCalledWith( + expect.anything(), + 6543, + ); + }); + + it('removes owned launch metadata when the browser disconnects', async () => { + const manager = new BrowserManager({ + dataDir: '/tmp/test', + browser: { idleTimeoutMs: 1000 }, + daemon: { port: 19420, autoStart: false }, + } as any); + + const mockBrowser = setupLaunchMock(createMockBrowser(true, 2468)) as Browser & { + _triggerDisconnect: () => void; + }; 
+ await manager.launchBrowser(); + mockBrowser._triggerDisconnect(); + + expect(removeOwnedBrowserLaunchMetadata).toHaveBeenCalledWith( + expect.anything(), + 2468, + ); + }); + }); + // ─── withLease ────────────────────────────────────────────────── describe('withLease', () => { diff --git a/tests/unit/classifier.test.ts b/tests/unit/classifier.test.ts index fa367b9..4ee0ebf 100644 --- a/tests/unit/classifier.test.ts +++ b/tests/unit/classifier.test.ts @@ -111,12 +111,12 @@ describe('classifier', () => { expect(result.recommendedTier).toBe(ExecutionTier.DIRECT); }); - it('recommends COOKIE_REFRESH for auth-required traffic', () => { + it('recommends BROWSER_PROXIED for auth-required traffic', () => { const traffic = [ makeEntry({ requestHeaders: { Authorization: 'Bearer abc' } }), ]; const result = classifySite('test.com', traffic); - expect(result.recommendedTier).toBe(ExecutionTier.COOKIE_REFRESH); + expect(result.recommendedTier).toBe(ExecutionTier.BROWSER_PROXIED); }); it('recommends FULL_BROWSER for JS-computed fields (signatures)', () => { @@ -127,7 +127,7 @@ describe('classifier', () => { expect(result.recommendedTier).toBe(ExecutionTier.FULL_BROWSER); }); - it('recommends COOKIE_REFRESH for dynamic fields + auth', () => { + it('recommends BROWSER_PROXIED for dynamic fields + auth', () => { const traffic = [ makeEntry({ url: 'https://api.example.com/api?_ts=12345', @@ -135,7 +135,7 @@ describe('classifier', () => { }), ]; const result = classifySite('test.com', traffic); - expect(result.recommendedTier).toBe(ExecutionTier.COOKIE_REFRESH); + expect(result.recommendedTier).toBe(ExecutionTier.BROWSER_PROXIED); }); }); diff --git a/tests/unit/cli-export-import.test.ts b/tests/unit/cli-export-import.test.ts index 4e9b4ee..c43c6a1 100644 --- a/tests/unit/cli-export-import.test.ts +++ b/tests/unit/cli-export-import.test.ts @@ -121,6 +121,7 @@ function makeBundle() { domainAllowlist: [], redactionRules: [], capabilities: [], + browserRequired: false, }, }; } @@ 
-273,4 +274,14 @@ describe('Import bundle', () => { expect(restored.skills[0].name).toBe(bundle.skills[0].name); expect(restored.policy.siteId).toBe(bundle.policy.siteId); }); + + it('roundtrips browserRequired in policy bundles', () => { + const bundle = makeBundle(); + bundle.policy.browserRequired = true; + + const restored = JSON.parse(JSON.stringify(bundle)); + + expect(restored.policy.browserRequired).toBe(true); + expect(restored.policy.siteId).toBe('example.com'); + }); }); diff --git a/tests/unit/cloudflare-detection.test.ts b/tests/unit/cloudflare-detection.test.ts index b94f413..6426487 100644 --- a/tests/unit/cloudflare-detection.test.ts +++ b/tests/unit/cloudflare-detection.test.ts @@ -15,12 +15,16 @@ import { detectAndWaitForChallenge } from '../../src/browser/base-browser-adapte function createMockPage(overrides: { evaluateResult?: boolean; titleResult?: string; + contentResult?: string; + urlResult?: string; waitForFunctionResolves?: boolean; waitForFunctionDelay?: number; } = {}): Page { const { evaluateResult = false, titleResult = 'Example Page', + contentResult = `<title>${titleResult}</title>`, + urlResult = 'https://example.com/', waitForFunctionResolves = true, waitForFunctionDelay = 0, } = overrides; @@ -28,6 +32,8 @@ return { evaluate: vi.fn().mockResolvedValue(evaluateResult), title: vi.fn().mockResolvedValue(titleResult), + content: vi.fn().mockResolvedValue(contentResult), + url: vi.fn().mockReturnValue(urlResult), waitForFunction: vi.fn().mockImplementation(() => { if (!waitForFunctionResolves) { return Promise.reject(new Error('Timeout')); @@ -59,6 +65,7 @@ describe('detectAndWaitForChallenge', () => { const page = createMockPage({ evaluateResult: false, titleResult: 'Just a moment...', + contentResult: 'Just a moment...__cf_chl_ token', }); const result = await detectAndWaitForChallenge(page); @@ -71,6 +78,20 @@ expect(ctx.hadSelectors).toBe(false); }); + it('should 
NOT detect a generic challenge title without Cloudflare-specific corroboration', async () => { + const page = createMockPage({ + evaluateResult: false, + titleResult: 'Just a moment...', + contentResult: 'Just a moment...Please wait', + urlResult: 'https://example.com/interstitial', + }); + + const result = await detectAndWaitForChallenge(page); + + expect(result).toBe(false); + expect(page.waitForFunction).not.toHaveBeenCalled(); + }); + it('should NOT detect generic "Attention Required" without "Cloudflare"', async () => { const page = createMockPage({ evaluateResult: false, @@ -115,6 +136,7 @@ describe('detectAndWaitForChallenge', () => { const page = createMockPage({ evaluateResult: false, titleResult: 'Verify you are human', + contentResult: 'Verify you are human__cf_chl_ present', }); const result = await detectAndWaitForChallenge(page); @@ -198,17 +220,13 @@ describe('detectAndWaitForChallenge selector vs title resolution', () => { }); describe('challenge-aware snapshot content', () => { - it('Cloudflare title regex matches challenge page titles', () => { - const regex = /^Just a moment\b|Attention Required!.*Cloudflare|Verify you are human/i; - - // These should trigger challenge-aware snapshot content - expect(regex.test('Just a moment...')).toBe(true); - expect(regex.test('Verify you are human')).toBe(true); - expect(regex.test('Attention Required! | Cloudflare')).toBe(true); + it('treats only explicit Cloudflare branding as sufficient title-only evidence', () => { + const explicitCloudflareTitle = /Attention Required!.*Cloudflare/i; - // These should NOT trigger - expect(regex.test('My Normal Page')).toBe(false); - expect(regex.test('Attention Required!')).toBe(false); + expect(explicitCloudflareTitle.test('Attention Required! 
| Cloudflare')).toBe(true); + expect(explicitCloudflareTitle.test('Just a moment...')).toBe(false); + expect(explicitCloudflareTitle.test('Verify you are human')).toBe(false); + expect(explicitCloudflareTitle.test('Attention Required!')).toBe(false); }); it('engine hint is only shown for vanilla playwright engine', () => { @@ -272,37 +290,26 @@ describe('detectAndWaitForChallenge pre-check error handling', () => { }); }); -describe('Cloudflare challenge title regex', () => { - const regex = /^Just a moment\b|Attention Required!.*Cloudflare|Verify you are human/i; - - it('should match "Just a moment..."', () => { - expect(regex.test('Just a moment...')).toBe(true); - }); - - it('should match "Just a moment" exactly', () => { - expect(regex.test('Just a moment')).toBe(true); - }); - - it('should match "Attention Required! | Cloudflare"', () => { - expect(regex.test('Attention Required! | Cloudflare')).toBe(true); - }); - - it('should match "Verify you are human"', () => { - expect(regex.test('Verify you are human')).toBe(true); - }); +describe('Cloudflare challenge title heuristics', () => { + const explicitCloudflareTitle = /Attention Required!.*Cloudflare/i; + const genericChallengeTitle = /^Just a moment\b|Verify you are human/i; - it('should NOT match "Attention Required!" without Cloudflare', () => { - expect(regex.test('Attention Required!')).toBe(false); + it('matches explicit Cloudflare-branded challenge titles', () => { + expect(explicitCloudflareTitle.test('Attention Required! 
| Cloudflare')).toBe(true); }); - it('should NOT match generic titles', () => { - expect(regex.test('Google')).toBe(false); - expect(regex.test('My Website')).toBe(false); + it('treats generic challenge titles as generic, not decisive, signals', () => { + expect(genericChallengeTitle.test('Just a moment...')).toBe(true); + expect(genericChallengeTitle.test('Verify you are human')).toBe(true); + expect(explicitCloudflareTitle.test('Just a moment...')).toBe(false); + expect(explicitCloudflareTitle.test('Verify you are human')).toBe(false); }); - it('should NOT match "Just another moment"', () => { - // \b ensures "Just a moment" is at a word boundary - expect(regex.test('Just a momentary pause')).toBe(false); + it('does not match unrelated titles', () => { + expect(explicitCloudflareTitle.test('Attention Required!')).toBe(false); + expect(genericChallengeTitle.test('Google')).toBe(false); + expect(genericChallengeTitle.test('My Website')).toBe(false); + expect(genericChallengeTitle.test('Just a momentary pause')).toBe(false); }); }); diff --git a/tests/unit/config-env.test.ts b/tests/unit/config-env.test.ts index 5f66b6e..667e9a9 100644 --- a/tests/unit/config-env.test.ts +++ b/tests/unit/config-env.test.ts @@ -15,7 +15,27 @@ vi.mock('../../src/core/logger.js', () => ({ // We need to import config functions fresh each time because of cached state let configModule: typeof import('../../src/core/config.js'); +const SCHRUTE_ENV_KEYS = [ + 'SCHRUTE_DATA_DIR', + 'SCHRUTE_LOG_LEVEL', + 'SCHRUTE_AUTH_TOKEN', + 'SCHRUTE_NETWORK', + 'SCHRUTE_HTTP_TRANSPORT', + 'SCHRUTE_HTTP_PORT', + 'SCHRUTE_BROWSER_ENGINE', + 'SCHRUTE_SNAPSHOT_MODE', + 'SCHRUTE_INCREMENTAL_DIFFS', +] as const; + +let savedEnv: Record<string, string | undefined>; + beforeEach(async () => { + // Save current SCHRUTE_* env vars (including global-setup's SCHRUTE_DATA_DIR) + savedEnv = {}; + for (const key of SCHRUTE_ENV_KEYS) { + savedEnv[key] = process.env[key]; + } + vi.resetModules(); vi.mock('../../src/core/logger.js', () => ({
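The detection tests in this diff encode a corroboration rule: explicit Cloudflare branding in the title is decisive on its own, while generic interstitial titles ("Just a moment...", "Verify you are human") only count when backed by a Cloudflare-specific artifact such as a `__cf_chl_` token in the markup or a `/cdn-cgi/` URL. A hedged sketch of that decision — `looksLikeChallenge` is illustrative only; the real logic lives in `src/browser/base-browser-adapter.ts` and may weigh more signals:

```typescript
// Illustrative sketch of the two-signal heuristic the tests describe.
const explicitCloudflareTitle = /Attention Required!.*Cloudflare/i;
const genericChallengeTitle = /^Just a moment\b|Verify you are human/i;

function looksLikeChallenge(title: string, html: string, url: string): boolean {
  // Explicit Cloudflare branding in the title is decisive by itself.
  if (explicitCloudflareTitle.test(title)) return true;
  // Generic interstitial titles need Cloudflare-specific corroboration.
  const corroborated = html.includes('__cf_chl_') || url.includes('/cdn-cgi/');
  return genericChallengeTitle.test(title) && corroborated;
}

console.log(looksLikeChallenge('Just a moment...', 'Please wait', 'https://example.com/')); // → false
console.log(looksLikeChallenge('Just a moment...', '__cf_chl_ token', 'https://example.com/')); // → true
```

Keeping the generic titles as necessary-but-not-sufficient evidence is what prevents ordinary interstitial pages from being misread as bot challenges.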
getLogger: () => ({ @@ -26,30 +46,23 @@ beforeEach(async () => { }), })); // Clean env vars before each test - delete process.env.SCHRUTE_DATA_DIR; - delete process.env.SCHRUTE_LOG_LEVEL; - delete process.env.SCHRUTE_AUTH_TOKEN; - delete process.env.SCHRUTE_NETWORK; - delete process.env.SCHRUTE_HTTP_TRANSPORT; - delete process.env.SCHRUTE_HTTP_PORT; - delete process.env.SCHRUTE_BROWSER_ENGINE; - delete process.env.SCHRUTE_SNAPSHOT_MODE; - delete process.env.SCHRUTE_INCREMENTAL_DIFFS; + for (const key of SCHRUTE_ENV_KEYS) { + delete process.env[key]; + } configModule = await import('../../src/core/config.js'); configModule.resetConfigCache(); }); afterEach(() => { - delete process.env.SCHRUTE_DATA_DIR; - delete process.env.SCHRUTE_LOG_LEVEL; - delete process.env.SCHRUTE_AUTH_TOKEN; - delete process.env.SCHRUTE_NETWORK; - delete process.env.SCHRUTE_HTTP_TRANSPORT; - delete process.env.SCHRUTE_HTTP_PORT; - delete process.env.SCHRUTE_BROWSER_ENGINE; - delete process.env.SCHRUTE_SNAPSHOT_MODE; - delete process.env.SCHRUTE_INCREMENTAL_DIFFS; + // Restore saved env vars (preserves global-setup's SCHRUTE_DATA_DIR) + for (const key of SCHRUTE_ENV_KEYS) { + if (savedEnv[key] !== undefined) { + process.env[key] = savedEnv[key]; + } else { + delete process.env[key]; + } + } }); describe('config env overrides', () => { diff --git a/tests/unit/config.test.ts b/tests/unit/config.test.ts index 5282e2a..f0d7b4e 100644 --- a/tests/unit/config.test.ts +++ b/tests/unit/config.test.ts @@ -86,6 +86,23 @@ describe('config', () => { fs.unlinkSync(tmpPath); } }); + + it('throws when server.mcpHttpAdmin is a non-boolean (e.g. 
string "false")', () => { + const tmpPath = path.join('/tmp', `schrute-test-config-${Date.now()}.json`); + fs.writeFileSync( + tmpPath, + JSON.stringify({ server: { mcpHttpAdmin: 'false' } }), + 'utf-8', + ); + + try { + expect(() => configModule.loadConfig(tmpPath)).toThrow( + 'Invalid config: server.mcpHttpAdmin must be a boolean', + ); + } finally { + fs.unlinkSync(tmpPath); + } + }); }); describe('deepMerge (tested via loadConfig)', () => { diff --git a/tests/unit/doctor.test.ts b/tests/unit/doctor.test.ts index 46aec02..95a2410 100644 --- a/tests/unit/doctor.test.ts +++ b/tests/unit/doctor.test.ts @@ -122,6 +122,29 @@ describe('doctor', () => { }); }); + describe('audit hash chain check', () => { + it('broken audit hash chain returns warning (not fail)', () => { + const check: CheckResult = { + name: 'audit_hash_chain', + status: 'warning', + message: 'Audit hash chain broken at entry 5 (10 entries)', + details: '(expected when database is shared across dev sessions or keychain key was rotated)', + }; + expect(check.status).toBe('warning'); + expect(check.message).toContain('broken at entry'); + expect(check.details).toContain('expected when database is shared'); + }); + + it('intact audit hash chain returns pass', () => { + const check: CheckResult = { + name: 'audit_hash_chain', + status: 'pass', + message: 'Audit hash chain intact (10 entries)', + }; + expect(check.status).toBe('pass'); + }); + }); + describe('durable storage clean check', () => { it('check result passes when no raw artifacts exist', () => { const check: CheckResult = { diff --git a/tests/unit/dry-run.test.ts b/tests/unit/dry-run.test.ts index 710b6cc..556c07f 100644 --- a/tests/unit/dry-run.test.ts +++ b/tests/unit/dry-run.test.ts @@ -115,7 +115,7 @@ describe('dry-run', () => { expect(result.volatilityReport).toBeDefined(); expect(result.volatilityReport).toHaveLength(1); expect(result.tierDecision).toContain('tierLock.type=permanent'); - 
expect(result.tierDecision).toContain('tierLock.reason=js_computed_field'); + expect(result.tierDecision).toContain('tierLock.reason=JS-computed field'); }); it('does not include volatility report in agent-safe mode', async () => { diff --git a/tests/unit/engine-batch.test.ts b/tests/unit/engine-batch.test.ts new file mode 100644 index 0000000..c13e1d1 --- /dev/null +++ b/tests/unit/engine-batch.test.ts @@ -0,0 +1,277 @@ +import { describe, expect, it, vi, beforeEach } from 'vitest'; +import { Engine } from '../../src/core/engine.js'; +import { SideEffectClass, type SkillSpec } from '../../src/skill/types.js'; +import { getSitePolicy } from '../../src/core/policy.js'; + +vi.mock('../../src/core/logger.js', () => ({ + getLogger: () => ({ + info: vi.fn(), + warn: vi.fn(), + error: vi.fn(), + debug: vi.fn(), + }), +})); + +vi.mock('../../src/core/policy.js', () => ({ + getSitePolicy: vi.fn(() => ({ + executionBackend: 'agent-browser', + executionSessionName: undefined, + maxConcurrent: 3, + })), + checkCapability: vi.fn(), + enforceDomainAllowlist: vi.fn(), + checkMethodAllowed: vi.fn(), + checkPathRisk: vi.fn(), + mergeSitePolicy: vi.fn(), +})); + +function makeSkill(id: string, sideEffectClass = SideEffectClass.READ_ONLY, siteId = 'example.com'): SkillSpec { + return { + id, + version: 1, + status: 'active', + currentTier: 'tier_1', + tierLock: null, + allowedDomains: [siteId], + requiredCapabilities: [], + parameters: [], + validation: { semanticChecks: [], customInvariants: [] }, + redaction: { piiClassesFound: [], fieldsRedacted: 0 }, + replayStrategy: 'prefer_tier_1', + sideEffectClass, + sampleCount: 1, + consecutiveValidations: 1, + confidence: 1, + method: 'GET', + pathTemplate: `/${id}`, + inputSchema: {}, + isComposite: false, + siteId, + name: id, + successRate: 1, + createdAt: Date.now(), + updatedAt: Date.now(), + } as SkillSpec; +} + +function createBatchEngine( + skills: Record<string, SkillSpec>, + executeSkillImpl: (skillId: string, params: Record<string, unknown>) => Promise<unknown>, +): 
Engine { + const engine = Object.create(Engine.prototype) as Engine & { + skillRepo: { getById: (id: string) => SkillSpec | undefined }; + config: Record; + executeSkill: (skillId: string, params: Record, callerId?: string) => Promise; + executionBackendGroupIds: WeakMap; + nextExecutionBackendGroupId: number; + }; + + engine.skillRepo = { + getById: (id: string) => skills[id], + }; + engine.config = { + browser: { + execution: { + backend: 'agent-browser', + }, + }, + }; + engine.executionBackendGroupIds = new WeakMap(); + engine.nextExecutionBackendGroupId = 0; + engine.executeSkill = vi.fn((skillId: string, params: Record) => executeSkillImpl(skillId, params)) as unknown as Engine['executeSkill']; + return engine; +} + +function delay(ms: number): Promise { + return new Promise((resolve) => setTimeout(resolve, ms)); +} + +describe('Engine.executeBatch', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('resolves execution group keys from actual backend routing', () => { + const engine = createBatchEngine({}, async () => ({ success: true, latencyMs: 1 })) as Engine; + const routedBackend = {} as any; + const alternateBackend = {} as any; + const getExecutionBackend = vi.fn(); + (engine as Engine & { getExecutionBackend: typeof getExecutionBackend }).getExecutionBackend = getExecutionBackend as unknown as Engine['getExecutionBackend']; + + getExecutionBackend.mockReturnValue(routedBackend); + const first = engine.resolveExecutionGroupKey(makeSkill('skill-a')); + + (getSitePolicy as unknown as ReturnType).mockReturnValue({ + executionBackend: 'live-chrome', + executionSessionName: 'shared-session', + }); + const second = engine.resolveExecutionGroupKey(makeSkill('skill-b')); + + getExecutionBackend.mockReturnValue(alternateBackend); + const third = engine.resolveExecutionGroupKey(makeSkill('skill-c')); + + expect(getExecutionBackend).toHaveBeenCalledTimes(3); + expect(first).toBe(second); + expect(third).not.toBe(first); + }); + + it('preserves result 
order and enforces write barriers', async () => { + const skills = { + r1: makeSkill('r1'), + r2: makeSkill('r2'), + w1: makeSkill('w1', SideEffectClass.NON_IDEMPOTENT), + r3: makeSkill('r3'), + r4: makeSkill('r4'), + }; + const events: string[] = []; + const engine = createBatchEngine(skills, async (skillId) => { + events.push(`start:${skillId}`); + await delay(skillId.startsWith('r') ? 20 : 5); + events.push(`end:${skillId}`); + return { success: true, data: { skillId }, latencyMs: 1 }; + }); + + const results = await engine.executeBatch([ + { skillId: 'r1' }, + { skillId: 'r2' }, + { skillId: 'w1' }, + { skillId: 'r3' }, + { skillId: 'r4' }, + ]); + + expect(results.map((result) => result.skillId)).toEqual(['r1', 'r2', 'w1', 'r3', 'r4']); + expect(events.indexOf('end:r1')).toBeLessThan(events.indexOf('start:w1')); + expect(events.indexOf('end:r2')).toBeLessThan(events.indexOf('start:w1')); + expect(events.indexOf('end:w1')).toBeLessThan(events.indexOf('start:r3')); + }); + + it('caps read-window concurrency at 3 per execution group', async () => { + const skills = Object.fromEntries( + ['r1', 'r2', 'r3', 'r4', 'r5'].map((id) => [id, makeSkill(id)]), + ) as Record; + let active = 0; + let maxActive = 0; + + const engine = createBatchEngine(skills, async () => { + active += 1; + maxActive = Math.max(maxActive, active); + await delay(20); + active -= 1; + return { success: true, latencyMs: 1 }; + }); + + await engine.executeBatch([ + { skillId: 'r1' }, + { skillId: 'r2' }, + { skillId: 'r3' }, + { skillId: 'r4' }, + { skillId: 'r5' }, + ]); + + expect(maxActive).toBe(3); + }); + + it('uses per-site policy maxConcurrent for read-window concurrency', async () => { + (getSitePolicy as unknown as ReturnType).mockReturnValue({ + executionBackend: 'agent-browser', + executionSessionName: undefined, + maxConcurrent: 2, + }); + + const skills = Object.fromEntries( + ['r1', 'r2', 'r3', 'r4'].map((id) => [id, makeSkill(id)]), + ) as Record; + let active = 0; + let maxActive 
= 0; + + const engine = createBatchEngine(skills, async () => { + active += 1; + maxActive = Math.max(maxActive, active); + await delay(20); + active -= 1; + return { success: true, latencyMs: 1 }; + }); + + await engine.executeBatch([ + { skillId: 'r1' }, + { skillId: 'r2' }, + { skillId: 'r3' }, + { skillId: 'r4' }, + ]); + + expect(maxActive).toBe(2); + }); + + it('retries rate-limited actions once before returning the final result', async () => { + const skills = { + r1: makeSkill('r1'), + }; + const executeSkill = vi.fn() + .mockResolvedValueOnce({ + success: false, + failureCause: 'rate_limited', + failureDetail: 'Retry after 100ms', + error: 'rate limited', + latencyMs: 1, + }) + .mockResolvedValueOnce({ + success: true, + data: { ok: true }, + latencyMs: 1, + }); + const engine = createBatchEngine(skills, executeSkill); + + const results = await engine.executeBatch([{ skillId: 'r1' }]); + expect(results).toEqual([expect.objectContaining({ skillId: 'r1', success: true, data: { ok: true } })]); + expect(executeSkill).toHaveBeenCalledTimes(2); + }); + + it('preserves browser handoff metadata in batch results', async () => { + const skills = { + r1: makeSkill('r1'), + }; + const executeSkill = vi.fn().mockResolvedValue({ + success: false, + status: 'browser_handoff_required', + reason: 'cloudflare_challenge', + recoveryMode: 'real_browser_cdp', + siteId: 'example.com', + url: 'https://example.com/challenge', + hint: 'Complete challenge', + resumeToken: 'recover-token', + managedBrowser: true, + latencyMs: 1, + }); + const engine = createBatchEngine(skills, executeSkill); + + const results = await engine.executeBatch([{ skillId: 'r1' }]); + + expect(results[0]).toEqual(expect.objectContaining({ + skillId: 'r1', + success: false, + status: 'browser_handoff_required', + resumeToken: 'recover-token', + managedBrowser: true, + recoveryMode: 'real_browser_cdp', + })); + }); + + it('builds workflow step executors that request wait-for-permit pacing', async () => { + 
const engine = createBatchEngine({}, async () => ({ success: true, latencyMs: 1 })); + + const executor = (engine as any).buildWorkflowStepExecutor('caller-1'); + await executor('workflow-step', { query: 'ada' }); + + expect(engine.executeSkill).toHaveBeenCalledWith( + 'workflow-step', + { query: 'ada' }, + 'caller-1', + { + skipTransform: true, + waitForPermit: { + timeoutMs: 30_000, + }, + }, + ); + }); +}); diff --git a/tests/unit/engine-capture.test.ts b/tests/unit/engine-capture.test.ts index 4b77330..7a21e99 100644 --- a/tests/unit/engine-capture.test.ts +++ b/tests/unit/engine-capture.test.ts @@ -99,6 +99,7 @@ vi.mock('../../src/replay/tool-budget.js', () => ({ vi.mock('../../src/automation/rate-limiter.js', () => ({ RateLimiter: vi.fn().mockImplementation(() => ({ checkRate: vi.fn().mockReturnValue({ allowed: true }), + waitForPermit: vi.fn().mockResolvedValue({ allowed: true }), recordResponse: vi.fn(), setQps: vi.fn(), attachDatabase: vi.fn(), @@ -228,6 +229,14 @@ vi.mock('../../src/core/tiering.js', () => ({ tierLock: { type: 'permanent', reason: 'js_computed_field', evidence: 'test' }, reason: 'test', }), + checkPromotion: vi.fn().mockReturnValue({ promote: false, reason: 'test' }), + getEffectiveTier: vi.fn().mockImplementation((skill: any) => skill.currentTier), + sanitizeSiteRecommendedTier: vi.fn().mockImplementation((recommendedTier: string, browserRequired: boolean) => { + if (browserRequired) { + return recommendedTier === 'full_browser' ? 'full_browser' : 'browser_proxied'; + } + return recommendedTier === 'cookie_refresh' ? 
'browser_proxied' : recommendedTier; + }), })); // Mock diff-engine diff --git a/tests/unit/engine.test.ts b/tests/unit/engine.test.ts index e1929df..c784b0e 100644 --- a/tests/unit/engine.test.ts +++ b/tests/unit/engine.test.ts @@ -54,6 +54,7 @@ vi.mock('../../src/storage/skill-repository.js', () => ({ delete: vi.fn(), updateConfidence: vi.fn(), updateTier: vi.fn(), + incrementValidationsSinceLastCanary: vi.fn(), })), })); @@ -102,6 +103,7 @@ vi.mock('../../src/replay/tool-budget.js', () => ({ vi.mock('../../src/automation/rate-limiter.js', () => ({ RateLimiter: vi.fn().mockImplementation(() => ({ checkRate: vi.fn().mockReturnValue({ allowed: true }), + waitForPermit: vi.fn().mockResolvedValue({ allowed: true }), recordResponse: vi.fn(), setQps: vi.fn(), attachDatabase: vi.fn(), @@ -148,6 +150,9 @@ vi.mock('../../src/browser/base-browser-adapter.js', () => ({ })); const mockCleanupManagedChromeLaunches = vi.fn().mockResolvedValue(undefined); +const mockCleanupManagedChromeLaunchesSync = vi.fn(); +const mockCleanupOwnedBrowserLaunches = vi.fn().mockResolvedValue(undefined); +const mockCleanupOwnedBrowserLaunchesSync = vi.fn(); const mockLaunchManagedChrome = vi.fn(); const mockRemoveManagedChromeMetadata = vi.fn(); const mockTerminateManagedChrome = vi.fn().mockResolvedValue(undefined); @@ -156,6 +161,9 @@ const mockWriteManagedChromeMetadata = vi.fn(); const mockListManagedChromeMetadata = vi.fn().mockReturnValue([]); vi.mock('../../src/browser/real-browser-handoff.js', () => ({ cleanupManagedChromeLaunches: (...args: unknown[]) => mockCleanupManagedChromeLaunches(...args), + cleanupManagedChromeLaunchesSync: (...args: unknown[]) => mockCleanupManagedChromeLaunchesSync(...args), + cleanupOwnedBrowserLaunches: (...args: unknown[]) => mockCleanupOwnedBrowserLaunches(...args), + cleanupOwnedBrowserLaunchesSync: (...args: unknown[]) => mockCleanupOwnedBrowserLaunchesSync(...args), launchManagedChrome: (...args: unknown[]) => mockLaunchManagedChrome(...args), 
removeManagedChromeMetadata: (...args: unknown[]) => mockRemoveManagedChromeMetadata(...args), terminateManagedChrome: (...args: unknown[]) => mockTerminateManagedChrome(...args), @@ -214,6 +222,10 @@ vi.mock('../../src/replay/executor.js', () => ({ vi.mock('../../src/replay/retry.js', () => ({ retryWithEscalation: vi.fn(), })); +const mockExecuteWorkflow = vi.fn(); +vi.mock('../../src/replay/workflow-executor.js', () => ({ + executeWorkflow: (...args: unknown[]) => mockExecuteWorkflow(...args), +})); vi.mock('../../src/automation/cookie-refresh.js', () => ({ refreshCookies: vi.fn().mockResolvedValue(undefined), })); @@ -277,6 +289,12 @@ vi.mock('../../src/core/tiering.js', () => ({ }), checkPromotion: vi.fn().mockReturnValue({ promote: false, reason: 'test' }), getEffectiveTier: vi.fn().mockImplementation((skill: any) => skill.currentTier), + sanitizeSiteRecommendedTier: vi.fn().mockImplementation((recommendedTier: string, browserRequired: boolean) => { + if (browserRequired) { + return recommendedTier === 'full_browser' ? 'full_browser' : 'browser_proxied'; + } + return recommendedTier === 'cookie_refresh' ? 
'browser_proxied' : recommendedTier; + }), })); // Mock diff-engine @@ -344,7 +362,7 @@ vi.mock('node:fs', async () => { import { Engine, buildEnforcementSchema } from '../../src/core/engine.js'; import { ContextOverrideMismatchError } from '../../src/browser/manager.js'; import { PlaywrightMcpAdapter } from '../../src/browser/playwright-mcp-adapter.js'; -import { checkMethodAllowed, checkPathRisk, mergeSitePolicy } from '../../src/core/policy.js'; +import { checkMethodAllowed, checkPathRisk, getSitePolicy, mergeSitePolicy } from '../../src/core/policy.js'; import { SkillRepository } from '../../src/storage/skill-repository.js'; import { MetricsRepository } from '../../src/storage/metrics-repository.js'; import { retryWithEscalation } from '../../src/replay/retry.js'; @@ -365,6 +383,52 @@ describe('Engine', () => { beforeEach(() => { vi.clearAllMocks(); + (checkMethodAllowed as ReturnType).mockReturnValue(true); + (checkPathRisk as ReturnType).mockReturnValue({ blocked: false }); + (getSitePolicy as ReturnType).mockReturnValue({ + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: false, + }); + (retryWithEscalation as ReturnType).mockReset(); + (retryWithEscalation as ReturnType).mockResolvedValue({ + success: true, + tier: 'direct', + status: 200, + data: { id: 1 }, + rawBody: '{"id":1}', + headers: { 'content-type': 'application/json' }, + latencyMs: 10, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + }); + mockExecuteWorkflow.mockReset(); + mockExecuteWorkflow.mockResolvedValue({ + success: true, + data: { done: true }, + stepResults: [], + totalLatencyMs: 1, + }); + (checkPromotion as ReturnType).mockReturnValue({ promote: false, reason: 'test' }); + (handleFailure as ReturnType).mockReturnValue({ + newTier: 'tier_3', + tierLock: { type: 'permanent', reason: 'js_computed_field', evidence: 'test' }, + reason: 'test', + }); + (detectDrift as ReturnType).mockReturnValue({ drifted: false, breaking: false, changes: [] }); + 
(monitorSkills as ReturnType).mockReturnValue([ + { skillId: 'test', status: 'healthy', successRate: 1.0, trend: 0, windowSize: 0 }, + ]); + (notify as ReturnType).mockResolvedValue(undefined); + (createEvent as ReturnType).mockReturnValue({ type: 'test', skillId: 'test', siteId: 'test', details: {}, timestamp: Date.now() }); + (inferSchema as ReturnType).mockReturnValue({ + type: 'object', + properties: { id: { type: 'integer' } }, + required: ['id'], + }); + (mergeSchemas as ReturnType).mockImplementation((a: any, b: any) => ({ ...a, ...b })); + mockSiteRepoInstance.getById.mockReturnValue(undefined); const defaultPage = { on: vi.fn(), off: vi.fn(), @@ -396,6 +460,8 @@ describe('Engine', () => { mockDetectAndWaitForChallenge.mockResolvedValue(false); mockIsCloudflareChallengePage.mockResolvedValue(false); mockListManagedChromeMetadata.mockReturnValue([]); + mockCleanupManagedChromeLaunchesSync.mockReset(); + mockCleanupOwnedBrowserLaunchesSync.mockReset(); engine = new Engine(makeConfig()); }); @@ -710,6 +776,52 @@ describe('Engine', () => { expect(engine.getStatus().mode).toBe('recording'); expect(engine.getStatus().currentRecording?.name).toBe('recording-in-progress'); }); + + it('cleans up launched Chrome when recovery CDP attach fails after launch', async () => { + const recovery = (engine as any).upsertPendingRecovery( + 'example.com', + 'https://example.com/cdn-cgi/challenge-platform', + ); + const msm = engine.getMultiSessionManager(); + vi.spyOn(msm, 'connectCDP') + .mockRejectedValueOnce(new Error('auto discover failed')) + .mockRejectedValueOnce(new Error('launch attach failed')); + mockLaunchManagedChrome.mockResolvedValueOnce({ + pid: 4242, + profileDir: recovery.managedProfileDir, + wsEndpoint: 'ws://127.0.0.1:9222/devtools/browser/test', + browserBinary: '/usr/bin/google-chrome', + }); + + await expect((engine as any).connectRecoverySession(recovery)).rejects.toThrow('launch attach failed'); + + 
expect(mockTerminateManagedChrome).toHaveBeenCalledWith(4242); + expect(mockRemoveManagedChromeMetadata).toHaveBeenCalledWith(recovery.managedProfileDir); + expect(recovery.managedPid).toBeUndefined(); + expect(recovery.managedBrowser).toBe(false); + }); + + it('keeps explore and execute recoveries separate for the same site', () => { + const exploreRecovery = (engine as any).upsertPendingRecovery( + 'example.com', + 'https://example.com/cdn-cgi/challenge-platform', + undefined, + 'explore', + ); + const executeRecovery = (engine as any).upsertPendingRecovery( + 'example.com', + 'https://api.example.com/blocked', + undefined, + 'execute', + ); + + expect(executeRecovery.resumeToken).not.toBe(exploreRecovery.resumeToken); + expect(executeRecovery.cdpSessionName).not.toBe(exploreRecovery.cdpSessionName); + expect(exploreRecovery.url).toBe('https://example.com/cdn-cgi/challenge-platform'); + expect(executeRecovery.url).toBe('https://api.example.com/blocked'); + expect((engine as any).getRecoveryBySiteId('example.com', 'explore')?.resumeToken).toBe(exploreRecovery.resumeToken); + expect((engine as any).getRecoveryBySiteId('example.com', 'execute')?.resumeToken).toBe(executeRecovery.resumeToken); + }); }); // ─── State Machine: recording -> exploring (stopRecording) ───── @@ -858,6 +970,132 @@ describe('Engine', () => { expect(retryWithEscalation).toHaveBeenCalled(); }); + it('returns transformed data when a skill has an outputTransform', async () => { + const mockSkill = { + id: 'example.com.get_price.v1', + siteId: 'example.com', + name: 'get_price', + method: 'GET', + pathTemplate: '/api/price', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_1', + tierLock: null, + authType: undefined, + outputTransform: { + type: 'jsonpath', + expression: '$.stats.current', + label: 'current_price', + }, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + 
repoInstance.getById.mockReturnValueOnce(mockSkill); + } + (checkMethodAllowed as ReturnType).mockReturnValueOnce(true); + (checkPathRisk as ReturnType).mockReturnValueOnce({ blocked: false }); + (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: true, + tier: 'direct', + status: 200, + data: { stats: { current: 123.45 } }, + rawBody: '{"stats":{"current":123.45}}', + headers: { 'content-type': 'application/json' }, + latencyMs: 42, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + }); + + const result = await engine.executeSkill('example.com.get_price.v1', {}); + expect(result.success).toBe(true); + expect(result.data).toBe(123.45); + expect(result.transformApplied).toBe(true); + expect(result.transformLabel).toBe('current_price'); + }); + + it('threads callerId and skipTransform through workflow step execution', async () => { + const outerWorkflowSkill = { + id: 'example.com.outer_workflow.v1', + siteId: 'example.com', + name: 'outer_workflow', + method: 'GET', + pathTemplate: '/__workflow/outer-workflow', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_1', + tierLock: null, + authType: undefined, + workflowSpec: { + steps: [{ skillId: 'example.com.inner_step.v1' }], + }, + outputTransform: { + type: 'jsonpath', + expression: '$.payload.value', + }, + confidence: 0.5, + consecutiveValidations: 2, + }; + const innerStepSkill = { + id: 'example.com.inner_step.v1', + siteId: 'example.com', + name: 'inner_step', + method: 'GET', + pathTemplate: '/api/inner', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_1', + tierLock: null, + authType: undefined, + outputTransform: { + type: 'jsonpath', + expression: '$.payload.value', + }, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById + .mockReturnValueOnce(outerWorkflowSkill) + .mockReturnValueOnce(innerStepSkill); + } + 
+ (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: true, + tier: 'direct', + status: 200, + data: { payload: { value: 42 } }, + rawBody: '{"payload":{"value":42}}', + headers: { 'content-type': 'application/json' }, + latencyMs: 10, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + }); + + let innerResult: unknown; + mockExecuteWorkflow.mockImplementationOnce(async (_spec, _params, executeStep) => { + innerResult = await executeStep('example.com.inner_step.v1', {}); + return { + success: true, + data: { payload: { value: 42 } }, + stepResults: [], + totalLatencyMs: 12, + }; + }); + + const executeSkillSpy = vi.spyOn(engine, 'executeSkill'); + const result = await engine.executeSkill('example.com.outer_workflow.v1', {}, 'caller-123'); + + expect(innerResult).toMatchObject({ data: { payload: { value: 42 } } }); + // First call is the outer workflow; second is the inner step from the workflow executor + const innerCall = executeSkillSpy.mock.calls.find(c => c[0] === 'example.com.inner_step.v1'); + expect(innerCall).toBeDefined(); + expect(innerCall![2]).toBe('caller-123'); + expect(innerCall![3]).toMatchObject({ skipTransform: true }); + expect(result.success).toBe(true); + expect(result.data).toBe(42); + }); + it('promotes tier_3 skill after direct success when site recommends direct', async () => { const mockSkill = { id: 'example.com.promo_test.v1', @@ -1002,6 +1240,376 @@ describe('Engine', () => { })); }); + it('masks direct-first startup when policy.browserRequired is true', async () => { + const mockSkill = { + id: 'example.com.masked_direct.v1', + siteId: 'example.com', + name: 'masked_direct', + method: 'GET', + pathTemplate: '/api/masked', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } 
+ mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'direct', + }); + (getSitePolicy as ReturnType).mockReturnValueOnce({ + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: true, + }); + + await engine.executeSkill('example.com.masked_direct.v1', {}); + + expect(retryWithEscalation).toHaveBeenCalledWith( + expect.objectContaining({ id: 'example.com.masked_direct.v1' }), + {}, + expect.objectContaining({ + siteRecommendedTier: 'browser_proxied', + directAllowed: false, + }), + ); + }); + + it('persists browser_required lock and site gate when a direct challenge appears before browser fallback success', async () => { + const mockSkill = { + id: 'example.com.cf_guard.v1', + siteId: 'example.com', + name: 'cf_guard', + method: 'GET', + pathTemplate: '/api/guard', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + authType: undefined, + confidence: 0.4, + consecutiveValidations: 1, + directCanaryEligible: true, + directCanaryAttempts: 1, + validationsSinceLastCanary: 3, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'direct', + }); + (handleFailure as ReturnType).mockReturnValueOnce({ + newTier: 'tier_3', + tierLock: { type: 'permanent', reason: 'browser_required', evidence: 'cloudflare challenge' }, + reason: 'browser required', + }); + (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: true, + tier: 'browser_proxied', + status: 200, + data: { ok: true }, + rawBody: '{"ok":true}', + headers: { 'content-type': 'application/json' }, + latencyMs: 40, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + stepResults: [ + { tier: 'direct', success: false, status: 403, latencyMs: 10, 
failureCause: 'cloudflare_challenge' }, + { tier: 'browser_proxied', success: true, status: 200, latencyMs: 30 }, + ], + }); + + const result = await engine.executeSkill('example.com.cf_guard.v1', {}); + + expect(result.success).toBe(true); + expect(handleFailure).toHaveBeenCalledWith( + expect.objectContaining({ id: 'example.com.cf_guard.v1' }), + 'cloudflare_challenge', + ); + expect(repoInstance?.updateTier).toHaveBeenCalledWith( + 'example.com.cf_guard.v1', + 'tier_3', + expect.objectContaining({ reason: 'browser_required' }), + ); + expect(repoInstance?.update).toHaveBeenCalledWith( + 'example.com.cf_guard.v1', + expect.objectContaining({ + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + }), + ); + expect(mergeSitePolicy).toHaveBeenCalledWith( + 'example.com', + { browserRequired: true }, + expect.anything(), + ); + expect(mockSiteRepoInstance.update).toHaveBeenCalledWith( + 'example.com', + expect.objectContaining({ recommendedTier: 'browser_proxied' }), + ); + }); + + it('does not persist a sticky browser_required lock for browser-tier-only challenge evidence', async () => { + const mockSkill = { + id: 'example.com.browser_only_cf.v1', + siteId: 'example.com', + name: 'browser_only_cf', + method: 'GET', + pathTemplate: '/api/browser-only', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + authType: undefined, + confidence: 0.4, + consecutiveValidations: 1, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 1, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'browser_proxied', + }); + (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: true, + tier: 'full_browser', + status: 200, + 
data: { ok: true }, + rawBody: '{"ok":true}', + headers: { 'content-type': 'application/json' }, + latencyMs: 55, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + stepResults: [ + { tier: 'browser_proxied', success: false, status: 403, latencyMs: 20, failureCause: 'cloudflare_challenge' }, + { tier: 'full_browser', success: true, status: 200, latencyMs: 35 }, + ], + }); + + const result = await engine.executeSkill('example.com.browser_only_cf.v1', {}); + + expect(result.success).toBe(true); + expect(handleFailure).not.toHaveBeenCalled(); + expect(repoInstance?.updateTier).not.toHaveBeenCalled(); + expect(repoInstance?.incrementValidationsSinceLastCanary).not.toHaveBeenCalled(); + expect(mergeSitePolicy).not.toHaveBeenCalled(); + }); + + it('suppresses forceDirectTier probes when site policy marks the site as browser-required', async () => { + const mockSkill = { + id: 'example.com.direct_probe.v1', + siteId: 'example.com', + name: 'direct_probe', + method: 'GET', + pathTemplate: '/api/probe', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'direct', + }); + (getSitePolicy as ReturnType).mockReturnValueOnce({ + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: true, + }); + + const result = await engine.executeSkill( + 'example.com.direct_probe.v1', + {}, + '__auto_validation__', + { forceDirectTier: true }, + ); + + expect(result.success).toBe(false); + expect(result.failureCause).toBe('policy_denied'); + expect(result.probeSuppressed).toBe(true); + expect(result.failureDetail).toContain('Direct probe suppressed'); + expect(retryWithEscalation).not.toHaveBeenCalled(); + }); + + it('bootstraps a 
live-chrome execution session before replay for browser_required skills', async () => { + const mockSkill = { + id: 'example.com.browser_required.v1', + siteId: 'example.com', + name: 'browser_required', + method: 'GET', + pathTemplate: '/api/price', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: { type: 'permanent', reason: 'browser_required', evidence: 'cloudflare challenge' }, + authType: undefined, + confidence: 0.4, + consecutiveValidations: 1, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + }; + const repoInstance = (SkillRepository as unknown as ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'browser_proxied', + }); + + let policyState: Record = { + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: true, + }; + (getSitePolicy as ReturnType).mockImplementation(() => policyState); + + const provider = { + evaluateFetch: vi.fn(), + getCurrentUrl: vi.fn().mockReturnValue('https://example.com/api/price'), + }; + const backendCreateProvider = vi.fn().mockResolvedValue(provider); + const getExecutionBackendSpy = vi.spyOn(engine, 'getExecutionBackend').mockReturnValue({ + createProvider: backendCreateProvider, + } as any); + const connectRecoverySessionSpy = vi.spyOn(engine as any, 'connectRecoverySession').mockResolvedValue({ + sessionName: '__recovery_exec', + managedBrowser: true, + }); + const bindRecoveryPolicySpy = vi.spyOn(engine as any, 'bindRecoveryPolicy').mockImplementation(async () => { + policyState = { + ...policyState, + domainAllowlist: ['example.com', '127.0.0.1', 'localhost'], + executionBackend: 'live-chrome', + executionSessionName: '__recovery_exec', + }; + }); + const alignRecoveryPageSpy = vi.spyOn(engine as any, 
'alignRecoveryPage').mockResolvedValue(undefined); + + (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: true, + tier: 'browser_proxied', + status: 200, + data: { ok: true }, + rawBody: '{"ok":true}', + headers: { 'content-type': 'application/json' }, + latencyMs: 40, + schemaMatch: true, + semanticPass: true, + retryDecisions: [], + stepResults: [ + { tier: 'browser_proxied', success: true, status: 200, latencyMs: 40 }, + ], + }); + + const result = await engine.executeSkill('example.com.browser_required.v1', {}); + + expect(result.success).toBe(true); + expect(connectRecoverySessionSpy).toHaveBeenCalledTimes(1); + expect(bindRecoveryPolicySpy).toHaveBeenCalledTimes(1); + expect(alignRecoveryPageSpy).toHaveBeenCalledTimes(1); + expect(connectRecoverySessionSpy.mock.invocationCallOrder[0]).toBeLessThan(getExecutionBackendSpy.mock.invocationCallOrder[0]); + expect(bindRecoveryPolicySpy.mock.invocationCallOrder[0]).toBeLessThan(getExecutionBackendSpy.mock.invocationCallOrder[0]); + expect(backendCreateProvider).toHaveBeenCalledWith('example.com', ['example.com', '127.0.0.1', 'localhost']); + expect(retryWithEscalation).toHaveBeenCalledWith( + expect.objectContaining({ id: 'example.com.browser_required.v1' }), + {}, + expect.objectContaining({ + browserProvider: provider, + directAllowed: false, + siteRecommendedTier: 'browser_proxied', + }), + ); + }); + + it('returns browser_handoff_required when browser execution fails behind a detected challenge', async () => { + const mockSkill = { + id: 'example.com.execute_recovery.v1', + siteId: 'example.com', + name: 'execute_recovery', + method: 'GET', + pathTemplate: '/api/recovery', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + confidence: 0.4, + consecutiveValidations: 1, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + }; + const repoInstance = (SkillRepository as unknown as 
ReturnType).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 'browser_proxied', + }); + + const provider = { + getCurrentUrl: vi.fn().mockReturnValue('https://example.com/cdn-cgi/challenge-platform'), + detectChallengePage: vi.fn().mockResolvedValue(true), + }; + vi.spyOn(engine, 'getExecutionBackend').mockReturnValue({ + createProvider: vi.fn().mockResolvedValue(provider), + } as any); + vi.spyOn(engine as any, 'connectRecoverySession').mockResolvedValue({ + sessionName: '__recovery_exec', + managedBrowser: true, + }); + vi.spyOn(engine as any, 'bindRecoveryPolicy').mockResolvedValue(undefined); + vi.spyOn(engine as any, 'alignRecoveryPage').mockResolvedValue(undefined); + + (retryWithEscalation as ReturnType).mockResolvedValueOnce({ + success: false, + tier: 'full_browser', + status: 0, + data: undefined, + rawBody: '', + headers: {}, + latencyMs: 50, + schemaMatch: false, + semanticPass: false, + failureCause: 'fetch_error', + failureDetail: 'Full browser execution found no matching request', + retryDecisions: [], + stepResults: [ + { tier: 'full_browser', success: false, status: 0, latencyMs: 50, failureCause: 'fetch_error' }, + ], + }); + + const result = await engine.executeSkill('example.com.execute_recovery.v1', {}); + + expect(result.status).toBe('browser_handoff_required'); + if (result.status === 'browser_handoff_required') { + expect(result.session).toBe('__recovery_exec'); + expect(result.managedBrowser).toBe(true); + expect(result.resumeToken).toBeTruthy(); + } + expect(engine.getStatus().pendingRecovery?.siteId).toBe('example.com'); + }); + it('executeSkill with skipMetrics skips metricsRepo.record', async () => { const mockSkill = { id: 'example.com.get_data.v1', @@ -1062,6 +1670,235 @@ describe('Engine', () => { await engine.executeSkill('example.com.get_data.v1', {}); 
expect(metricsInstance?.record).toHaveBeenCalled(); + }, 60000); + }); + + describe('auto-validation', () => { + it('skips browser-required sites without executing a direct probe', async () => { + const skill = { + id: 'example.com.auto_skip.v1', + siteId: 'example.com', + name: 'auto_skip', + method: 'GET', + pathTemplate: '/api/auto-skip', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + sampleParams: {}, + parameters: [], + }; + const repoInstance = (SkillRepository as unknown as ReturnType<typeof vi.fn>).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getByStatus.mockReturnValueOnce([skill]); + } + (getSitePolicy as ReturnType<typeof vi.fn>).mockReturnValueOnce({ + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: true, + }); + const executeSpy = vi.spyOn(engine, 'executeSkill'); + + await (engine as any).runAutoValidationCycle(); + + expect(executeSpy).not.toHaveBeenCalled(); + expect(engine.getStatus().autoValidation.skippedBrowserRequired).toBe(1); + expect(engine.getStatus().autoValidation.validated).toBe(0); + }); + }); + + describe('session sweep', () => { + it('sweeps idle recovery sessions without touching lastUsedAt via get()', async () => { + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_idle'); + session.isCdp = true; + session.sessionKind = 'recovery_explore_cdp'; + session.lastUsedAt = Date.now() - (10 * 60 * 1000); + + const getSpy = vi.spyOn(msm, 'get'); + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + await (engine as any).sweepIdleNamedSessions(); + + expect(getSpy).not.toHaveBeenCalled(); + expect(closeSpy).toHaveBeenCalledWith('__recovery_idle', { force: true, engineMode: 'idle' }); + }); + + it('does not sweep the active recovery explore session', async () => { + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_active'); + session.isCdp = true; + 
session.sessionKind = 'recovery_explore_cdp'; + session.lastUsedAt = Date.now() - (10 * 60 * 1000); + (engine as any).exploreSessionName = '__recovery_active'; + + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + await (engine as any).sweepIdleNamedSessions(); + + expect(closeSpy).not.toHaveBeenCalled(); + }); + + it('does not sweep recovery sessions while recording', async () => { + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_recording'); + session.isCdp = true; + session.sessionKind = 'recovery_explore_cdp'; + session.lastUsedAt = Date.now() - (10 * 60 * 1000); + (engine as any).mode = 'recording'; + + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + await (engine as any).sweepIdleNamedSessions(); + + expect(closeSpy).not.toHaveBeenCalled(); + }); + + it('uses the 20-minute explore recovery fallback when browser idle timeout is disabled', async () => { + const zeroIdleEngine = new Engine({ + ...makeConfig(), + browser: { idleTimeoutMs: 0 }, + } as any); + const msm = zeroIdleEngine.getMultiSessionManager(); + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + try { + const younger = msm.getOrCreate('__recovery_young'); + younger.isCdp = true; + younger.sessionKind = 'recovery_explore_cdp'; + younger.lastUsedAt = Date.now() - (10 * 60 * 1000); + + await (zeroIdleEngine as any).sweepIdleNamedSessions(); + expect(closeSpy).not.toHaveBeenCalled(); + + const older = msm.getOrCreate('__recovery_old'); + older.isCdp = true; + older.sessionKind = 'recovery_explore_cdp'; + older.lastUsedAt = Date.now() - (21 * 60 * 1000); + + await (zeroIdleEngine as any).sweepIdleNamedSessions(); + expect(closeSpy).toHaveBeenCalledWith('__recovery_old', { force: true, engineMode: 'idle' }); + expect((zeroIdleEngine as any).getSessionSweepIntervalMs()).toBe(30_000); + } finally { + await zeroIdleEngine.close(); + } + }); + + it('sweeps execute recovery sessions on a 
short lease and hides them from session listings', async () => { + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_execute'); + session.isCdp = true; + session.sessionKind = 'recovery_execute_cdp'; + session.lastUsedAt = Date.now() - 70_000; + + expect(msm.list(undefined, undefined, { includeInternal: false }).map((entry) => entry.name)).not.toContain('__recovery_execute'); + + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + await (engine as any).sweepIdleNamedSessions(); + + expect(closeSpy).toHaveBeenCalledWith('__recovery_execute', { force: true, engineMode: 'idle' }); + }); + + it('keeps execute recovery sessions alive briefly for warm reuse', async () => { + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_execute_warm'); + session.isCdp = true; + session.sessionKind = 'recovery_execute_cdp'; + session.lastUsedAt = Date.now() - 10_000; + + const closeSpy = vi.spyOn(msm, 'close').mockResolvedValue(undefined); + + await (engine as any).sweepIdleNamedSessions(); + + expect(closeSpy).not.toHaveBeenCalled(); + expect((engine as any).getSessionSweepIntervalMs()).toBe(30_000); + }); + + it('promotes execute recovery sessions to visible explore recovery on handoff', async () => { + const mockSkill = { + id: 'example.com.execute_handoff.v1', + siteId: 'example.com', + name: 'execute_handoff', + method: 'GET', + pathTemplate: '/api/recovery', + sideEffectClass: 'read-only', + allowedDomains: ['example.com'], + currentTier: 'tier_3', + tierLock: null, + confidence: 0.4, + consecutiveValidations: 1, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + }; + const repoInstance = (SkillRepository as unknown as ReturnType<typeof vi.fn>).mock.results[0]?.value; + if (repoInstance) { + repoInstance.getById.mockReturnValueOnce(mockSkill); + } + mockSiteRepoInstance.getById.mockReturnValueOnce({ + siteId: 'example.com', + recommendedTier: 
'browser_proxied', + }); + + let policyState: Record<string, unknown> = { + domainAllowlist: ['example.com'], + capabilities: [], + browserRequired: true, + executionBackend: 'live-chrome', + executionSessionName: '__recovery_exec', + }; + (getSitePolicy as ReturnType<typeof vi.fn>).mockImplementation(() => policyState); + + const msm = engine.getMultiSessionManager(); + const session = msm.getOrCreate('__recovery_exec'); + session.isCdp = true; + session.siteId = 'example.com'; + session.sessionKind = 'recovery_execute_cdp'; + session.managedPid = 4242; + session.managedProfileDir = '/tmp/recovery'; + (session.browserManager as any).getBrowser = vi.fn().mockReturnValue({ isConnected: () => true }); + + const provider = { + getCurrentUrl: vi.fn().mockReturnValue('https://example.com/cdn-cgi/challenge-platform'), + detectChallengePage: vi.fn().mockResolvedValue(true), + }; + vi.spyOn(engine, 'getExecutionBackend').mockReturnValue({ + createProvider: vi.fn().mockResolvedValue(provider), + } as any); + + (retryWithEscalation as ReturnType<typeof vi.fn>).mockResolvedValueOnce({ + success: false, + tier: 'browser_proxied', + status: 0, + data: undefined, + rawBody: '', + headers: {}, + latencyMs: 50, + schemaMatch: false, + semanticPass: false, + failureCause: 'fetch_error', + failureDetail: 'blocked by challenge', + retryDecisions: [], + stepResults: [ + { tier: 'browser_proxied', success: false, status: 0, latencyMs: 50, failureCause: 'fetch_error' }, + ], + }); + + const result = await engine.executeSkill('example.com.execute_handoff.v1', {}); + + expect(result.status).toBe('browser_handoff_required'); + expect(msm.peek('__recovery_exec')?.sessionKind).toBe('recovery_explore_cdp'); + expect(msm.list(undefined, undefined, { includeInternal: false }).map((entry) => entry.name)).toContain('__recovery_exec'); + }); + }); + + describe('exit cleanup', () => { + it('uses sync browser cleanup helpers from the exit handler', () => { + (engine as any).exitCleanupHandler(); + + 
expect(mockCleanupManagedChromeLaunchesSync).toHaveBeenCalledWith(expect.anything()); + expect(mockCleanupOwnedBrowserLaunchesSync).toHaveBeenCalledWith(expect.anything()); }); }); @@ -1245,7 +2082,7 @@ describe('Engine', () => { ); await engine.executeSkill('example.com.get_data.v1', {}); expect(inferSchema).toHaveBeenCalledWith([{ id: 1, name: 'test' }]); - }); + }, 60000); it('accumulates schema when effectiveValidations < 3', async () => { setupSkillExecution( @@ -1435,7 +2272,7 @@ describe('Engine', () => { await engine.executeSkill('example.com.get_data.v1', {}); expect(createEvent).toHaveBeenCalledWith('skill_demoted', 'example.com.get_data.v1', 'example.com', expect.objectContaining({ reason: 'schema_drift', changes: 1 })); - }); + }, 60000); it('sends skill_degraded notification with successRate and trend', async () => { (monitorSkills as ReturnType<typeof vi.fn>).mockReturnValueOnce([ diff --git a/tests/unit/executor.test.ts b/tests/unit/executor.test.ts index ecdb4e5..8a4df84 100644 --- a/tests/unit/executor.test.ts +++ b/tests/unit/executor.test.ts @@ -110,6 +110,31 @@ describe('executor', () => { }); }); + describe('tier routing', () => { + it('starts browser_required-locked skills at browser_proxied', async () => { + const skill = makeSkill({ + currentTier: TierState.TIER_3_DEFAULT, + tierLock: { type: 'permanent', reason: 'browser_required', evidence: 'challenge detected' }, + }); + const browserProvider = { + navigate: vi.fn().mockResolvedValue(undefined), + networkRequests: vi.fn().mockResolvedValue([]), + evaluateFetch: vi.fn().mockResolvedValue({ + status: 200, + headers: { 'content-type': 'application/json' }, + body: '{"data":"ok"}', + }), + } as any; + + const result = await executeSkill(skill, {}, { browserProvider }); + + expect(result.success).toBe(true); + expect(result.tier).toBe(ExecutionTier.BROWSER_PROXIED); + expect(browserProvider.evaluateFetch).toHaveBeenCalledTimes(1); + expect(browserProvider.navigate).not.toHaveBeenCalled(); + }); + }); + 
describe('failure classification', () => { it('classifies 429 as rate_limited', async () => { const skill = makeSkill(); @@ -157,6 +182,70 @@ describe('executor', () => { expect(result.failureCause).toBe(FailureCause.UNKNOWN); }); + it('classifies cloudflare_challenge from cf-mitigated header before auth handling', async () => { + const skill = makeSkill({ authType: 'bearer' }); + const result = await executeSkill(skill, {}, { + fetchFn: mockFetch({ + status: 403, + headers: { + 'cf-mitigated': 'challenge', + 'content-type': 'text/html', + }, + body: 'Verifying you are human', + }), + }); + expect(result.success).toBe(false); + expect(result.failureCause).toBe(FailureCause.CLOUDFLARE_CHALLENGE); + }); + + it('classifies cloudflare_challenge from challenge body before 5xx handling', async () => { + const skill = makeSkill(); + const result = await executeSkill(skill, {}, { + fetchFn: mockFetch({ + status: 503, + headers: { + 'server': 'cloudflare', + 'content-type': 'text/html', + }, + body: '<title>Just a moment</title>Checking your browser', + }), + }); + expect(result.success).toBe(false); + expect(result.failureCause).toBe(FailureCause.CLOUDFLARE_CHALLENGE); + }); + + it('does not classify server-cloudflare alone as cloudflare_challenge', async () => { + const skill = makeSkill(); + const result = await executeSkill(skill, {}, { + fetchFn: mockFetch({ + status: 403, + headers: { + 'server': 'cloudflare', + 'cf-ray': 'abc123', + 'content-type': 'text/plain', + }, + body: 'Forbidden', + }), + }); + expect(result.success).toBe(false); + expect(result.failureCause).toBe(FailureCause.COOKIE_REFRESH); + }); + + it('does not classify generic interstitial text without Cloudflare corroboration', async () => { + const skill = makeSkill(); + const result = await executeSkill(skill, {}, { + fetchFn: mockFetch({ + status: 403, + headers: { + 'content-type': 'text/html', + }, + body: '<title>Just a moment</title>Checking your browser', + }), + }); + expect(result.success).toBe(false); + 
expect(result.failureCause).toBe(FailureCause.COOKIE_REFRESH); + }); + it('classifies js_computed_field with permanent lock', async () => { const skill = makeSkill({ currentTier: 'tier_3', diff --git a/tests/unit/export-codegen.test.ts b/tests/unit/export-codegen.test.ts new file mode 100644 index 0000000..e3ffe82 --- /dev/null +++ b/tests/unit/export-codegen.test.ts @@ -0,0 +1,116 @@ +import { describe, expect, it } from 'vitest'; +import { generateExport, generateSkillTemplates } from '../../src/skill/generator.js'; +import type { SkillSpec } from '../../src/skill/types.js'; + +function makeSkill(overrides: Partial<SkillSpec> = {}): SkillSpec { + return { + id: 'example.com.get_user.v1', + version: 1, + status: 'active', + currentTier: 'tier_1', + tierLock: null, + allowedDomains: ['example.com'], + requiredCapabilities: [], + parameters: [], + validation: { semanticChecks: [], customInvariants: [] }, + redaction: { piiClassesFound: [], fieldsRedacted: 0 }, + replayStrategy: 'prefer_tier_1', + sideEffectClass: 'read-only', + sampleCount: 1, + consecutiveValidations: 1, + confidence: 1, + method: 'GET', + pathTemplate: '/users/{id}', + inputSchema: { + type: 'object', + properties: { + id: { type: 'string' }, + q: { type: 'string' }, + }, + }, + isComposite: false, + siteId: 'example.com', + name: 'get_user', + description: 'Fetch a user', + successRate: 1, + createdAt: Date.now(), + updatedAt: Date.now(), + ...overrides, + } as SkillSpec; +} + +describe('export codegen', () => { + it('generates curl exports with resolved params and transform comments', () => { + const code = generateExport(makeSkill({ + outputTransform: { type: 'jsonpath', expression: '$.user.id', label: 'user_id' }, + }), 'curl', { id: '123', q: 'neo' }); + + expect(code).toContain('# Transform: jsonpath $.user.id -> user_id'); + expect(code).toContain("curl -X GET"); + expect(code).toContain("https://example.com/users/123?q=neo"); + }); + + it('generates fetch.ts exports', () => { + const code = 
generateExport(makeSkill(), 'fetch.ts', { id: '123' }); + expect(code).toContain("const response = await fetch("); + expect(code).toContain('"https://example.com/users/123"'); + expect(code).toContain('method: "GET"'); + }); + + it('generates requests.py exports', () => { + const code = generateExport(makeSkill({ method: 'POST' }), 'requests.py', { id: '123', q: 'neo' }); + expect(code).toContain('import requests'); + expect(code).toContain('requests.request("POST"'); + expect(code).toContain('data = "{\\"q\\":\\"neo\\"}"'); + }); + + it('always replaces captured auth headers with placeholders in exports', () => { + const code = generateExport(makeSkill({ + authType: 'bearer', + requiredHeaders: { + authorization: 'Bearer real-secret-token', + cookie: 'session=real-cookie', + }, + }), 'curl', { id: '123' }); + + expect(code).toContain("'Authorization: Bearer YOUR_TOKEN'"); + expect(code).toContain("'Cookie: SESSION=YOUR_COOKIE'"); + expect(code).not.toContain('real-secret-token'); + expect(code).not.toContain('real-cookie'); + }); + + it('preserves string literals when generating python exports', () => { + const code = generateExport(makeSkill({ + method: 'POST', + requiredHeaders: { + accept: 'true', + }, + }), 'requests.py', { id: '123', q: 'neo' }); + + expect(code).toContain('"Accept": "true"'); + expect(code).not.toContain('"Accept": True'); + }); + + it('generates playwright exports with browser_required warning', () => { + const code = generateExport(makeSkill({ + tierLock: { + type: 'permanent', + reason: 'browser_required', + evidence: 'challenge', + }, + }), 'playwright.ts', { id: '123' }); + + expect(code).toContain('Warning: this skill is marked browser_required'); + expect(code).toContain("import { chromium } from 'playwright';"); + expect(code).toContain('page.request.fetch'); + }); + + it('extends generated templates with standalone export files', () => { + const templates = generateSkillTemplates(makeSkill()); + 
expect(templates.has('request.json')).toBe(true); + expect(templates.has('curl.sh')).toBe(true); + expect(templates.has('fetch.ts')).toBe(true); + expect(templates.has('requests.py')).toBe(true); + expect(templates.has('playwright.ts')).toBe(true); + }); +}); diff --git a/tests/unit/generator.test.ts b/tests/unit/generator.test.ts index 71aa5af..b225450 100644 --- a/tests/unit/generator.test.ts +++ b/tests/unit/generator.test.ts @@ -96,6 +96,15 @@ describe('generator', () => { expect(spec.isComposite).toBe(true); expect(spec.chainSpec).toBe(chain); }); + + it('preserves html response content type and keeps html skills read-only', () => { + const spec = generateSkill('example.com', makeCluster({ + responseContentType: 'text/html; charset=utf-8', + })); + + expect(spec.responseContentType).toBe('text/html; charset=utf-8'); + expect(spec.sideEffectClass).toBe('read-only'); + }); }); describe('generateSkillMd', () => { diff --git a/tests/unit/import-service.test.ts b/tests/unit/import-service.test.ts new file mode 100644 index 0000000..2cbb15f --- /dev/null +++ b/tests/unit/import-service.test.ts @@ -0,0 +1,404 @@ +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import * as fs from 'node:fs'; + +const { mockSetSitePolicy, mockValidateAndNormalizeImportablePolicy } = vi.hoisted(() => ({ + mockSetSitePolicy: vi.fn().mockReturnValue({ persisted: true }), + mockValidateAndNormalizeImportablePolicy: vi.fn().mockImplementation((policy: any, siteId: string) => ({ + valid: true, + errors: [], + value: { ...policy, siteId, browserRequired: policy?.browserRequired === true }, + })), +})); + +vi.mock('node:fs'); + +vi.mock('../../src/storage/import-validator.js', () => ({ + validateImportableSkill: vi.fn().mockReturnValue({ valid: true, errors: [] }), + validateImportableSite: vi.fn().mockReturnValue({ valid: true, errors: [] }), + validateAndNormalizeImportablePolicy: mockValidateAndNormalizeImportablePolicy, +})); + +vi.mock('../../src/core/policy.js', () => ({ 
+ getSitePolicy: vi.fn().mockReturnValue({ + siteId: 'example.com', + maxConcurrent: 2, + maxQps: 5, + readOnlyDefault: false, + allowedMethods: ['GET', 'POST'], + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [], + }), + setSitePolicy: mockSetSitePolicy, +})); + +import { performImport } from '../../src/app/import-service.js'; +import { validateImportableSkill, validateImportableSite } from '../../src/storage/import-validator.js'; +import { getSitePolicy } from '../../src/core/policy.js'; + +function makeBundle(overrides: Record<string, unknown> = {}) { + return { + version: '0.2.0', + site: { + id: 'example.com', + displayName: 'Example', + rootUrls: ['https://example.com'], + firstSeen: Date.now(), + lastVisited: Date.now(), + masteryLevel: 'novice', + ...((overrides.site as Record<string, unknown>) ?? {}), + }, + skills: overrides.skills ?? [ + { + id: 'example.com.get_data.v1', + name: 'get_data', + siteId: 'example.com', + method: 'GET', + pathTemplate: '/api/data', + allowedDomains: ['example.com'], + status: 'active', + currentTier: 'tier_3', + inputSchema: {}, + sideEffectClass: 'read-only', + confidence: 0.9, + consecutiveValidations: 3, + sampleCount: 10, + successRate: 0.95, + version: 1, + isComposite: false, + directCanaryEligible: false, + directCanaryAttempts: 0, + validationsSinceLastCanary: 0, + createdAt: Date.now(), + updatedAt: Date.now(), + }, + ], + ...overrides, + }; +} + +function makeDeps() { + return { + db: { + transaction: vi.fn((fn: () => void) => fn()), + } as any, + skillRepo: { + getById: vi.fn().mockReturnValue(undefined), + create: vi.fn(), + update: vi.fn(), + delete: vi.fn(), + } as any, + siteRepo: { + getById: vi.fn().mockReturnValue(undefined), + create: vi.fn(), + update: vi.fn(), + delete: vi.fn(), + } as any, + config: { logLevel: 'silent' }, + }; +} + +describe('performImport', () => { + beforeEach(() => { + vi.restoreAllMocks(); + mockSetSitePolicy.mockReturnValue({ persisted: true }); + 
mockValidateAndNormalizeImportablePolicy.mockImplementation((policy: any, siteId: string) => ({ + valid: true, + errors: [], + value: { ...policy, siteId, browserRequired: policy?.browserRequired === true }, + })); + (validateImportableSkill as any).mockReturnValue({ valid: true, errors: [] }); + (validateImportableSite as any).mockReturnValue({ valid: true, errors: [] }); + (getSitePolicy as any).mockReturnValue({ + siteId: 'example.com', + maxConcurrent: 2, + maxQps: 5, + readOnlyDefault: false, + allowedMethods: ['GET', 'POST'], + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [], + }); + }); + + it('creates site and skills for a new import', async () => { + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + const result = await performImport('test.json', deps, { yes: true }); + + expect(result.siteAction).toBe('created'); + expect(result.created).toBe(1); + expect(result.updated).toBe(0); + expect(deps.siteRepo.create).toHaveBeenCalledOnce(); + expect(deps.skillRepo.create).toHaveBeenCalledOnce(); + }); + + it('updates existing site and skills', async () => { + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + deps.siteRepo.getById.mockReturnValue(bundle.site); + deps.skillRepo.getById.mockReturnValue(bundle.skills[0]); + + const result = await performImport('test.json', deps, { yes: true }); + + expect(result.siteAction).toBe('updated'); + expect(result.created).toBe(0); + expect(result.updated).toBe(1); + }); + + it('skips invalid skills', async () => { + const bundle = makeBundle({ + skills: [ + { id: 'good', siteId: 'example.com', method: 'GET', pathTemplate: '/ok', allowedDomains: ['example.com'] }, + { id: 'bad', siteId: 'example.com', method: 'GET', pathTemplate: 
'/bad', allowedDomains: ['example.com'] }, + ], + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + (validateImportableSkill as any) + .mockReturnValueOnce({ valid: true, errors: [] }) + .mockReturnValueOnce({ valid: false, errors: ['missing field'] }); + + const deps = makeDeps(); + const result = await performImport('test.json', deps, { yes: true }); + + expect(result.created).toBe(1); + expect(result.skipped).toBe(1); + }); + + it('fills defaults for skills missing NOT NULL fields', async () => { + const bundle = makeBundle({ + skills: [ + { + id: 'example.com.minimal.v1', + siteId: 'example.com', + method: 'GET', + pathTemplate: '/api/minimal', + allowedDomains: ['example.com'], + }, + ], + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + const result = await performImport('test.json', deps, { yes: true }); + + expect(result.created).toBe(1); + const created = deps.skillRepo.create.mock.calls[0][0]; + expect(created.status).toBe('draft'); + expect(created.currentTier).toBe('tier_3'); + expect(created.confidence).toBe(0); + expect(created.name).toBe('minimal'); + }); + + it('handles corrupt site rows gracefully', async () => { + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + deps.siteRepo.getById.mockImplementation(() => { throw new Error('corrupt'); }); + + const result = await performImport('test.json', deps, { yes: true }); + expect(result.siteAction).toBe('created'); + // Corrupt site delete attempted before create + expect(deps.siteRepo.delete).toHaveBeenCalledWith('example.com'); + expect(deps.siteRepo.create).toHaveBeenCalledOnce(); + }); + + it('handles corrupt skill rows — counts as updated', async () => { + const bundle = makeBundle(); + (fs.existsSync 
as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + deps.skillRepo.getById.mockImplementation(() => { throw new Error('corrupt'); }); + + const result = await performImport('test.json', deps, { yes: true }); + expect(result.updated).toBe(1); + expect(result.created).toBe(0); + // Corrupt skill deleted then recreated + expect(deps.skillRepo.delete).toHaveBeenCalledWith('example.com.get_data.v1'); + expect(deps.skillRepo.create).toHaveBeenCalledOnce(); + }); + + it('rejects in non-TTY without --yes when overwrites exist', async () => { + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + // Make it look like overwrite (existing site) + const deps = makeDeps(); + deps.siteRepo.getById.mockReturnValue(bundle.site); + + const origIsTTY = process.stdin.isTTY; + Object.defineProperty(process.stdin, 'isTTY', { value: false, configurable: true }); + + try { + await expect(performImport('test.json', deps)).rejects.toThrow('Non-interactive terminal'); + } finally { + Object.defineProperty(process.stdin, 'isTTY', { value: origIsTTY, configurable: true }); + } + }); + + it('throws for missing file', async () => { + (fs.existsSync as any).mockReturnValue(false); + const deps = makeDeps(); + await expect(performImport('missing.json', deps, { yes: true })).rejects.toThrow("File 'missing.json' not found"); + }); + + it('throws for invalid bundle format', async () => { + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify({ version: '0.2.0' })); + (validateImportableSite as any).mockReturnValue({ valid: true, errors: [] }); + + const deps = makeDeps(); + await expect(performImport('bad.json', deps, { yes: true })).rejects.toThrow('Invalid bundle format'); + }); + + it('detects authType and reports hasAuthSkills', async () => { + const bundle = makeBundle({ + 
skills: [ + { + id: 'example.com.auth_api.v1', + siteId: 'example.com', + method: 'GET', + pathTemplate: '/api/auth', + allowedDomains: ['example.com'], + authType: 'bearer', + }, + ], + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + const result = await performImport('test.json', deps, { yes: true }); + expect(result.hasAuthSkills).toBe(true); + }); + + it('wraps site + skill writes in a transaction', async () => { + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + await performImport('test.json', deps, { yes: true }); + + expect(deps.db.transaction).toHaveBeenCalledOnce(); + }); + + it('persists policy via setSitePolicy when bundle has policy', async () => { + const bundle = makeBundle({ + policy: { + siteId: 'example.com', + maxConcurrent: 5, + maxQps: 10, + readOnlyDefault: true, + allowedMethods: ['GET'], + requireConfirmation: [], + domainAllowlist: ['example.com'], + redactionRules: [], + capabilities: [], + }, + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + await performImport('test.json', deps, { yes: true }); + + expect(mockSetSitePolicy).toHaveBeenCalledOnce(); + expect(mockSetSitePolicy.mock.calls[0][0].siteId).toBe('example.com'); + }); + + it('warns when setSitePolicy throws instead of failing import', async () => { + const bundle = makeBundle({ + policy: { + siteId: 'example.com', + maxConcurrent: 5, + executionSessionName: 'test', + executionBackend: 'invalid', + }, + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + mockSetSitePolicy.mockImplementation(() => { throw new Error('validation failed'); }); + + const deps = makeDeps(); + const consoleSpy 
= vi.spyOn(console, 'error').mockImplementation(() => {}); + const result = await performImport('test.json', deps, { yes: true }); + + // Import still succeeds — policy failure is a warning + expect(result.created).toBe(1); + expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining('policy import failed')); + consoleSpy.mockRestore(); + }); + + it('normalizes policy siteId to the bundle site during import', async () => { + const bundle = makeBundle({ + policy: { siteId: 'other.com', maxConcurrent: 5 }, + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + await performImport('test.json', deps, { yes: true }); + + expect(mockSetSitePolicy).toHaveBeenCalledOnce(); + expect(mockSetSitePolicy.mock.calls[0][0].siteId).toBe('example.com'); + }); + + it('defaults browserRequired to false for legacy policy bundles missing the field', async () => { + const bundle = makeBundle({ + policy: { + siteId: 'example.com', + maxConcurrent: 5, + maxQps: 10, + readOnlyDefault: true, + allowedMethods: ['GET'], + requireConfirmation: [], + domainAllowlist: ['example.com'], + redactionRules: [], + capabilities: [], + }, + }); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + await performImport('test.json', deps, { yes: true }); + + expect(mockSetSitePolicy).toHaveBeenCalledOnce(); + expect(mockSetSitePolicy.mock.calls[0][0]).toEqual( + expect.objectContaining({ + siteId: 'example.com', + browserRequired: false, + }), + ); + }); + + it('new import with no overwrites skips confirmation even without --yes', async () => { + // No existing site or skills → no confirmation required → import proceeds + const bundle = makeBundle(); + (fs.existsSync as any).mockReturnValue(true); + (fs.readFileSync as any).mockReturnValue(JSON.stringify(bundle)); + + const deps = makeDeps(); + // No existing 
site/skills (defaults), so no confirmation prompt + const result = await performImport('test.json', deps); + expect(result.cancelled).toBeUndefined(); + expect(result.created).toBe(1); + expect(deps.db.transaction).toHaveBeenCalledOnce(); + }); +}); diff --git a/tests/unit/import-validator.test.ts b/tests/unit/import-validator.test.ts index a220b61..3dc8f32 100644 --- a/tests/unit/import-validator.test.ts +++ b/tests/unit/import-validator.test.ts @@ -1,5 +1,9 @@ import { describe, it, expect } from 'vitest'; -import { validateImportableSkill, validateImportableSite } from '../../src/storage/import-validator.js'; +import { + validateImportableSkill, + validateImportableSite, + validateAndNormalizeImportablePolicy, +} from '../../src/storage/import-validator.js'; describe('validateImportableSkill', () => { function validSkill() { @@ -205,3 +209,45 @@ describe('validateImportableSite', () => { expect(result.errors[0]).toContain('totalRequests must be a finite number'); }); }); + +describe('validateAndNormalizeImportablePolicy', () => { + it('fills defaults and strips unknown fields from imported policies', () => { + const result = validateAndNormalizeImportablePolicy({ + executionBackend: 'playwright', + executionSessionName: 'shared-session', + unknownField: 'ignored', + }, 'example.com'); + + expect(result.valid).toBe(true); + expect(result.errors).toHaveLength(0); + expect(result.value).toMatchObject({ + siteId: 'example.com', + allowedMethods: ['GET', 'HEAD'], + maxQps: 10, + maxConcurrent: 3, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + browserRequired: false, + executionBackend: 'playwright', + executionSessionName: 'shared-session', + }); + expect(result.value).not.toHaveProperty('unknownField'); + }); + + it('rejects malformed policy fields instead of spreading them through', () => { + const result = validateAndNormalizeImportablePolicy({ + allowedMethods: ['GET', 42], + domainAllowlist: 'example.com', + 
redactionRules: [true], + capabilities: ['not.a.real.capability'], + }, 'example.com'); + + expect(result.valid).toBe(false); + expect(result.errors).toContain('allowedMethods[1] must be a string'); + expect(result.errors).toContain('domainAllowlist must be an array'); + expect(result.errors).toContain('redactionRules[0] must be a string'); + expect(result.errors).toContain('capabilities[0] has invalid capability "not.a.real.capability"'); + }); +}); diff --git a/tests/unit/multi-session.test.ts b/tests/unit/multi-session.test.ts index 42aca68..bf266ab 100644 --- a/tests/unit/multi-session.test.ts +++ b/tests/unit/multi-session.test.ts @@ -208,6 +208,39 @@ describe('MultiSessionManager', () => { expect(session.name).toBe('electron'); expect(session.siteId).toBe('cdp-electron'); expect(session.isCdp).toBe(true); + expect(session.sessionKind).toBe('manual_cdp'); + }); + + it('connectCDP stores recovery_execute_cdp session kind for execute-owned sessions', async () => { + const mockBrowser = createMockBrowser(); + (connectViaCDP as ReturnType).mockResolvedValue(mockBrowser); + + const session = await msm.connectCDP( + '__recovery_deadbeef', + { port: 9222 }, + 'example.com', + undefined, + 'recovery_execute_cdp', + ); + + expect(session.sessionKind).toBe('recovery_execute_cdp'); + expect(msm.list().find((entry) => entry.name === '__recovery_deadbeef')?.sessionKind).toBe('recovery_execute_cdp'); + }); + + it('hides execute-owned recovery sessions when includeInternal is false', async () => { + const mockBrowser = createMockBrowser(); + (connectViaCDP as ReturnType).mockResolvedValue(mockBrowser); + + await msm.connectCDP( + '__recovery_deadbeef', + { port: 9222 }, + 'example.com', + undefined, + 'recovery_execute_cdp', + ); + + expect(msm.list(undefined, undefined, { includeInternal: false }).map((entry) => entry.name)).not.toContain('__recovery_deadbeef'); + expect(msm.list(undefined, undefined, { includeInternal: true }).map((entry) => 
entry.name)).toContain('__recovery_deadbeef'); }); it('connectCDP("default") is always rejected', async () => { diff --git a/tests/unit/noise-filter.test.ts b/tests/unit/noise-filter.test.ts index 811f0c3..578bbda 100644 --- a/tests/unit/noise-filter.test.ts +++ b/tests/unit/noise-filter.test.ts @@ -16,6 +16,7 @@ function makeEntry(overrides: Partial<{ responseStatus: number; responseBodySize: number; startedDateTime: string; + resourceType: string; }>): HarEntry { const { method = 'GET', @@ -26,6 +27,7 @@ function makeEntry(overrides: Partial<{ responseStatus = 200, responseBodySize = 100, startedDateTime = '2025-01-01T00:00:00Z', + resourceType, } = overrides; return { @@ -52,6 +54,7 @@ function makeEntry(overrides: Partial<{ bodySize: responseBodySize, }, timings: { send: 0, wait: 50, receive: 50 }, + ...(resourceType ? { _resourceType: resourceType } : {}), }; } @@ -175,6 +178,50 @@ describe('noise-filter', () => { }); }); + describe('html document classification', () => { + it('classifies same-site GET html documents as html_document', () => { + const result = filterRequests([ + makeEntry({ + url: 'https://news.example.com/front-page', + responseContentType: 'text/html; charset=utf-8', + resourceType: 'document', + }), + ], [], 'www.example.com'); + + expect(result.htmlDocument).toHaveLength(1); + expect(result.signal).toHaveLength(0); + expect(result.ambiguous).toHaveLength(0); + }); + + it('classifies POST html responses as ambiguous', () => { + const result = filterRequests([ + makeEntry({ + method: 'POST', + url: 'https://www.example.com/search', + responseContentType: 'text/html', + resourceType: 'document', + }), + ], [], 'www.example.com'); + + expect(result.htmlDocument).toHaveLength(0); + expect(result.ambiguous).toHaveLength(1); + }); + + it('classifies PUT html responses as ambiguous', () => { + const result = filterRequests([ + makeEntry({ + method: 'PUT', + url: 'https://www.example.com/profile', + responseContentType: 'text/html', + resourceType: 
'document', + }), + ], [], 'www.example.com'); + + expect(result.htmlDocument).toHaveLength(0); + expect(result.ambiguous).toHaveLength(1); + }); + }); + describe('feature flag domains', () => { it('filters launchdarkly.com', () => { const result = filterRequests([makeEntry({ url: 'https://app.launchdarkly.com/sdk/eval' })]); diff --git a/tests/unit/policy.test.ts b/tests/unit/policy.test.ts index 8eaea51..a61c20d 100644 --- a/tests/unit/policy.test.ts +++ b/tests/unit/policy.test.ts @@ -28,6 +28,7 @@ import { resolveAndValidate, setSitePolicy, getSitePolicy, + mergeSitePolicy, invalidatePolicyCache, } from '../../src/core/policy.js'; import { @@ -402,6 +403,104 @@ describe('policy', () => { // ─── Execution backend policy fields ──────────────────────────── describe('executionBackend and executionSessionName', () => { + it('defaults browserRequired to false when not set', () => { + const loaded = getSitePolicy('browser-required-default'); + expect(loaded.browserRequired).toBe(false); + expect(loaded.minGapMs).toBe(100); + }); + + it('persists minGapMs when configured', () => { + const policy: SitePolicy = { + siteId: 'min-gap-site', + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + minGapMs: 250, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: ['example.com'], + redactionRules: [], + capabilities: [], + }; + setSitePolicy(policy); + const loaded = getSitePolicy('min-gap-site'); + expect(loaded.minGapMs).toBe(250); + }); + + it('persists browserRequired when enabled', () => { + const policy: SitePolicy = { + siteId: 'browser-required-enabled', + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: ['example.com'], + redactionRules: [], + capabilities: [], + browserRequired: true, + }; + setSitePolicy(policy); + const loaded = getSitePolicy('browser-required-enabled'); + expect(loaded.browserRequired).toBe(true); + }); + + it('preserves browserRequired 
across merge overlays unless explicitly changed', () => { + const siteId = 'browser-required-merge'; + setSitePolicy({ + siteId, + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: ['example.com'], + redactionRules: [], + capabilities: [], + browserRequired: true, + }); + + mergeSitePolicy(siteId, { + executionBackend: 'live-chrome', + executionSessionName: '__recovery_deadbeef', + }); + + const loaded = getSitePolicy(siteId); + expect(loaded.browserRequired).toBe(true); + expect(loaded.executionBackend).toBe('live-chrome'); + expect(loaded.executionSessionName).toBe('__recovery_deadbeef'); + }); + + it('throws when browserRequired is not boolean', () => { + expect(() => setSitePolicy({ + siteId: 'invalid-browser-required', + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [], + browserRequired: 'yes' as unknown as boolean, + })).toThrow(/browserRequired must be boolean/); + }); + + it('throws when minGapMs is negative', () => { + expect(() => setSitePolicy({ + siteId: 'invalid-min-gap', + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + minGapMs: -1, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [], + })).toThrow(/minGapMs must be a finite number >= 0/); + }); + it('persists executionBackend on SitePolicy', () => { const policy: SitePolicy = { siteId: 'exec-backend-site', @@ -512,5 +611,20 @@ describe('policy', () => { /executionSessionName requires executionBackend='playwright'/, ); }); + + it('throws when executionBackend is invalid', () => { + expect(() => setSitePolicy({ + siteId: 'invalid-exec-backend', + allowedMethods: ['GET'], + maxQps: 10, + maxConcurrent: 3, + readOnlyDefault: true, + requireConfirmation: [], + domainAllowlist: [], + redactionRules: [], + capabilities: [], + 
executionBackend: 'firefox' as any, + })).toThrow(/executionBackend must be one of/); + }); }); }); diff --git a/tests/unit/rate-limiter.test.ts b/tests/unit/rate-limiter.test.ts index 16630ba..e0ada11 100644 --- a/tests/unit/rate-limiter.test.ts +++ b/tests/unit/rate-limiter.test.ts @@ -60,6 +60,53 @@ describe('RateLimiter', () => { const result = limiter.checkRate('site-b'); expect(result.allowed).toBe(true); }); + + it('enforces a per-site minimum gap between permits when configured', () => { + vi.useFakeTimers(); + vi.setSystemTime(new Date('2026-01-01T00:00:00.000Z')); + + try { + const first = limiter.checkRate('gap-site', undefined, { minGapMs: 100 }); + const second = limiter.checkRate('gap-site', undefined, { minGapMs: 100 }); + + expect(first.allowed).toBe(true); + expect(second.allowed).toBe(false); + expect(second.retryAfterMs).toBe(100); + + vi.advanceTimersByTime(100); + const third = limiter.checkRate('gap-site', undefined, { minGapMs: 100 }); + expect(third.allowed).toBe(true); + } finally { + vi.useRealTimers(); + } + }); + + it('waitForPermit waits until the configured minimum gap clears', async () => { + vi.useFakeTimers(); + vi.setSystemTime(new Date('2026-01-01T00:00:00.000Z')); + + try { + expect(limiter.checkRate('wait-gap-site', undefined, { minGapMs: 100 }).allowed).toBe(true); + + let settled = false; + const permitPromise = limiter.waitForPermit('wait-gap-site', undefined, { + minGapMs: 100, + timeoutMs: 500, + }).then((result) => { + settled = true; + return result; + }); + + await vi.advanceTimersByTimeAsync(99); + expect(settled).toBe(false); + + await vi.advanceTimersByTimeAsync(1); + await expect(permitPromise).resolves.toEqual({ allowed: true }); + expect(settled).toBe(true); + } finally { + vi.useRealTimers(); + } + }); }); describe('429 response backoff', () => { diff --git a/tests/unit/real-browser-handoff.test.ts b/tests/unit/real-browser-handoff.test.ts index 1c6249d..dcc04f5 100644 --- a/tests/unit/real-browser-handoff.test.ts 
+++ b/tests/unit/real-browser-handoff.test.ts @@ -11,16 +11,24 @@ vi.mock('node:child_process', () => ({ execFileSync: (...args: unknown[]) => mockExecFileSync(...args), })); -import { launchManagedChrome } from '../../src/browser/real-browser-handoff.js'; +import { + cleanupOwnedBrowserLaunches, + cleanupOwnedBrowserLaunchesSync, + launchManagedChrome, + listOwnedBrowserLaunchMetadata, + writeOwnedBrowserLaunchMetadata, +} from '../../src/browser/real-browser-handoff.js'; describe('real-browser-handoff', () => { let profileDir: string; + let dataDir: string; let processAlive: boolean; let killSpy: ReturnType<typeof vi.spyOn>; beforeEach(() => { vi.useFakeTimers(); profileDir = fs.mkdtempSync(path.join(os.tmpdir(), 'schrute-handoff-')); + dataDir = fs.mkdtempSync(path.join(os.tmpdir(), 'schrute-owned-launches-')); processAlive = true; mockExecFileSync.mockImplementation((command: string) => { if (command === 'which') { @@ -52,6 +60,7 @@ describe('real-browser-handoff', () => { vi.useRealTimers(); vi.clearAllMocks(); fs.rmSync(profileDir, { recursive: true, force: true }); + fs.rmSync(dataDir, { recursive: true, force: true }); }); it('kills Chrome and removes metadata when DevToolsActivePort never appears', async () => { @@ -68,8 +77,59 @@ describe('real-browser-handoff', () => { await vi.advanceTimersByTimeAsync(6000); await launchExpectation; - expect(killSpy).toHaveBeenCalledWith(12345, 'SIGTERM'); + expect( + killSpy.mock.calls.some(([pid, signal]) => Math.abs(Number(pid)) === 12345 && signal === 'SIGTERM'), + ).toBe(true); expect(fs.existsSync(path.join(profileDir, 'chrome.pid'))).toBe(false); expect(fs.existsSync(path.join(profileDir, 'chrome.meta.json'))).toBe(false); }); + + it('does not kill owned launches when process start time cannot be revalidated', async () => { + writeOwnedBrowserLaunchMetadata({ dataDir } as any, { + pid: 12345, + createdAt: Date.now() - 5000, + }); + + mockExecFileSync.mockImplementation((command: string, args?: string[]) => { + if (command === 
'ps' && args?.includes('lstart=')) { + throw new Error('process start time unavailable'); + } + if (command === 'ps' && args?.includes('command=')) { + return '/usr/bin/chrome-headless-shell\n'; + } + throw new Error(`Unexpected command: ${command}`); + }); + + await cleanupOwnedBrowserLaunches({ dataDir } as any); + + expect( + killSpy.mock.calls.some(([pid, signal]) => pid === 12345 && (signal === 'SIGTERM' || signal === 'SIGKILL')), + ).toBe(false); + expect(listOwnedBrowserLaunchMetadata({ dataDir } as any)).toEqual([]); + }); + + it('sync cleanup removes stale metadata without killing a reused pid', () => { + writeOwnedBrowserLaunchMetadata({ dataDir } as any, { + pid: 12345, + createdAt: Date.parse('2026-03-20T10:00:00.000Z'), + commandHint: 'chrome-headless-shell', + }); + + mockExecFileSync.mockImplementation((command: string, args?: string[]) => { + if (command === 'ps' && args?.includes('lstart=')) { + return 'Sat Mar 22 10:00:00 2026\n'; + } + if (command === 'ps' && args?.includes('command=')) { + return '/usr/bin/chrome-headless-shell --type=renderer\n'; + } + throw new Error(`Unexpected command: ${command}`); + }); + + cleanupOwnedBrowserLaunchesSync({ dataDir } as any); + + expect( + killSpy.mock.calls.some(([pid, signal]) => pid === 12345 && (signal === 'SIGTERM' || signal === 'SIGKILL')), + ).toBe(false); + expect(listOwnedBrowserLaunchMetadata({ dataDir } as any)).toEqual([]); + }); }); diff --git a/tests/unit/response-parser-extended.test.ts b/tests/unit/response-parser-extended.test.ts index 9e3b373..8f414cb 100644 --- a/tests/unit/response-parser-extended.test.ts +++ b/tests/unit/response-parser-extended.test.ts @@ -63,6 +63,18 @@ describe('response-parser (extended)', () => { ); expect(result.errors.some((e) => e.type === 'parse_error')).toBe(true); }); + + it('treats explicit html content as text without forcing json parsing', () => { + const skill = makeSkill({ + responseContentType: 'text/html', + }); + const result = parseResponse( + { 
status: 200, headers: { 'content-type': 'text/html; charset=utf-8' }, body: 'Hello' }, + skill, + ); + expect(result.data).toBe('Hello'); + expect(result.errors.some((e) => e.type === 'parse_error')).toBe(false); + }); }); describe('error signature detection in 200-range responses', () => { diff --git a/tests/unit/rest-server.test.ts b/tests/unit/rest-server.test.ts index da2ec8a..8ad4c32 100644 --- a/tests/unit/rest-server.test.ts +++ b/tests/unit/rest-server.test.ts @@ -267,4 +267,61 @@ describe('rest-server', () => { expect(body.mode).toBe('idle'); }); }); + + describe('POST with empty body', () => { + it('returns 200 when POST body is empty', async () => { + const response = await app.inject({ + method: 'POST', + url: '/api/v1/skills/test-id/activate', + headers: { 'content-type': 'application/json' }, + payload: '', + }); + // Should not fail with a parse error — 200 or 404 are both acceptable + expect(response.statusCode).not.toBe(400); + }); + + it('still parses valid JSON body', async () => { + const response = await app.inject({ + method: 'POST', + url: '/api/explore', + payload: { url: 'https://example.com' }, + }); + expect(response.statusCode).toBe(200); + }); + }); + + describe('v0 deprecation header', () => { + it('adds Deprecation header to v0 data routes', async () => { + const response = await app.inject({ + method: 'GET', + url: '/api/sites', + }); + expect(response.headers['deprecation']).toBe('true'); + expect(response.headers['link']).toBe('</api/v1/sites>; rel="successor-version"'); + }); + + it('does NOT add Deprecation header to v1 routes', async () => { + const response = await app.inject({ + method: 'GET', + url: '/api/v1/sites', + }); + expect(response.headers['deprecation']).toBeUndefined(); + }); + + it('does NOT add Deprecation header to exempt routes', async () => { + const response = await app.inject({ + method: 'GET', + url: '/api/health', + }); + expect(response.headers['deprecation']).toBeUndefined(); + }); + + it('does NOT add Deprecation 
header to /api/openapi.json', async () => { + const response = await app.inject({ + method: 'GET', + url: '/api/openapi.json', + }); + expect(response.headers['deprecation']).toBeUndefined(); + }); + }); }); diff --git a/tests/unit/retry.test.ts b/tests/unit/retry.test.ts index 8a95c82..03b3e36 100644 --- a/tests/unit/retry.test.ts +++ b/tests/unit/retry.test.ts @@ -175,6 +175,39 @@ describe('retry', () => { expect(escalateDecision!.reason).toContain('Cookie refresh'); }); + it('escalates immediately on cloudflare_challenge without same-tier retry', async () => { + const skill = makeSkill({ sideEffectClass: 'read-only', currentTier: 'tier_1' }); + const browserProvider = { + evaluateFetch: vi.fn().mockResolvedValue({ + status: 200, + headers: { 'content-type': 'application/json' }, + body: '{"ok":true}', + }), + } as any; + + const result = await retryWithEscalation(skill, {}, { + fetchFn: async () => ({ + status: 403, + headers: { 'cf-mitigated': 'challenge' }, + body: 'Verifying you are human', + }), + browserProvider, + maxRetries: 3, + }); + + expect(result.success).toBe(true); + expect(result.tier).toBe(ExecutionTier.BROWSER_PROXIED); + expect(result.retryDecisions[0]).toMatchObject({ + action: 'escalate', + tier: ExecutionTier.BROWSER_PROXIED, + }); + expect(result.retryDecisions.filter((d) => d.action === 'retry')).toHaveLength(0); + expect(result.stepResults.map((step) => step.tier)).toEqual([ + ExecutionTier.DIRECT, + ExecutionTier.BROWSER_PROXIED, + ]); + }); + it('records retry decisions with backoff for rate limiting', async () => { let callCount = 0; const skill = makeSkill({ sideEffectClass: 'read-only' }); @@ -284,6 +317,44 @@ describe('retry', () => { expect(result.tier).toBe('browser_proxied'); }); + it('starts at BROWSER_PROXIED for browser_required-locked skill', async () => { + const skill = makeSkill({ + currentTier: 'tier_3', + tierLock: { type: 'permanent', reason: 'browser_required', evidence: 'challenge detected' }, + }); + const 
browserProvider = { + evaluateFetch: vi.fn().mockResolvedValue({ + status: 200, + headers: { 'content-type': 'application/json' }, + body: '{"ok":true}', + }), + navigate: vi.fn().mockResolvedValue(undefined), + networkRequests: vi.fn().mockResolvedValue([ + { + url: 'https://example.com/api/data', + method: 'GET', + status: 200, + requestHeaders: {}, + responseHeaders: { 'content-type': 'application/json' }, + responseBody: '{"ok":true}', + timing: { startTime: 0, endTime: 1, duration: 1 }, + }, + ]), + } as any; + + const result = await retryWithEscalation(skill, {}, { + fetchFn: async () => ({ status: 200, headers: { 'content-type': 'application/json' }, body: '{"ok":true}' }), + browserProvider, + siteRecommendedTier: ExecutionTier.DIRECT, + }); + + expect(result.success).toBe(true); + expect(result.tier).toBe(ExecutionTier.BROWSER_PROXIED); + expect(result.startingTier).toBe(ExecutionTier.BROWSER_PROXIED); + expect(browserProvider.evaluateFetch).toHaveBeenCalledTimes(1); + expect(browserProvider.navigate).not.toHaveBeenCalled(); + }); + it('skips DIRECT tier for temporarily demoted skill', async () => { const skill = makeSkill({ currentTier: 'tier_1', @@ -298,5 +369,37 @@ describe('retry', () => { }); expect(result.tier).toBe('browser_proxied'); }); + + it('masks direct-first startup when directAllowed is false even if site recommends direct', async () => { + const skill = makeSkill({ + currentTier: 'tier_3', + tierLock: null, + }); + const directFetch = vi.fn().mockResolvedValue({ + status: 200, + headers: { 'content-type': 'application/json' }, + body: '{"ok":true}', + }); + const browserProvider = { + evaluateFetch: vi.fn().mockResolvedValue({ + status: 200, + headers: { 'content-type': 'application/json' }, + body: '{"ok":true}', + }), + } as any; + + const result = await retryWithEscalation(skill, {}, { + fetchFn: directFetch, + browserProvider, + siteRecommendedTier: ExecutionTier.DIRECT, + directAllowed: false, + }); + + 
expect(result.success).toBe(true); + expect(result.tier).toBe(ExecutionTier.BROWSER_PROXIED); + expect(directFetch).not.toHaveBeenCalled(); + expect(browserProvider.evaluateFetch).toHaveBeenCalledTimes(1); + expect(result.startingTier).toBe(ExecutionTier.BROWSER_PROXIED); + }); }); }); diff --git a/tests/unit/router.test.ts b/tests/unit/router.test.ts index 82bde76..d3487f9 100644 --- a/tests/unit/router.test.ts +++ b/tests/unit/router.test.ts @@ -281,6 +281,31 @@ describe('router', () => { const result = await router.executeSkill('example.com', 'get_users', { id: '123' }); expect(result.success).toBe(true); }); + + it('returns 202 with browser_handoff_required when execution needs interactive recovery', async () => { + const skill = makeSkill({ consecutiveValidations: 3 }); + (deps.skillRepo.getBySiteId as ReturnType<typeof vi.fn>).mockReturnValue([skill]); + (deps.confirmation.isSkillConfirmed as ReturnType<typeof vi.fn>).mockReturnValue(true); + (deps.engine.executeSkill as ReturnType<typeof vi.fn>).mockResolvedValue({ + success: false, + status: 'browser_handoff_required', + reason: 'cloudflare_challenge', + recoveryMode: 'real_browser_cdp', + siteId: 'example.com', + url: 'https://example.com/cdn-cgi/challenge-platform', + hint: 'Cloudflare challenge detected.', + resumeToken: 'recover-token', + latencyMs: 250, + }); + + const router = createRouter(deps); + const result = await router.executeSkill('example.com', 'get_users', { id: '123' }); + + expect(result.success).toBe(false); + expect(result.statusCode).toBe(202); + expect(result.error).toContain('Cloudflare challenge'); + expect((result.data as Record<string, unknown>).status).toBe('browser_handoff_required'); + }); }); describe('recoverExplore', () => { diff --git a/tests/unit/service.test.ts b/tests/unit/service.test.ts index c3dd931..7075d25 100644 --- a/tests/unit/service.test.ts +++ b/tests/unit/service.test.ts @@ -137,6 +137,32 @@ describe('SchruteService', () => { }); }); + + describe('listSessions', () => { + it('hides internal execute recovery sessions', 
() => { + const list = vi.fn().mockImplementation((_callerId, _config, options) => + options?.includeInternal === false + ? [{ name: 'user-session', siteId: 'example.com', isCdp: true, sessionKind: 'manual_cdp' }] + : [ + { name: '__recovery_exec', siteId: 'example.com', isCdp: true, sessionKind: 'recovery_execute_cdp' }, + { name: 'user-session', siteId: 'example.com', isCdp: true, sessionKind: 'manual_cdp' }, + ] + ); + (deps.engine.getMultiSessionManager as ReturnType<typeof vi.fn>).mockReturnValue({ + list, + getActive: vi.fn().mockReturnValue('user-session'), + setActive: vi.fn(), + close: vi.fn().mockResolvedValue(undefined), + }); + + const sessions = service.listSessions(); + + expect(list).toHaveBeenCalledWith(undefined, deps.config, { includeInternal: false }); + expect(sessions).toEqual([ + { name: 'user-session', siteId: 'example.com', isCdp: true, active: true }, + ]); + }); + }); + describe('listSkills', () => { it('returns all skills when no filters', async () => { const skills = [makeSkill({ id: 'a' }), makeSkill({ id: 'b' })]; @@ -219,6 +245,32 @@ describe('SchruteService', () => { expect(deps.engine.executeSkill).not.toHaveBeenCalled(); }); + it('returns browser_handoff_required when the engine requests interactive recovery', async () => { + const skill = makeSkill(); + vi.mocked(deps.skillRepo.getById).mockReturnValue(skill); + vi.mocked(deps.confirmation.isSkillConfirmed).mockReturnValue(true); + vi.mocked(deps.engine.executeSkill).mockResolvedValue({ + success: false, + status: 'browser_handoff_required', + reason: 'cloudflare_challenge', + recoveryMode: 'real_browser_cdp', + siteId: 'example.com', + url: 'https://example.com/cdn-cgi/challenge-platform', + hint: 'Cloudflare challenge detected.', + resumeToken: 'recover-token', + latencyMs: 321, + } as any); + + const result = await service.executeSkill('skill-1', { key: 'value' }); + + expect(result.status).toBe('browser_handoff_required'); + if (result.status === 'browser_handoff_required') { + 
expect(result.result.resumeToken).toBe('recover-token'); + expect(result.result.reason).toBe('cloudflare_challenge'); + } + expect(deps.engine.executeSkill).toHaveBeenCalledWith('skill-1', { key: 'value' }, undefined); + }); + it('throws when skill not found', async () => { vi.mocked(deps.skillRepo.getById).mockReturnValue(null as any); diff --git a/tests/unit/skill-helpers.test.ts b/tests/unit/skill-helpers.test.ts index 0dd3720..36ad601 100644 --- a/tests/unit/skill-helpers.test.ts +++ b/tests/unit/skill-helpers.test.ts @@ -194,4 +194,21 @@ describe('searchAndProjectSkills', () => { expect(inactiveMatches).toBeDefined(); expect(inactiveMatches!.length).toBeGreaterThanOrEqual(1); }); + + it('renders humanized permanent lock reasons in promotionProgress', () => { + const skills = [ + makeSkill({ + id: 'locked.v1', + name: 'locked_api', + currentTier: 'tier_3', + tierLock: { type: 'permanent', reason: 'browser_required', evidence: 'cloudflare' }, + }), + ]; + const repo = makeSkillRepo(skills); + const bm = mockBrowserManager(true); + + const { results } = searchAndProjectSkills(repo, bm, { limit: 10 }); + + expect(results[0].promotionProgress).toBe('Locked: Browser required'); + }); }); diff --git a/tests/unit/tiering.test.ts b/tests/unit/tiering.test.ts index 5ed5bb9..2b491d8 100644 --- a/tests/unit/tiering.test.ts +++ b/tests/unit/tiering.test.ts @@ -3,6 +3,7 @@ import { checkPromotion, handleFailure, getEffectiveTier, + sanitizeSiteRecommendedTier, } from '../../src/core/tiering.js'; import { TierState, FailureCause, ExecutionTier } from '../../src/skill/types.js'; import type { @@ -260,6 +261,13 @@ describe('tiering', () => { expect((result.tierLock as PermanentTierLock).reason).toBe('signed_payload'); }); + it('maps cloudflare_challenge to browser_required', () => { + const skill = makeSkill(); + const result = handleFailure(skill, FailureCause.CLOUDFLARE_CHALLENGE); + expect(result.tierLock.type).toBe('permanent'); + expect((result.tierLock as 
PermanentTierLock).reason).toBe('browser_required'); + }); + it('no transition out of permanent lock', () => { const lock: PermanentTierLock = { type: 'permanent', @@ -303,4 +311,18 @@ describe('tiering', () => { expect(getEffectiveTier(skill)).toBe(TierState.TIER_1_PROMOTED); }); }); + + describe('sanitizeSiteRecommendedTier', () => { + it('keeps full_browser when browserRequired is true', () => { + expect(sanitizeSiteRecommendedTier(ExecutionTier.FULL_BROWSER, true)).toBe(ExecutionTier.FULL_BROWSER); + }); + + it('downgrades direct to browser_proxied when browserRequired is true', () => { + expect(sanitizeSiteRecommendedTier(ExecutionTier.DIRECT, true)).toBe(ExecutionTier.BROWSER_PROXIED); + }); + + it('normalizes cookie_refresh to browser_proxied when browserRequired is false', () => { + expect(sanitizeSiteRecommendedTier(ExecutionTier.COOKIE_REFRESH, false)).toBe(ExecutionTier.BROWSER_PROXIED); + }); + }); }); diff --git a/tests/unit/tool-dispatch.test.ts b/tests/unit/tool-dispatch.test.ts index b685704..ff91f56 100644 --- a/tests/unit/tool-dispatch.test.ts +++ b/tests/unit/tool-dispatch.test.ts @@ -213,6 +213,7 @@ function makeDeps(overrides: Partial<ToolDispatchDeps> = {}): ToolDispatchDeps { getExploreSessionName: vi.fn().mockReturnValue('default'), getRecordingSessionName: vi.fn().mockReturnValue(null), executeSkill: vi.fn().mockResolvedValue({ success: true, data: { result: 'ok' } }), + executeBatch: vi.fn().mockResolvedValue([{ skillId: 'example.com.get_users.v1', success: true, data: { result: 'ok' } }]), recoverExplore: vi.fn().mockResolvedValue({ status: 'ready', siteId: 'example.com', @@ -441,6 +442,8 @@ describe('tool-dispatch', () => { getByStatus: vi.fn().mockReturnValue([skill]), getById: vi.fn().mockReturnValue(skill), getBySiteId: vi.fn().mockReturnValue([skill]), + getAll: vi.fn().mockReturnValue([skill]), + getActive: vi.fn().mockReturnValue([skill]), } as any, }); @@ -456,12 +459,13 @@ describe('tool-dispatch', () => { getByStatus: 
vi.fn().mockReturnValue([skill]), getById: vi.fn().mockReturnValue(skill), getBySiteId: vi.fn().mockReturnValue([skill]), + getAll: vi.fn().mockReturnValue([skill]), + getActive: vi.fn().mockReturnValue([skill]), } as any, }); const tools = buildToolList(deps); - // At least meta + browser + skill - expect(tools.length).toBeGreaterThanOrEqual(10); // 8 meta + 1 browser + 1 skill + expect(tools.length).toBeGreaterThanOrEqual(10); }); it('admin caller dynamic skills have descriptions trimmed to maxDescriptionLength', () => { @@ -479,7 +483,6 @@ describe('tool-dispatch', () => { const tools = buildToolList(deps); const skillTool = tools.find(t => t.name === 'example.com.get_users.v1'); expect(skillTool).toBeDefined(); - // Description should be trimmed: 200 chars + '...' = 203 max expect(skillTool!.description.length).toBeLessThanOrEqual(203); expect(skillTool!.description.endsWith('...')).toBe(true); }); @@ -772,6 +775,48 @@ describe('tool-dispatch', () => { expect(data.success).toBe(false); expect(data.error).toBe('timeout'); }); + + it('returns browser_handoff_required without marking the tool call as an error', async () => { + const skill = makeSkill(); + const mockExecute = vi.fn().mockResolvedValue({ + success: false, + status: 'browser_handoff_required', + reason: 'cloudflare_challenge', + recoveryMode: 'real_browser_cdp', + siteId: 'example.com', + url: 'https://example.com/cdn-cgi/challenge-platform', + hint: 'Cloudflare challenge detected.', + resumeToken: 'recover-token', + latencyMs: 400, + }); + const deps = makeDeps({ + engine: { + getStatus: vi.fn().mockReturnValue({ mode: 'idle' }), + executeSkill: mockExecute, + getSessionManager: vi.fn().mockReturnValue({ + getBrowserManager: vi.fn().mockReturnValue({}), + }), + getMultiSessionManager: vi.fn().mockReturnValue(makeMultiSessionMock()), + } as any, + skillRepo: { + getByStatus: vi.fn().mockReturnValue([skill]), + getById: vi.fn().mockReturnValue(skill), + getBySiteId: vi.fn().mockReturnValue([skill]), + 
getAll: vi.fn().mockReturnValue([skill]), + getActive: vi.fn().mockReturnValue([skill]), + } as any, + confirmation: { + isSkillConfirmed: vi.fn().mockReturnValue(true), + } as any, + }); + + const result = await dispatchToolCall('schrute_execute', { skillId: skill.id }, deps); + + expect(result.isError).toBeUndefined(); + const data = JSON.parse(result.content[0].text); + expect(data.status).toBe('browser_handoff_required'); + expect(data.resumeToken).toBe('recover-token'); + }); }); // ─── Grouped skills output ──────────────────────────────────── @@ -1085,7 +1130,16 @@ describe('tool-dispatch', () => { }); it('buildToolList for stdio caller when server.network=true includes all tools', () => { - const deps = makeNetworkDeps(); + const skill = makeSkill(); + const deps = makeNetworkDeps({ + skillRepo: { + getByStatus: vi.fn().mockReturnValue([skill]), + getById: vi.fn().mockReturnValue(skill), + getBySiteId: vi.fn().mockReturnValue([skill]), + getAll: vi.fn().mockReturnValue([skill]), + getActive: vi.fn().mockReturnValue([skill]), + } as any, + }); const tools = buildToolList(deps, 'stdio'); const names = tools.map(t => t.name); @@ -1097,6 +1151,7 @@ describe('tool-dispatch', () => { // Should include browser tools expect(names).toContain('browser_click'); + expect(names).toContain('example.com.get_users.v1'); }); it('buildToolList for MCP HTTP caller when server.network=false includes all tools', () => { @@ -1146,7 +1201,11 @@ describe('tool-dispatch', () => { const result = await dispatchToolCall('schrute_sessions', {}, deps, 'mcp-session-123'); const data = JSON.parse(result.content[0].text); expect(data).toEqual([]); - expect(multiMock.list).toHaveBeenCalledWith('mcp-session-123', expect.objectContaining({ server: { network: true } })); + expect(multiMock.list).toHaveBeenCalledWith( + 'mcp-session-123', + expect.objectContaining({ server: { network: true } }), + { includeInternal: false }, + ); }); }); @@ -1283,4 +1342,41 @@ describe('tool-dispatch', () => { 
expect(typeof data[0].snapshotFields).toBe('object'); }); }); + + // ─── Batch Execute Rate Limit Retry ────────────────────────── + describe('schrute_batch_execute', () => { + it('delegates to engine.executeBatch and preserves results', async () => { + const skill = makeSkill(); + const mockExecuteBatch = vi.fn().mockResolvedValue([ + { skillId: skill.id, success: true, data: { result: 'ok' } }, + ]); + + const deps = makeDeps({ + engine: { + getStatus: vi.fn().mockReturnValue({ mode: 'idle', activeSession: null }), + getMode: vi.fn().mockReturnValue('idle'), + getExploreSessionName: vi.fn().mockReturnValue('default'), + getRecordingSessionName: vi.fn().mockReturnValue(null), + executeSkill: vi.fn(), + executeBatch: mockExecuteBatch, + getSessionManager: vi.fn().mockReturnValue({ + getBrowserManager: vi.fn().mockReturnValue({ + hasContext: vi.fn().mockReturnValue(false), + }), + }), + getMultiSessionManager: vi.fn().mockReturnValue(makeMultiSessionMock()), + getMetricsRepo: vi.fn().mockReturnValue({ getRecentBySkillId: vi.fn().mockReturnValue([]) }), + } as any, + }); + + const actions = [{ skillId: skill.id, params: {} }]; + const result = await dispatchToolCall('schrute_batch_execute', { actions }, deps); + + expect(result.isError).toBeUndefined(); + const data = JSON.parse(result.content[0].text); + expect(data.batch).toBe(true); + expect(data.results[0].success).toBe(true); + expect(mockExecuteBatch).toHaveBeenCalledWith(actions, undefined); + }); + }); }); diff --git a/tests/unit/tool-registry.test.ts b/tests/unit/tool-registry.test.ts index 63eb47c..a590234 100644 --- a/tests/unit/tool-registry.test.ts +++ b/tests/unit/tool-registry.test.ts @@ -205,7 +205,8 @@ describe('tool-registry', () => { describe('rankToolsByIntent', () => { it('returns all skills when count <= k', () => { const skills = [makeSkill(), makeSkill({ id: 'x' })]; - const result = rankToolsByIntent(skills, 'anything', 10); + // query must lexically match so skills aren't filtered out + 
const result = rankToolsByIntent(skills, 'users', 10); expect(result).toHaveLength(2); }); @@ -294,19 +295,29 @@ describe('tool-registry', () => { it('boosts recently used skills', () => { const recentSkill = makeSkill({ id: 'recent', - name: 'a', + name: 'fetch_users', successRate: 0.5, lastUsed: Date.now() - 1000, }); const oldSkill = makeSkill({ id: 'old', - name: 'b', + name: 'fetch_users_old', successRate: 0.5, lastUsed: Date.now() - 48 * 60 * 60 * 1000, }); - const result = rankToolsByIntent([oldSkill, recentSkill], 'something', 1); + // query lexically matches both skills; recency should break the tie + const result = rankToolsByIntent([oldSkill, recentSkill], 'fetch', 1); expect(result[0].id).toBe('recent'); }); + + it('returns empty array when no skills lexically match the intent', () => { + const skills = [ + makeSkill({ id: 'a', name: 'login', description: 'Login to site', successRate: 0.9 }), + makeSkill({ id: 'b', name: 'settings', description: 'Site settings', successRate: 0.9 }), + ]; + const result = rankToolsByIntent(skills, 'zzzznotfound', 10); + expect(result).toHaveLength(0); + }); }); describe('getBrowserToolDefinitions', () => { diff --git a/tests/unit/transform.test.ts b/tests/unit/transform.test.ts new file mode 100644 index 0000000..177b99b --- /dev/null +++ b/tests/unit/transform.test.ts @@ -0,0 +1,126 @@ +import { describe, expect, it } from 'vitest'; +import { applyTransform } from '../../src/replay/transform.js'; + +describe('applyTransform', () => { + it('returns input unchanged when no transform is configured', async () => { + const result = await applyTransform({ ok: true }); + expect(result).toEqual({ + data: { ok: true }, + transformApplied: false, + }); + }); + + it('applies a jsonpath transform and unwraps a single match', async () => { + const result = await applyTransform( + { stats: [{ price: 10 }, { price: 20 }] }, + { type: 'jsonpath', expression: '$.stats[1].price', label: 'current_price' }, + ); + + 
+    expect(result.data).toBe(20);
+    expect(result.rawData).toBeUndefined();
+    expect(result.transformApplied).toBe(true);
+    expect(result.label).toBe('current_price');
+  });
+
+  it('returns an array for multi-match jsonpath expressions', async () => {
+    const result = await applyTransform(
+      { stats: [{ price: 10 }, { price: 20 }] },
+      { type: 'jsonpath', expression: '$.stats[*].price' },
+    );
+
+    expect(result.data).toEqual([10, 20]);
+  });
+
+  it('applies regex transforms with capture groups', async () => {
+    const result = await applyTransform(
+      'price=123.45 currency=USD',
+      { type: 'regex', expression: 'price=(\\d+\\.\\d+) currency=(\\w+)' },
+    );
+
+    expect(result.data).toEqual(['123.45', 'USD']);
+  });
+
+  it('applies regex transforms with global flags', async () => {
+    const result = await applyTransform(
+      'BTC ETH SOL',
+      { type: 'regex', expression: '([A-Z]{3})', flags: 'g' },
+    );
+
+    expect(result.data).toEqual(['BTC', 'ETH', 'SOL']);
+  });
+
+  it('applies regex transforms with named capture groups', async () => {
+    const result = await applyTransform(
+      'price=123.45 currency=USD',
+      { type: 'regex', expression: 'price=(?<price>\\d+\\.\\d+) currency=(?<currency>\\w+)' },
+    );
+
+    expect(result.data).toEqual({ price: '123.45', currency: 'USD' });
+  });
+
+  it('rejects invalid regex flags before execution', async () => {
+    await expect(applyTransform(
+      'price=123.45 currency=USD',
+      { type: 'regex', expression: 'price=(\\d+\\.\\d+)', flags: 'z' },
+    )).rejects.toThrow("Invalid regex flag 'z'");
+  });
+
+  it('rejects oversized regex inputs', async () => {
+    await expect(applyTransform(
+      'a'.repeat(100_001),
+      { type: 'regex', expression: 'a+' },
+    )).rejects.toThrow('Regex transform input exceeds 100000 characters');
+  });
+
+  it('extracts text with a css transform', async () => {
+    const result = await applyTransform(
+      '<div><h1>Top Story</h1></div>',
+      { type: 'css', selector: 'h1', mode: 'text' },
+    );
+
+    expect(result.data).toBe('Top Story');
+  });
+
+  it('extracts html with a css transform', async () => {
+    const result = await applyTransform(
+      '<article><span>Hot</span></article>',
+      { type: 'css', selector: 'article', mode: 'html' },
+    );
+
+    expect(result.data).toBe('<span>Hot</span>');
+  });
+
+  it('extracts attributes with a css transform', async () => {
+    const result = await applyTransform(
+      '<a href="/item/1">Item</a>',
+      { type: 'css', selector: 'a', mode: 'attr', attr: 'href' },
+    );
+
+    expect(result.data).toBe('/item/1');
+  });
+
+  it('extracts structured lists with css fields', async () => {
+    const html = `
+      <div class="story">
+        <a class="title" href="/a">Alpha</a>
+        <span class="score">10</span>
+      </div>
+      <div class="story">
+        <a class="title" href="/b">Beta</a>
+        <span class="score">20</span>
+      </div>
+    `;
+
+    const result = await applyTransform(html, {
+      type: 'css',
+      selector: '.story',
+      mode: 'list',
+      fields: {
+        title: { selector: '.title', mode: 'text' },
+        href: { selector: '.title', mode: 'attr', attr: 'href' },
+        score: { selector: '.score', mode: 'text' },
+      },
+    });
+
+    expect(result.data).toEqual([
+      { title: 'Alpha', href: '/a', score: '10' },
+      { title: 'Beta', href: '/b', score: '20' },
+    ]);
+  });
+});
diff --git a/tests/unit/typescript-sdk.test.ts b/tests/unit/typescript-sdk.test.ts
index c861603..755abee 100644
--- a/tests/unit/typescript-sdk.test.ts
+++ b/tests/unit/typescript-sdk.test.ts
@@ -144,6 +144,31 @@ describe('SchruteClient', () => {
         }),
       );
     });
+
+    it('returns the browser handoff union when execution needs interactive recovery', async () => {
+      const response = {
+        status: 'browser_handoff_required',
+        success: false,
+        reason: 'cloudflare_challenge',
+        recoveryMode: 'real_browser_cdp',
+        siteId: 'example.com',
+        url: 'https://example.com/cdn-cgi/challenge-platform',
+        hint: 'Cloudflare challenge detected.',
+        resumeToken: 'recover-token',
+        latencyMs: 250,
+      };
+      mockFetch.mockResolvedValueOnce(mockResponse(response, 202));
+
+      const result = await client.executeSkill('example.com', 'getUser', {
+        userId: '123',
+      });
+
+      expect(result.status).toBe('browser_handoff_required');
+      if (result.status === 'browser_handoff_required') {
+        expect(result.resumeToken).toBe('recover-token');
+        expect(result.reason).toBe('cloudflare_challenge');
+      }
+    });
   });

   describe('dryRun', () => {
diff --git a/tests/unit/workflow-executor.test.ts b/tests/unit/workflow-executor.test.ts
new file mode 100644
index 0000000..2eade5e
--- /dev/null
+++ b/tests/unit/workflow-executor.test.ts
@@ -0,0 +1,531 @@
+import { describe, expect, it, vi } from 'vitest';
+import { executeWorkflow, type WorkflowStepCacheStore } from '../../src/replay/workflow-executor.js';
+import { SideEffectClass, SkillStatus, type SkillSpec, type WorkflowSpec } from '../../src/skill/types.js';
+
+function makeSkill(id: string, overrides: Partial<SkillSpec> = {}): SkillSpec {
+  return {
+    id,
+    version: 1,
+    status: SkillStatus.ACTIVE,
+    currentTier: 'tier_1',
+    tierLock: null,
+    allowedDomains: ['example.com'],
+    requiredCapabilities: [],
+    parameters: [],
+    validation: { semanticChecks: [], customInvariants: [] },
+    redaction: { piiClassesFound: [], fieldsRedacted: 0 },
+    replayStrategy: 'prefer_tier_1',
+    sideEffectClass: SideEffectClass.READ_ONLY,
+    sampleCount: 1,
+    consecutiveValidations: 1,
+    confidence: 1,
+    method: 'GET',
+    pathTemplate: `/${id}`,
+    inputSchema: {},
+    isComposite: false,
+    siteId: 'example.com',
+    name: id,
+    successRate: 1,
+    createdAt: Date.now(),
+    updatedAt: Date.now(),
+    ...overrides,
+  } as SkillSpec;
+}
+
+function makeRepo(skills: Record<string, SkillSpec>) {
+  return {
+    getById: (id: string) => skills[id],
+  };
+}
+
+describe('workflow-executor', () => {
+  it('passes data from $prev references across steps', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+        { skillId: 'detail', paramMapping: { id: '$prev.data.id' } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn()
+      .mockResolvedValueOnce({ success: true, data: { id: 'user-1' }, latencyMs: 1 })
+      .mockResolvedValueOnce({ success: true, data: { name: 'Ada' }, latencyMs: 1 });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as
any);
+
+    expect(result.success).toBe(true);
+    if (result.success) {
+      expect(result.data).toEqual({ name: 'Ada' });
+    }
+    expect(executeStep).toHaveBeenNthCalledWith(1, 'search', { query: 'ada' });
+    expect(executeStep).toHaveBeenNthCalledWith(2, 'detail', { id: 'user-1' });
+  });
+
+  it('passes data from named $steps references', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+        { skillId: 'detail', name: 'detail', paramMapping: { id: '$steps.search.data.results[0].id' } },
+        { skillId: 'summary', paramMapping: { text: '$steps.detail.data.summary' } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+      detail: makeSkill('detail'),
+      summary: makeSkill('summary'),
+    });
+    const executeStep = vi.fn()
+      .mockResolvedValueOnce({ success: true, data: { results: [{ id: 'r1' }] }, latencyMs: 1 })
+      .mockResolvedValueOnce({ success: true, data: { summary: 'ok' }, latencyMs: 1 })
+      .mockResolvedValueOnce({ success: true, data: { done: true }, latencyMs: 1 });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+    expect(result.success).toBe(true);
+    expect(executeStep).toHaveBeenNthCalledWith(2, 'detail', { id: 'r1' });
+    expect(executeStep).toHaveBeenNthCalledWith(3, 'summary', { text: 'ok' });
+  });
+
+  it('returns partial results when a later step fails', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+        { skillId: 'detail', name: 'detail', paramMapping: { id: '$prev.data.id' } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn()
+      .mockResolvedValueOnce({ success: true, data: { id: 'user-1' }, latencyMs: 1 })
+      .mockResolvedValueOnce({ success: false, error: 'detail failed', failureCause: 'unknown', latencyMs: 1 });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.failedAtStep).toBe('detail');
+      expect(result.data.steps).toHaveLength(2);
+      expect(result.data.steps[0].success).toBe(true);
+      expect(result.data.steps[1].success).toBe(false);
+    }
+  });
+
+  it('fails loudly on undefined param paths', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+        { skillId: 'detail', name: 'detail', paramMapping: { id: '$steps.search.data.missing.id' } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn()
+      .mockResolvedValueOnce({ success: true, data: { results: [] }, latencyMs: 1 });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain("resolved to undefined");
+      expect(result.failedAtStep).toBe('detail');
+    }
+  });
+
+  it('applies per-step transforms without double-transforming later lookups', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        {
+          skillId: 'search',
+          name: 'search',
+          paramMapping: { query: '$initial.query' },
+          transform: { type: 'jsonpath', expression: '$.items[0].id' },
+        },
+        { skillId: 'detail', paramMapping: { id: '$prev.data' } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn()
+      .mockResolvedValueOnce({ success: true, data: { items: [{ id: 'user-1' }] }, latencyMs: 1 })
+      .mockResolvedValueOnce({ success: true, data: { done: true }, latencyMs: 1 });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+
+    expect(result.success).toBe(true);
+    expect(executeStep).toHaveBeenNthCalledWith(2, 'detail', { id: 'user-1' });
+  });
+
+  it('rejects write skills during preflight', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [{ skillId: 'write-step' }],
+    };
+    const repo = makeRepo({
+      'write-step': makeSkill('write-step', { sideEffectClass: SideEffectClass.NON_IDEMPOTENT }),
+    });
+    const executeStep = vi.fn();
+
+    const result = await executeWorkflow(workflow, {}, executeStep, repo as any);
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain('not read-only');
+    }
+    expect(executeStep).not.toHaveBeenCalled();
+  });
+
+  it('rejects nested workflows during preflight', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [{ skillId: 'nested' }],
+    };
+    const repo = makeRepo({
+      nested: makeSkill('nested', { workflowSpec: { steps: [] } }),
+    });
+
+    const result = await executeWorkflow(workflow, {}, vi.fn(), repo as any);
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain('cannot reference another workflow');
+    }
+  });
+
+  it('rejects inactive step skills during preflight', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [{ skillId: 'draft-step' }],
+    };
+    const repo = makeRepo({
+      'draft-step': makeSkill('draft-step', { status: SkillStatus.DRAFT }),
+    });
+
+    const result = await executeWorkflow(workflow, {}, vi.fn(), repo as any);
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain('not active');
+    }
+  });
+
+  it('rejects unknown named step references during preflight', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'detail', name: 'detail', paramMapping: { id: '$steps.search.data.id' } },
+      ],
+    };
+    const repo = makeRepo({
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn();
+
+    const result = await executeWorkflow(workflow, {}, executeStep, repo as any);
+
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain("unknown step 'search'");
+      expect(result.failedAtStep).toBe('detail');
+    }
+    expect(executeStep).not.toHaveBeenCalled();
+  });
+
+  it('rejects $prev references in the first step during preflight', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'detail', name: 'detail', paramMapping: { id: '$prev.data.id' } },
+      ],
+    };
+    const repo = makeRepo({
+      detail: makeSkill('detail'),
+    });
+    const executeStep = vi.fn();
+
+    const result = await executeWorkflow(workflow, {}, executeStep, repo as any);
+
+    expect(result.success).toBe(false);
+    if (!result.success) {
+      expect(result.error).toContain('there is no previous step result');
+      expect(result.failedAtStep).toBe('detail');
+    }
+    expect(executeStep).not.toHaveBeenCalled();
+  });
+
+  it('propagates browser handoff results from a step without wrapping', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [{ skillId: 'search' }],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+    });
+
+    const result = await executeWorkflow(workflow, {}, vi.fn().mockResolvedValue({
+      success: false,
+      status: 'browser_handoff_required',
+      siteId: 'example.com',
+      url: 'https://example.com',
+      hint: 'Complete login',
+      latencyMs: 1,
+    }), repo as any);
+
+    expect('status' in result && result.status === 'browser_handoff_required').toBe(true);
+    if ('status' in result && result.status === 'browser_handoff_required') {
+      expect(result.hint).toBe('Complete login');
+    }
+  });
+
+  it('retries a rate-limited workflow step once before succeeding', async () => {
+    vi.useFakeTimers();
+
+    try {
+      const workflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+          { skillId: 'detail', name: 'detail', paramMapping: { id: '$prev.data.id' } },
+        ],
+      };
+      const repo = makeRepo({
+        search: makeSkill('search'),
+        detail: makeSkill('detail'),
+      });
+      const executeStep = vi.fn()
+        .mockResolvedValueOnce({ success: true, data: { id: 'user-1' }, latencyMs: 1 })
+        .mockResolvedValueOnce({
+          success: false,
+          error: 'rate limited',
+          failureCause: 'rate_limited',
+          failureDetail: 'Retry after 100ms',
+          latencyMs: 1,
+        })
+        .mockResolvedValueOnce({ success: true, data: { name: 'Ada' }, latencyMs: 1 });
+
+      const resultPromise = executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+      await vi.advanceTimersByTimeAsync(150);
+      const result = await resultPromise;
+
+      expect(result.success).toBe(true);
+      if (result.success) {
+        expect(result.data).toEqual({ name: 'Ada' });
+      }
+      expect(executeStep).toHaveBeenCalledTimes(3);
+      expect(executeStep).toHaveBeenNthCalledWith(2, 'detail', { id: 'user-1' });
+      expect(executeStep).toHaveBeenNthCalledWith(3, 'detail', { id: 'user-1' });
+    } finally {
+      vi.useRealTimers();
+    }
+  });
+
+  it('returns the final rate-limit failure when the retry also fails', async () => {
+    vi.useFakeTimers();
+
+    try {
+      const workflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' } },
+          { skillId: 'detail', name: 'detail', paramMapping: { id: '$prev.data.id' } },
+        ],
+      };
+      const repo = makeRepo({
+        search: makeSkill('search'),
+        detail: makeSkill('detail'),
+      });
+      const executeStep = vi.fn()
+        .mockResolvedValueOnce({ success: true, data: { id: 'user-1' }, latencyMs: 1 })
+        .mockResolvedValueOnce({
+          success: false,
+          error: 'rate limited',
+          failureCause: 'rate_limited',
+          failureDetail: 'Retry after 100ms',
+          latencyMs: 1,
+        })
+        .mockResolvedValueOnce({
+          success: false,
+          error: 'still rate limited',
+          failureCause: 'rate_limited',
+          failureDetail: 'Retry after 100ms',
+          latencyMs: 1,
+        });
+
+      const resultPromise = executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+      await vi.advanceTimersByTimeAsync(150);
+      const result = await resultPromise;
+
+      expect(result.success).toBe(false);
+      if (!result.success) {
+        expect(result.failedAtStep).toBe('detail');
+        expect(result.error).toBe('still rate limited');
+        expect(result.failureCause).toBe('rate_limited');
+      }
+      expect(executeStep).toHaveBeenCalledTimes(3);
+    } finally {
+      vi.useRealTimers();
+    }
+  });
+
+  it('reuses cached step results within the ttl window across workflow runs', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        { skillId: 'search', name: 'search-1', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 1_000 } },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+    });
+    const executeStep = vi.fn().mockResolvedValue({
+      success: true,
+      data: { items: [{ id: 'user-1' }] },
+      latencyMs: 1,
+    });
+    const cache: WorkflowStepCacheStore = new Map();
+
+    const first = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any, cache);
+    const second = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any, cache);
+    expect(first.success).toBe(true);
+    expect(second.success).toBe(true);
+    expect(executeStep).toHaveBeenCalledTimes(1);
+  });
+
+  it('does not reuse cached entries across workflows when ttl contracts differ', async () => {
+    vi.useFakeTimers();
+    vi.setSystemTime(new Date('2026-01-01T00:00:00.000Z'));
+
+    try {
+      const longTtlWorkflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 60_000 } },
+        ],
+      };
+      const shortTtlWorkflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 1_000 } },
+        ],
+      };
+      const repo = makeRepo({
+        search: makeSkill('search'),
+      });
+      const executeStep = vi.fn().mockResolvedValue({
+        success: true,
+        data: { items: [{ id: 'user-1' }] },
+        latencyMs: 1,
+      });
+      const cache: WorkflowStepCacheStore = new Map();
+
+      await executeWorkflow(longTtlWorkflow, { query: 'ada' }, executeStep, repo as any, cache);
+      vi.advanceTimersByTime(1_500);
+      await executeWorkflow(shortTtlWorkflow, { query: 'ada' }, executeStep, repo as any, cache);
+
+      expect(executeStep).toHaveBeenCalledTimes(2);
+    } finally {
+      vi.useRealTimers();
+    }
+  });
+
+  it('reuses cached entries across workflows when a later ttl contract is longer and the result is still fresh', async () => {
+    vi.useFakeTimers();
+    vi.setSystemTime(new Date('2026-01-01T00:00:00.000Z'));
+
+    try {
+      const shortTtlWorkflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 1_000 } },
+        ],
+      };
+      const longTtlWorkflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 60_000 } },
+        ],
+      };
+      const repo = makeRepo({
+        search: makeSkill('search'),
+      });
+      const executeStep = vi.fn().mockResolvedValue({
+        success: true,
+        data: { items: [{ id: 'user-1' }] },
+        latencyMs: 1,
+      });
+      const cache: WorkflowStepCacheStore = new Map();
+
+      await executeWorkflow(shortTtlWorkflow, { query: 'ada' }, executeStep, repo as any, cache);
+      vi.advanceTimersByTime(500);
+      await executeWorkflow(longTtlWorkflow, { query: 'ada' }, executeStep, repo as any, cache);
+
+      expect(executeStep).toHaveBeenCalledTimes(1);
+    } finally {
+      vi.useRealTimers();
+    }
+  });
+
+  it('prunes expired cache entries even when later workflows use different keys', async () => {
+    vi.useFakeTimers();
+    vi.setSystemTime(new Date('2026-01-01T00:00:00.000Z'));
+
+    try {
+      const workflow: WorkflowSpec = {
+        steps: [
+          { skillId: 'search', name: 'search', paramMapping: { query: '$initial.query' }, cache: { ttlMs: 1_000 } },
+        ],
+      };
+      const repo = makeRepo({
+        search: makeSkill('search'),
+      });
+      const executeStep = vi.fn().mockResolvedValue({
+        success: true,
+        data: { items: [{ id: 'user-1' }] },
+        latencyMs: 1,
+      });
+      const cache: WorkflowStepCacheStore = new Map();
+
+      await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any, cache);
+      expect(Array.from(cache.entries())).toHaveLength(1);
+
+      vi.advanceTimersByTime(24 * 60 * 60 * 1000 + 1);
+      await executeWorkflow(workflow, { query: 'grace' }, executeStep, repo as any, cache);
+
+      const entries = Array.from(cache.entries());
+      expect(entries).toHaveLength(1);
+      expect(entries[0][0]).toContain('"grace"');
+    } finally {
+      vi.useRealTimers();
+    }
+  });
+
+  it('applies per-step transforms after cache lookup so steps do not contaminate each other', async () => {
+    const workflow: WorkflowSpec = {
+      steps: [
+        {
+          skillId: 'search',
+          name: 'price',
+          paramMapping: { query: '$initial.query' },
+          cache: { ttlMs: 1_000 },
+          transform: { type: 'regex', expression: 'price=(?<price>\\d+)' },
+        },
+        {
+          skillId: 'search',
+          name: 'currency',
+          paramMapping: { query: '$initial.query' },
+          cache: { ttlMs: 1_000 },
+          transform: { type: 'regex', expression: 'currency=(?<currency>[A-Z]+)' },
+        },
+      ],
+    };
+    const repo = makeRepo({
+      search: makeSkill('search'),
+    });
+    const executeStep = vi.fn().mockResolvedValue({
+      success: true,
+      data: 'price=123 currency=USD',
+      latencyMs: 1,
+    });
+
+    const result = await executeWorkflow(workflow, { query: 'ada' }, executeStep, repo as any);
+
+    expect(result.success).toBe(true);
+    expect(executeStep).toHaveBeenCalledTimes(1);
+    if (result.success) {
+      expect(result.stepResults[0].data).toEqual({ price: '123' });
+      expect(result.stepResults[1].data).toEqual({ currency: 'USD' });
+      expect(result.data).toEqual({ currency: 'USD' });
+    }
+  });
+});
diff --git a/vitest.config.ts b/vitest.config.ts
index 6ae9856..236aa95 100644
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -8,7 +8,9 @@ export default defineConfig({
   test: {
     globals: true,
     environment: 'node',
+    globalSetup: ['tests/global-setup.ts'],
     include: ['tests/**/*.test.ts'],
+    exclude: ['tests/live/**'],
     coverage: {
       provider: 'v8',
       include: ['src/**/*.ts'],
diff --git a/vitest.live.config.ts b/vitest.live.config.ts
new file mode 100644
index 0000000..266c601
--- /dev/null
+++ b/vitest.live.config.ts
@@ -0,0 +1,28 @@
+/**
+ * Vitest config for live integration tests.
+ *
+ * Usage: npx vitest run --config vitest.live.config.ts
+ * Or:    npx vitest run --config vitest.live.config.ts tests/live/httpbin.test.ts
+ *
+ * These tests hit real HTTP endpoints — do NOT run in CI.
+ */
+import { defineConfig } from 'vitest/config';
+import path from 'path';
+import { fileURLToPath } from 'url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+export default defineConfig({
+  test: {
+    globals: true,
+    environment: 'node',
+    include: ['tests/live/**/*.test.ts'],
+    testTimeout: 30000,
+    hookTimeout: 30000,
+  },
+  resolve: {
+    alias: {
+      '@schrute': path.resolve(__dirname, 'src'),
+    },
+  },
+});