steipete · xrliAnnie · Mar 8, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,108 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Oracle (`@steipete/oracle`) is a CLI tool that wraps OpenAI's Responses API to query multiple AI models (GPT-5.x, Gemini 3.x, Claude 4.x) with file context. It supports API mode, browser automation (ChatGPT/Gemini via Chrome DevTools Protocol), MCP server integration, and remote bridge execution.
+
+## Commands
+
+```bash
+# Package manager: pnpm (10.23.0)
+pnpm install              # Install dependencies
+
+# Build
+pnpm run build            # TypeScript compile + copy vendor files
+
+# Lint & Format (oxlint + oxfmt, NOT ESLint/Prettier)
+pnpm run check            # format:check + lint (runs in CI)
+pnpm run lint             # typecheck + oxlint
+pnpm run lint:fix         # oxlint --fix + oxfmt
+pnpm run format           # oxfmt --write
+pnpm run typecheck        # tsc --noEmit
+
+# Tests (Vitest)
+pnpm test                 # Run all unit tests
+pnpm vitest run tests/oracle/run.test.ts          # Single test file
+pnpm vitest run -t "test name pattern"            # Single test by name
+pnpm test:coverage        # Unit tests with v8 coverage
+pnpm test:mcp             # Build + MCP unit + mcporter integration
+pnpm test:browser         # Browser automation smokes (needs Chrome on port 45871)
+ORACLE_LIVE_TEST=1 pnpm test:live                 # Live API tests (costs real tokens)
+ORACLE_LIVE_TEST=1 pnpm test:pro                  # Pro model tests (10+ min)
+```
+
+## Architecture
+
+```
+bin/
+  oracle-cli.ts          # CLI entry point (commander-based, 1700+ lines)
+  oracle-mcp.ts          # MCP server entry point
+
+src/
+  oracle/                # Core engine
+    run.ts               # Main orchestrator — assembles prompt, calls API, streams response
+    client.ts            # API client factory (OpenAI, Azure, Gemini, custom endpoints)
+    modelResolver.ts     # Model name → provider routing logic
+    files.ts             # File globbing + token estimation
+    multiModelRunner.ts  # Parallel multi-model execution
+    gemini.ts / claude.ts  # Provider-specific adapters
+
+  browser/               # Chrome DevTools Protocol automation
+    index.ts             # Core browser orchestrator (largest file)
+    chromeLifecycle.ts   # Chrome launch/teardown via chrome-launcher
+    cookies.ts           # Cookie sync (sweet-cookie for macOS Keychain)
+    reattach.ts          # Session recovery on navigation/crash
+    actions/             # DOM interaction modules
+      assistantResponse.ts   # Capture AI response from page
+      attachments.ts         # File/image upload automation
+      promptComposer.ts      # Type prompt into chat input
+      modelSelection.ts      # Pick model from ChatGPT dropdown
+      navigation.ts          # URL/iframe handling
+    providers/           # DOM selector definitions per site
+      chatgptDomProvider.ts
+      geminiDeepThinkDomProvider.ts
+
+  cli/                   # CLI layer
+    options.ts           # Commander option definitions
+    sessionRunner.ts     # Executes a single oracle run
+    sessionDisplay.ts    # Terminal output rendering
+    browserConfig.ts     # Browser flag aggregation
+    tui/                 # Interactive terminal UI (excluded from coverage)
+
+  gemini-web/            # Browser-based Gemini client (no API key needed)
+  remote/                # Remote Chrome bridge (server + client)
+  bridge/                # MCP/Codex bridge connection
+  mcp/                   # Model Context Protocol server + tools
+  sessionManager.ts      # Session CRUD (stored in ~/.oracle/sessions/)
+  config.ts              # Global config (~/.oracle/config.json, JSON5)
+```
+
+### Key Patterns
+
+- **Engine selection**: API (default when `OPENAI_API_KEY` set) vs Browser (Chrome automation). Controlled by `--engine api|browser` or `ORACLE_ENGINE` env var.
+- **Model routing**: `modelResolver.ts` maps model strings to providers. Supports OpenAI, Azure OpenAI, Gemini (API + web), Claude, OpenRouter, Grok, and custom endpoints.
+- **Session persistence**: Every run creates a session under `~/.oracle/sessions/<id>/` with metadata, prompt, and response. Sessions can be listed (`oracle status`), replayed (`oracle session <id>`), or restarted (`oracle restart <id>`).
+- **Path aliases**: `@src/*` → `src/*`, `@tests/*` → `tests/*` (configured in tsconfig.json and vitest.config.ts).
+
+## Code Style
+
+- **Formatter**: oxfmt — 2 spaces, 100 char width, double quotes, trailing commas, semicolons.
+- **Linter**: oxlint with plugins: unicorn, typescript, oxc. Categories correctness/perf/suspicious = error.
+- **TypeScript**: Strict mode, ES2022 target, ESNext modules, bundler resolution.
+- **Module system**: ESM (`"type": "module"` in package.json). Use `.ts` extensions in imports.
+
+## Testing Notes
+
+- Test setup (`tests/setup-env.ts`) injects fake API keys and isolates session storage to `/tmp/oracle-tests-{pid}`. Non-live tests never hit real APIs.
+- Live tests are opt-in via `ORACLE_LIVE_TEST=1` env var and require real API keys.
+- Browser smoke tests expect Chrome on DevTools port 45871.
+- MCP tests require building first (`pnpm run build`).
+
+## AGENTS.md Highlights
+
+- CLI banner uses the oracle emoji: `🧿 oracle (<version>) ...` — only on initial headline and TUI exit.
+- Browser Pro runs: never click "Answer now" — wait for the real response (up to 10 min).
+- Before release, check `docs/manual-tests.md` for relevant smoke tests.
+- After finishing a feature, update CHANGELOG if it affects end users (read top ~100 lines first, group related edits).
diff --git a/bin/oracle-cli.ts b/bin/oracle-cli.ts
@@ -142,6 +142,7 @@ interface CliOptions extends OptionValues {
   browserManualLogin?: boolean;
   browserManualLoginProfileDir?: string;
   browserThinkingTime?: "light" | "standard" | "extended" | "heavy";
+  deepResearch?: boolean;
   browserAllowCookieErrors?: boolean;
   browserAttachments?: string;
   browserInlineFiles?: boolean;
@@ -592,6 +593,13 @@ program
       .choices(["light", "standard", "extended", "heavy"])
       .hideHelp(),
   )
+  .option(
+    "--deep-research",
+    "Use ChatGPT Deep Research mode (browser engine only). " +
+      "Activates autonomous web research that takes 5-30 minutes. " +
+      "Requires ChatGPT Plus or Pro subscription.",
+    false,
+  )
   .addOption(
     new Option(
       "--browser-allow-cookie-errors",
@@ -1328,6 +1336,17 @@ async function runRootCommand(options: CliOptions): Promise<void> {
     options.baseUrl = userConfig.apiBaseUrl;
   }
 
+  // --deep-research implies browser engine and validates constraints
+  if (options.deepResearch) {
+    if (engine !== "browser" && preferredEngine === "api") {
+      throw new Error("--deep-research requires --engine browser.");
+    }
+    engine = "browser";
+    if (options.models && options.models.length > 0) {
+      throw new Error("--deep-research cannot be combined with --models (multi-model runs).");
+    }
+  }
+
   if (remoteHost && engine !== "browser") {
     throw new Error("--remote-host requires --engine browser.");
   }
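
Review note: the guard added in the hunk above can be read as a small pure function, which makes the two constraints easier to unit-test. The sketch below is an illustration only. `resolveDeepResearchEngine`, `DeepResearchInput`, and the `Engine` type are hypothetical names that do not exist in the diff; only the conditions and error messages are taken from it.

```typescript
// Hypothetical types for illustration; the real CLI uses CliOptions and local variables.
type Engine = "api" | "browser";

interface DeepResearchInput {
  deepResearch: boolean;
  engine: Engine; // engine resolved so far
  preferredEngine?: Engine; // engine the user asked for explicitly, if any
  models?: string[]; // --models list (multi-model runs)
}

// Returns the engine to use, or throws when --deep-research conflicts with other flags.
function resolveDeepResearchEngine(input: DeepResearchInput): Engine {
  if (!input.deepResearch) {
    return input.engine; // flag not set: leave the engine untouched
  }
  // Reject only when the user explicitly asked for the API engine.
  if (input.engine !== "browser" && input.preferredEngine === "api") {
    throw new Error("--deep-research requires --engine browser.");
  }
  if (input.models && input.models.length > 0) {
    throw new Error("--deep-research cannot be combined with --models (multi-model runs).");
  }
  // Otherwise the flag implies the browser engine.
  return "browser";
}
```

Under these assumptions, `--deep-research` silently upgrades an implicit API default to the browser engine, and fails only when the API engine was requested explicitly or `--models` is present.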

diff --git a/docs/deep-research-plan/00-overview.md b/docs/deep-research-plan/00-overview.md
@@ -0,0 +1,117 @@
+# Deep Research Browser Automation — Implementation Plan
+
+## Goal
+
+Add ChatGPT Deep Research support to Oracle's browser automation engine, enabling users to trigger Deep Research from the CLI and receive structured research reports — all using their existing ChatGPT subscription (no API cost).
+
+## Motivation
+
+- ChatGPT Deep Research is a powerful autonomous research agent that browses the web for 5-30 minutes and produces comprehensive cited reports
+- OpenAI offers a Deep Research API (`o3-deep-research`, `o4-mini-deep-research`), but usage is billed per token (~$10/M input + $40/M output tokens)
+- Users with ChatGPT Plus/Pro subscriptions already have Deep Research included — browser automation lets them use it programmatically at no extra cost
+- Oracle already has mature ChatGPT browser automation; extending it for Deep Research is a natural fit
+
+## Usage
+
+```bash
+# Basic Deep Research
+oracle --deep-research -p "Research the latest trends in AI agent frameworks in 2026"
+
+# With file context
+oracle --deep-research -p "Analyze this codebase architecture" --file "src/**/*.ts"
+
+# With custom timeout (default 40 minutes)
+oracle --deep-research --timeout 60m -p "Comprehensive market analysis of EV industry"
+```
+
+## Architecture Decision: Iframe Handling
+
+The research plan confirmation UI renders in a **cross-origin iframe** (640x400px), making direct DOM manipulation from the main page impossible. Three options were evaluated:
+
+| Option | Approach | Complexity | Robustness |
+|--------|----------|------------|------------|
+| **A. Wait for auto-confirm** | Start button has ~60s countdown that auto-confirms | Low | High |
+| B. CDP iframe targeting | Use `Target.getTargets()` to locate the iframe target, then attach to its execution context | High | Medium |
+| C. Coordinate-based clicking | Use `Input.dispatchMouseEvent` at computed coordinates | Medium | Low |
+
+**Decision: Option A.** The auto-confirm countdown eliminates the need to interact with the iframe at all. Once the iframe has been detected, simply wait for the countdown to expire (~60 seconds, plus roughly 10 seconds of margin) and let the plan auto-confirm. This is the most robust approach and matches natural user behavior.
+
+## Implementation Phases
+
+| Phase | Scope | Doc |
+|-------|-------|-----|
+| 1 | Types, Config, CLI Flag | [01-types-and-config.md](01-types-and-config.md) |
+| 2 | Core Action Module (`deepResearch.ts`) | [02-core-actions.md](02-core-actions.md) |
+| 3 | Main Flow Integration (`index.ts`) | [03-flow-integration.md](03-flow-integration.md) |
+| 4 | Reattach & Session Support | [04-reattach-and-sessions.md](04-reattach-and-sessions.md) |
+| 5 | Testing Strategy | [05-testing.md](05-testing.md) |
+
+## UI Flow (Discovered via Live Exploration)
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ Phase 1: Activate Deep Research Mode                        │
+│                                                             │
+│  [+] button → radix dropdown → "Deep research" item         │
+│  Result: "Thinking" pill → "Deep research" pill             │
+│          + "Apps" and "Sites" buttons appear                │
+└────────────────────────┬────────────────────────────────────┘
+                         │
+┌────────────────────────▼────────────────────────────────────┐
+│ Phase 2: Submit Prompt                                      │
+│                                                             │
+│  Type prompt in textbox → click [send-button]               │
+│  URL changes to /c/{conversation-id}                        │
+└────────────────────────┬────────────────────────────────────┘
+                         │
+┌────────────────────────▼────────────────────────────────────┐
+│ Phase 3: Research Plan (CROSS-ORIGIN IFRAME)                │
+│                                                             │
+│  ┌──────────────────────────────────────┐                   │
+│  │ "AI agent frameworks trends"         │                   │
+│  │ ○ Survey academic papers...          │                   │
+│  │ ○ Review documentation...            │                   │
+│  │ ○ Analyze blog posts...              │                   │
+│  │                                      │                   │
+│  │ [Edit]  [Cancel]  [Start (53)]       │                   │
+│  └──────────────────────────────────────┘                   │
+│  Auto-confirms after ~60 second countdown                   │
+└────────────────────────┬────────────────────────────────────┘
+                         │
+┌────────────────────────▼────────────────────────────────────┐
+│ Phase 4: Research Execution (5-30 minutes)                  │
+│                                                             │
+│  Status updates in iframe: "Researching..."                 │
+│  "Considering methods for framework comparison..."          │
+│  [Update] button visible in iframe                          │
+└────────────────────────┬────────────────────────────────────┘
+                         │
+┌────────────────────────▼────────────────────────────────────┐
+│ Phase 5: Report Complete                                    │
+│                                                             │
+│  Iframe disappears, full markdown report in conversation    │
+│  Copy/Rate buttons appear (FINISHED_ACTIONS_SELECTOR)       │
+│  Extract text via existing assistantResponse.ts             │
+└─────────────────────────────────────────────────────────────┘
+```
+
+## Key DOM Selectors
+
+| Element | Selector | Notes |
+|---------|----------|-------|
+| "+" button | `[data-testid="composer-plus-btn"]` | Opens radix dropdown |
+| Deep Research menu item | `[data-radix-collection-item]` text="Deep research" | No `data-testid` |
+| Deep Research pill | `.__composer-pill-composite` with aria "Deep research" | Replaces Thinking pill |
+| Send button | `[data-testid="send-button"]` | Same as normal chat |
+| Research plan iframe | `iframe.h-full.w-full` inside assistant turn | Cross-origin |
+| Completion indicator | `FINISHED_ACTIONS_SELECTOR` (copy/rate buttons) | Existing constant |
+
+## Risks and Mitigations
+
+| Risk | Mitigation |
+|------|-----------|
+| ChatGPT changes Deep Research UI selectors | Use text-match "Deep research" as primary; multiple fallback selectors |
+| Auto-confirm timer changes | Detect confirmation via iframe state change, not fixed timer |
+| Research exceeds timeout | Default 40min timeout; `--timeout` override; reattach mechanism for interrupted runs |
+| "+" button `data-testid` changes | Fallback: `button[aria-label*="Add files"]`, positional matching |
+| Deep Research unavailable for account tier | Clear error message with subscription requirement info |
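
Review note: the mitigation "detect confirmation via iframe state change, not fixed timer" could be sketched as a pure classifier over what the main page can observe. Because the plan iframe is cross-origin, the sketch assumes only two signals are visible from outside: whether the iframe is present, and whether the `FINISHED_ACTIONS_SELECTOR` buttons are visible. All names here (`Observation`, `Phase`, `classifyPhase`, `DEFAULT_TIMEOUT_MS`) are hypothetical and not part of the plan; the 40-minute default comes from the document.

```typescript
// What a poll of the main page might report (hypothetical shape).
interface Observation {
  iframePresent: boolean; // research plan / progress iframe is in the assistant turn
  finishedActionsVisible: boolean; // copy/rate buttons matched FINISHED_ACTIONS_SELECTOR
  elapsedMs: number; // time since the prompt was submitted
}

type Phase =
  | "waiting-for-plan" // prompt sent, iframe not rendered yet
  | "plan-or-research-in-progress" // iframe visible: countdown or research running
  | "report-complete" // iframe gone and finished actions visible
  | "timed-out";

const DEFAULT_TIMEOUT_MS = 40 * 60 * 1000; // 40 minutes, per the plan's default

function classifyPhase(obs: Observation, timeoutMs: number = DEFAULT_TIMEOUT_MS): Phase {
  // Completion wins over the timeout so a report that lands late is still captured.
  if (obs.finishedActionsVisible && !obs.iframePresent) {
    return "report-complete";
  }
  if (obs.elapsedMs >= timeoutMs) {
    return "timed-out";
  }
  return obs.iframePresent ? "plan-or-research-in-progress" : "waiting-for-plan";
}
```

A polling loop would call `classifyPhase` on each tick and stop on `report-complete` or `timed-out`. The plan countdown and the research run are deliberately merged into one phase, since this sketch assumes they cannot be told apart without entering the iframe.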