8 changes: 7 additions & 1 deletion bin/oracle-cli.ts
@@ -13,6 +13,7 @@ if (process.argv[2] === "oracle-mcp") {
}
import { resolveEngine, type EngineMode, defaultWaitPreference } from "../src/cli/engine.js";
import { shouldRequirePrompt } from "../src/cli/promptRequirement.js";
+import { resolveDashPrompt } from "../src/cli/stdin.js";
import chalk from "chalk";
import type { SessionMetadata, SessionMode, BrowserSessionConfig } from "../src/sessionStore.js";
import { sessionStore, pruneOldSessions } from "../src/sessionStore.js";
@@ -216,7 +217,7 @@ program.hook("preAction", () => {
introPrinted = true;
});
applyHelpStyling(program, VERSION, isTty);
-program.hook("preAction", (thisCommand) => {
+program.hook("preAction", async (thisCommand) => {
if (thisCommand !== program) {
return;
}
@@ -234,6 +235,11 @@ program.hook("preAction", (thisCommand) => {
opts.prompt = positional;
thisCommand.setOptionValue("prompt", positional);
}
+const resolvedPrompt = await resolveDashPrompt(opts.prompt);
+if (resolvedPrompt !== opts.prompt) {
+opts.prompt = resolvedPrompt;
+thisCommand.setOptionValue("prompt", resolvedPrompt);
+}
if (shouldRequirePrompt(userCliArgs, opts)) {
console.log(
chalk.yellow('Prompt is required. Provide it via --prompt "<text>" or positional [prompt].'),
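The hook above becomes `async` so it can await `resolveDashPrompt`, whose body is not part of this diff. The following is only a minimal sketch of what such a helper might look like, assuming that a prompt of `-` means "read the prompt from stdin" and that trailing whitespace is trimmed. The name `resolveDashPromptSketch`, the injectable `input` parameter, and the trimming behavior are all hypothetical and are not taken from `src/cli/stdin.js`.

```typescript
import { Readable } from "node:stream";

// Hypothetical sketch: the real resolveDashPrompt lives in src/cli/stdin.js
// and is not shown in this diff. Assumes "-" means "read prompt from stdin".
export async function resolveDashPromptSketch(
  prompt: string | undefined,
  input: Readable = process.stdin,
): Promise<string | undefined> {
  // Anything other than a lone dash is passed through untouched.
  if (prompt !== "-") {
    return prompt;
  }
  // Drain the stream completely before returning, which is why the caller must await.
  const chunks: Buffer[] = [];
  for await (const chunk of input) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(String(chunk)));
  }
  // Assumption: trailing newlines from shell pipes are not part of the prompt.
  return Buffer.concat(chunks).toString("utf8").trimEnd();
}
```

Because the sketch returns the original value when no dash is given, the `resolvedPrompt !== opts.prompt` comparison in the hook would only trigger an update when stdin was actually consumed.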
2 changes: 1 addition & 1 deletion docs/browser-mode.md
@@ -75,7 +75,7 @@ You can pass the same payload inline (`--browser-inline-cookies '<json or base64
- `--browser-inline-files`: alias for `--browser-attachments never` (forces inline paste; never uploads attachments).
- `--browser-bundle-files`: bundle all resolved attachments into a single temp file before uploading (only used when uploads are enabled/selected).
- sqlite bindings: automatic rebuilds now require `ORACLE_ALLOW_SQLITE_REBUILD=1`. Without it, the CLI logs instructions instead of running `pnpm rebuild` on your behalf.
-- `--model`: the same flag used for API runs is accepted, but the ChatGPT automation path supports GPT-5.4 and GPT-5.2 variants. Use `gpt-5.4-pro`, `gpt-5.4`, `gpt-5.2`, `gpt-5.2-thinking`, `gpt-5.2-instant`, or `gpt-5.2-pro`. Legacy Pro aliases still resolve to the latest Pro picker target.
+- `--model`: the same flag used for API runs is accepted, but the ChatGPT automation path supports GPT-5.4 and GPT-5.2 variants. Use `gpt-5.4-pro`, `gpt-5.4`, `gpt-5.2`, `gpt-5.2-thinking`, `gpt-5.2-instant`, or `gpt-5.2-pro`. Any ChatGPT Pro alias resolves to the current Pro picker target, so versioned Pro labels may briefly appear in the UI but settle on the single available Pro entry.
- Cookie sync is mandatory—if we can’t copy cookies from Chrome, the run exits early. Use the hidden `--browser-allow-cookie-errors` flag only when you’re intentionally running logged out (it skips the early exit but still warns).
- Experimental cookie controls (hidden flags/env):
- `--browser-cookie-names <comma-list>` or `ORACLE_BROWSER_COOKIE_NAMES`: allowlist which cookies to sync. Useful for “only NextAuth/Cloudflare, drop the rest.”
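The `--model` note above says every Pro alias lands on one current Pro picker target. A rough sketch of that idea follows. It is not the project's resolver: the function name, the suffix-based matching, and the constants are assumptions for illustration. The only labels used are ones mentioned elsewhere in this PR ("GPT-5.4 Pro" and "Instant 5.3").

```typescript
// Hypothetical constants: labels taken from the docs in this PR, not from the resolver source.
const CURRENT_PRO_TARGET = "GPT-5.4 Pro";
const CURRENT_INSTANT_TARGET = "Instant 5.3";

// Hypothetical sketch: maps a --model id to a picker label by suffix.
// Assumption: any id ending in "-pro" counts as a Pro alias.
function resolvePickerTargetSketch(model: string): string | undefined {
  const id = model.trim().toLowerCase();
  if (id.endsWith("-pro")) {
    return CURRENT_PRO_TARGET; // gpt-5.2-pro and gpt-5.4-pro share one target
  }
  if (id.endsWith("-instant")) {
    return CURRENT_INSTANT_TARGET;
  }
  return undefined; // other ids are out of scope for this sketch
}
```

Under these assumptions, `gpt-5.2-pro` and `gpt-5.4-pro` resolve to the same label, which matches the "single available Pro entry" wording.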
4 changes: 2 additions & 2 deletions docs/debug/remote-chrome.md
@@ -27,7 +27,7 @@ Outcome:
- After wiring fix, logs showed “Routing browser automation to remote host …” but requests failed with:
- `ECONNREFUSED 192.168.64.2:49810` when no service listening.
- `busy` when a previous service process was still bound.
-- Later run reached remote path but failed model switch: `Unable to find model option matching "GPT-5.2 Pro"` (remote Chrome not logged into ChatGPT / model picker mismatch).
+- Later run reached remote path but failed model switch: `Unable to find model option matching "GPT-5.4 Pro"` (remote Chrome not logged into ChatGPT / model picker mismatch).
- After disabling cookie shipping and requiring host login, remote runs now fail earlier: service logs “Loading ChatGPT cookies from host Chrome profile…” then reports `Unhandled promise rejection ... Unknown error` when `loadChromeCookies` runs on the VM. Remote client sees `socket hang up` because the server doesn’t deliver a result.

### 2) Remote service on VM
@@ -52,7 +52,7 @@ Actions taken on VM (tmux `vmssh`):

- Environment PATH: bun not on PATH for non-interactive shells caused `./runner` to fail; need to `export PATH="$HOME/.bun/bin:$PATH"` before starting service.
- Port collisions: prior listeners on 49810 caused ECONNREFUSED/busy.
-- Remote model switch failed: remote Chrome likely not signed into ChatGPT; model picker couldn’t find “GPT-5.2 Pro”.
+- Remote model switch failed: remote Chrome likely not signed into ChatGPT; model picker couldn’t find the current Pro picker target (“GPT-5.4 Pro” at the time).
- Keychain/cookie read now failing on VM: `loadChromeCookies` throws “Unknown error” when invoked from the server process (Node 25, SSH shell). When `oracle serve` runs from GUI Terminal it starts fine; under nohup/SSH it logs the rejection and remote runs hang.
- New behavior (post-fix): `oracle serve` exits early if it cannot load host ChatGPT cookies after opening chatgpt.com for login; sign in on the host and restart the service.

42 changes: 28 additions & 14 deletions docs/manual-tests.md
@@ -141,23 +141,37 @@ Document results (pass/fail, session IDs) in PR descriptions so reviewers can au

Run these four smoke tests whenever we touch browser automation:

-1. **GPT-5.2 simple prompt**
-`pnpm run oracle -- --engine browser --model "GPT-5.2" --prompt "Give me two short markdown bullet points about tables"`
-Expect two markdown bullets, no files/search referenced. Note the session ID (e.g., `give-me-two-short-markdown`).
+Fast-path note:
+- Tests 1-4 below are quick browser-path checks only. They use `gpt-5.2-instant`, which currently targets the ChatGPT Instant 5.3 picker. They are not a substitute for Pro validation.

-2. **GPT-5.2 simple prompt**
-`pnpm run oracle -- --engine browser --model gpt-5.2 --prompt "List two reasons Markdown is handy"`
-Confirm the answer arrives (and only once) even if it takes ~2–3 minutes.
+1. **Fast browser simple prompt**
+`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly one line and nothing else: pro-ok"`
+Expect the answer body to contain `pro-ok` verbatim on its own line. Note the session ID.

-3. **GPT-5.2 + attachment**
+2. **Fast browser exact-line prompt**
+`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly these three lines and nothing else:\n\`\`\`js\nconsole.log('thinking-ok')\n\`\`\`"`
+Confirm the answer includes the fenced `js` code block and `console.log('thinking-ok')` verbatim.
+
+3. **Fast browser + attachment**
Prepare `/tmp/browser-md.txt` with a short note, then run
-`pnpm run oracle -- --engine browser --model "GPT-5.2" --prompt "Summarize the key idea from the attached note" --file /tmp/browser-md.txt`
-Ensure upload logs show “Attachment queued” and the answer references the file contents explicitly.
+`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly one line and nothing else: note=<paste the file contents exactly>" --file /tmp/browser-md.txt`
+Ensure upload logs show “Attachment queued” and the answer contains `note=` plus the attached file contents exactly.

-4. **GPT-5.2 + attachment (verbose)**
+4. **Fast browser + attachment (verbose)**
Prepare `/tmp/browser-report.txt` with faux metrics, then run
-`pnpm run oracle -- --engine browser --model gpt-5.2 --prompt "Use the attachment to report current CPU and memory figures" --file /tmp/browser-report.txt --verbose`
-Verify verbose logs show attachment upload and the final answer matches the file data.
+`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly these two lines and nothing else:\nCPU=<value from file>\nMEMORY=<value from file>" --file /tmp/browser-report.txt --verbose`
+Verify verbose logs show attachment upload and the final answer contains the exact CPU and memory values from the file.
+
+### Pro browser smoke
+
+Run these when the change might affect Pro-specific behavior, long thinking, or reattach.
+
+1. **Pro markdown capture**
+`pnpm run oracle -- --engine browser --model gpt-5.4-pro --prompt "Return exactly these three lines and nothing else:\n\`\`\`js\nconsole.log('thinking-ok')\n\`\`\`"`
+Confirm the answer preserves the fenced `js` code block.
+
+2. **Pro reattach flow**
+Use `scripts/browser-smoke.sh` or run a manual `--browser-keep-browser` session with `gpt-5.4-pro`, then kill the controller and verify `oracle session <slug> --render-plain` still shows the expected answer.

Record session IDs and outcomes in the PR description (pass/fail, notable delays). This ensures reviewers can audit real runs.

@@ -248,8 +262,8 @@ These Vitest cases hit the real OpenAI API to exercise both transports:
export ORACLE_LIVE_TEST=1
pnpm vitest run tests/live/openai-live.test.ts
```
-2. The first two tests target the standard GPT-5 (`gpt-5.1` / `gpt-5.2`) foreground
-streaming paths. The later background tests send `gpt-5.4-pro` and `gpt-5.2-pro`
+2. The first two tests target the current fast browser picker path (`gpt-5.2-instant` aliasing
+to Instant 5.3). The later background tests send `gpt-5.4-pro` and `gpt-5.2-pro`
prompts and expect the CLI to stay in background mode until OpenAI finishes
(up to 30 minutes).
3. Watch the console for `Reconnected to OpenAI background response...` if
2 changes: 1 addition & 1 deletion docs/testing.md
@@ -2,7 +2,7 @@

- Unit/type tests: `pnpm test` (Vitest) and `pnpm run check` (typecheck).
- Gemini unit/regression: `pnpm vitest run tests/gemini.test.ts tests/gemini-web`.
-- Browser smokes: `pnpm test:browser` (builds, checks DevTools port 45871, then runs headful browser smokes with GPT-5.2 for most cases and GPT-5.4 Pro for the reattach + markdown checks). Requires a signed-in Chrome profile; runs headful but hides the window by default unless Chrome forces focus.
+- Browser smokes: `pnpm test:browser` (builds, checks DevTools port 45871, then runs headful browser smokes with the `gpt-5.2-instant` alias for fast path checks and GPT-5.4 Pro for the actual Pro/reattach + markdown checks). Requires a signed-in Chrome profile; runs headful but hides the window by default unless Chrome forces focus.
- Live API smokes: `ORACLE_LIVE_TEST=1 OPENAI_API_KEY=… pnpm test:live` (excludes OpenAI pro), `ORACLE_LIVE_TEST=1 OPENAI_API_KEY=… pnpm test:pro` (OpenAI pro live). Expect real usage/cost.
- Gemini web (cookie) live smoke: `ORACLE_LIVE_TEST=1 pnpm vitest run tests/live/gemini-web-live.test.ts` (requires a signed-in Chrome profile at `gemini.google.com`).
- MCP focused: `pnpm test:mcp` (builds then stdio smoke via mcporter).
31 changes: 29 additions & 2 deletions scripts/browser-smoke-upload-only.sh
@@ -3,12 +3,39 @@ set -euo pipefail

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
CMD=(node "$ROOT/dist/bin/oracle-cli.js" --engine browser --wait --heartbeat 0 --timeout 900 --browser-input-timeout 120000)
-FAST_MODEL="gpt-5.2"
+FAST_MODEL="gpt-5.2-instant"

+run_and_check_contains() {
+local label="$1"
+local expected="$2"
+shift 2
+local logfile
+logfile="$(mktemp -t oracle-browser-smoke-log)"
+if ! "$@" >"$logfile" 2>&1; then
+echo "[browser-smoke-upload-only] ${label}: command failed"
+cat "$logfile"
+rm -f "$logfile"
+exit 1
+fi
+if ! grep -Fq -- "$expected" "$logfile"; then
+echo "[browser-smoke-upload-only] ${label}: expected output missing: $expected"
+cat "$logfile"
+rm -f "$logfile"
+exit 1
+fi
+cat "$logfile"
+rm -f "$logfile"
+}
+
tmpfile="$(mktemp -t oracle-browser-smoke)"
echo "smoke-attachment" >"$tmpfile"

echo "[browser-smoke-upload-only] fast upload attachment (non-inline)"
-"${CMD[@]}" --model "$FAST_MODEL" --prompt "Read the attached file and return exactly one markdown bullet '- upload: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-upload --force
+run_and_check_contains \
+"fast upload attachment (non-inline)" \
+"upload=smoke-attachment" \
+"${CMD[@]}" --model "$FAST_MODEL" \
+--prompt "Return exactly one line and nothing else: upload=smoke-attachment" \
+--file "$tmpfile" --slug browser-smoke-upload --force

rm -f "$tmpfile"
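The `run_and_check_contains` helper in this script passes only when the captured log contains the expected fixed string (`grep -Fq`). For readers more at home in the CLI's own language, the same check can be sketched in TypeScript. The function name `findMissingNeedles` is hypothetical and does not exist in the repository.

```typescript
// Hypothetical helper mirroring the shell check: fixed-string containment, no regex.
// Returns the expectations that were NOT found, so an empty array means "pass".
function findMissingNeedles(output: string, needles: string[]): string[] {
  return needles.filter((needle) => !output.includes(needle));
}

// Example with a captured log similar to what the smoke script inspects.
const capturedLog = "Attachment queued\nupload=smoke-attachment\n";
const missing = findMissingNeedles(capturedLog, ["upload=smoke-attachment"]);
if (missing.length > 0) {
  throw new Error(`expected output missing: ${missing.join(", ")}`);
}
```

Like `grep -F`, this treats each expectation as a literal substring, so characters such as backticks or parentheses need no escaping.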
92 changes: 78 additions & 14 deletions scripts/browser-smoke.sh
@@ -3,25 +3,89 @@ set -euo pipefail

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
CMD=(node "$ROOT/dist/bin/oracle-cli.js" --engine browser --wait --heartbeat 0 --timeout 900 --browser-input-timeout 120000)
-FAST_MODEL="${ORACLE_BROWSER_SMOKE_FAST_MODEL:-gpt-5.2}"
+FAST_MODEL="${ORACLE_BROWSER_SMOKE_FAST_MODEL:-gpt-5.2-instant}"
PRO_MODEL="${ORACLE_BROWSER_SMOKE_PRO_MODEL:-gpt-5.4-pro}"
+# FAST_MODEL is for quick browser-path health checks only.
+# PRO_MODEL is kept separate for real Pro/reattach coverage.

+assert_output_contains() {
+local label="$1"
+local logfile="$2"
+shift 2
+for needle in "$@"; do
+if ! grep -Fq -- "$needle" "$logfile"; then
+echo "[browser-smoke] ${label}: expected output missing: $needle"
+cat "$logfile"
+rm -f "$logfile"
+exit 1
+fi
+done
+}
+
+run_and_check_contains() {
+local label="$1"
+shift
+local expectations=()
+while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
+expectations+=("$1")
+shift
+done
+shift
+local logfile
+logfile="$(mktemp -t oracle-browser-smoke-log)"
+if ! "$@" >"$logfile" 2>&1; then
+echo "[browser-smoke] ${label}: command failed"
+cat "$logfile"
+rm -f "$logfile"
+exit 1
+fi
+assert_output_contains "$label" "$logfile" "${expectations[@]}"
+cat "$logfile"
+rm -f "$logfile"
+}
+
tmpfile="$(mktemp -t oracle-browser-smoke)"
echo "smoke-attachment" >"$tmpfile"

-echo "[browser-smoke] fast upload attachment (non-inline)"
-"${CMD[@]}" --model "$FAST_MODEL" --browser-attachments always --prompt "Read the attached file and return exactly one markdown bullet '- upload: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-upload --force
-
-echo "[browser-smoke] fast simple"
-"${CMD[@]}" --model "$FAST_MODEL" --prompt "Return exactly one markdown bullet: '- pro-ok'." --slug browser-smoke-pro --force
-
-echo "[browser-smoke] fast with attachment preview (inline)"
-"${CMD[@]}" --model "$FAST_MODEL" --browser-inline-files --prompt "Read the attached file and return exactly one markdown bullet '- file: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-file --preview --force
-
-echo "[browser-smoke] pro standard markdown check"
-"${CMD[@]}" --model "$PRO_MODEL" --prompt "Return two markdown bullets and a fenced code block labeled js that logs 'thinking-ok'." --slug browser-smoke-thinking --force
-
-echo "[browser-smoke] reattach flow after controller loss"
+echo "[browser-smoke][fast] upload attachment (non-inline)"
+run_and_check_contains \
+"fast upload attachment (non-inline)" \
+"upload=smoke-attachment" \
+-- \
+"${CMD[@]}" --model "$FAST_MODEL" --browser-attachments always \
+--prompt "Return exactly one line and nothing else: upload=smoke-attachment" \
+--file "$tmpfile" --slug browser-smoke-upload --force
+
+echo "[browser-smoke][fast] simple"
+run_and_check_contains \
+"fast simple" \
+"pro-ok" \
+-- \
+"${CMD[@]}" --model "$FAST_MODEL" \
+--prompt "Return exactly one line and nothing else: pro-ok" \
+--slug browser-smoke-pro --force
+
+echo "[browser-smoke][fast] attachment preview (inline)"
+run_and_check_contains \
+"fast with attachment preview (inline)" \
+"file=smoke-attachment" \
+-- \
+"${CMD[@]}" --model "$FAST_MODEL" --browser-inline-files \
+--prompt "Return exactly one line and nothing else: file=smoke-attachment" \
+--file "$tmpfile" --slug browser-smoke-file --preview --force
+
+echo "[browser-smoke][pro] standard markdown check"
+run_and_check_contains \
+"pro standard markdown check" \
+'```js' \
+"console.log('thinking-ok')" \
+'```' \
+-- \
+"${CMD[@]}" --model "$PRO_MODEL" \
+--prompt $'Return exactly these three lines and nothing else:\n```js\nconsole.log(\'thinking-ok\')\n```' \
+--slug browser-smoke-thinking --force
+
+echo "[browser-smoke][pro] reattach flow after controller loss"
slug="browser-reattach-smoke"
meta="$HOME/.oracle/sessions/$slug/meta.json"
logfile="$(mktemp -t oracle-browser-reattach)"
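In this script, `run_and_check_contains` takes a label, then any number of expected strings, then a literal `--`, then the command to run. The argument split can be sketched in TypeScript as follows. The function name and return shape are hypothetical. One deliberate difference: when no `--` is present the sketch returns an empty command, whereas the shell version would hit a failing `shift` under `set -e`.

```typescript
interface SplitArgs {
  expectations: string[]; // strings that must appear in the output
  command: string[]; // the command and its arguments
}

// Hypothetical sketch of the "expectations -- command" calling convention.
function splitAtSeparator(args: string[], separator = "--"): SplitArgs {
  const index = args.indexOf(separator);
  if (index === -1) {
    // Assumption: treat everything as expectations when no separator is given.
    return { expectations: [...args], command: [] };
  }
  return {
    expectations: args.slice(0, index),
    command: args.slice(index + 1), // later "--" tokens stay part of the command
  };
}
```

Only the first `--` is treated as the separator, so a command that itself contains `--` (for example `pnpm run oracle -- --engine browser`) would be passed through intact.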