Problem
OpenChrome has outcome contracts, plan execution, and hinting, but task boundaries are still spread across natural language instructions, tool arguments, and optional success checks. This makes it harder for agents to know when to stop, when a tool sequence falls outside the intended action space, or whether repeated observation calls amount to wandering.
DSPy-style lesson to adopt: separate the task interface from the prompt/instruction wording. OpenChrome should add a deterministic, typed browser task signature that declares inputs, allowed tools, success contract, stop conditions, and loop guards.
Direction / fit
Tier: core for schema validation and deterministic evaluation; pilot only if later wired to policy/retry execution.
No LLM judgment, no DSPy dependency, no prompt optimizer.
Builds on existing src/contracts/** assertions rather than replacing them.
Should reduce wandering by making success and stop conditions explicit in machine-readable form.
Goal
Add a BrowserTaskSignature schema and validator that can be attached to compiled plans, workflow runs, or explicit tool calls to guide deterministic progress evaluation and response metadata.
Proposed implementation
Define a schema, e.g. src/contracts/task-signature.ts:
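A minimal sketch of what such a schema could look like in TypeScript. The field names follow the fixture signature used in Validation A; the union members that the fixture does not show (the "secret" redaction value and the non-string input types) are assumptions, not part of the proposal.

```typescript
// Hypothetical sketch of src/contracts/task-signature.ts.
// Field names mirror the Validation A fixture; everything else is assumed.

// "secret" is an assumed value; the fixture only shows "none".
type InputRedaction = "none" | "secret";

interface TaskInputSpec {
  type: "string" | "number" | "boolean"; // only "string" appears in the fixture
  required: boolean;
  redaction: InputRedaction;
}

// Only the dom_text success assertion from the fixture is modeled here.
interface DomTextAssertion {
  kind: "dom_text";
  selector: string;
  contains: string;
}

interface MaxObservationCallsGuard {
  kind: "max_observation_calls";
  limit: number; // observation calls tolerated inside the window
  window: number; // number of most recent tool calls inspected (assumed meaning)
}

interface TaskBudgets {
  maxToolCalls: number;
  maxWallMs: number;
}

interface BrowserTaskSignature {
  version: 1;
  id: string;
  description: string;
  inputs: Record<string, TaskInputSpec>;
  allowedTools: string[];
  success: DomTextAssertion;
  loopGuards: MaxObservationCallsGuard[];
  budgets: TaskBudgets;
}

// The Validation A fixture, typed against the sketch above.
const searchSuccess: BrowserTaskSignature = {
  version: 1,
  id: "fixture.search.success",
  description: "Search form reaches result state",
  inputs: { query: { type: "string", required: true, redaction: "none" } },
  allowedTools: ["navigate", "find", "interact", "read_page"],
  success: { kind: "dom_text", selector: "#result", contains: "Searched: cats" },
  loopGuards: [{ kind: "max_observation_calls", limit: 2, window: 4 }],
  budgets: { maxToolCalls: 8, maxWallMs: 30000 },
};
```

Keeping every field plain data means a signature can travel as JSON alongside a compiled plan without any runtime dependency.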
Add validator with batched errors, consistent with existing contract validator style.
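One possible shape for a batched validator, sketched under assumptions: `validateTaskSignature` and `ValidationResult` are hypothetical names, and because the existing contract validator style is not shown in this issue, the error format here is only a guess. The point illustrated is that every problem is collected before returning, rather than failing on the first one.

```typescript
// Hypothetical result shape: all problems are reported together.
interface ValidationResult {
  ok: boolean;
  errors: string[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Checks a subset of the signature fields and batches every error found.
function validateTaskSignature(raw: unknown): ValidationResult {
  if (!isRecord(raw)) {
    return { ok: false, errors: ["signature must be an object"] };
  }
  const errors: string[] = [];
  const { version, id, allowedTools, success, budgets } = raw;

  if (version !== 1) errors.push("version must be 1");
  if (typeof id !== "string" || id.length === 0) {
    errors.push("id must be a non-empty string");
  }
  if (
    !Array.isArray(allowedTools) ||
    allowedTools.length === 0 ||
    !allowedTools.every((tool) => typeof tool === "string")
  ) {
    errors.push("allowedTools must be a non-empty array of strings");
  }
  if (!isRecord(success) || typeof success.kind !== "string") {
    errors.push("success must be an object with a string kind");
  }
  if (!isRecord(budgets)) {
    errors.push("budgets must be an object");
  } else {
    for (const key of ["maxToolCalls", "maxWallMs"]) {
      const value = budgets[key];
      if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
        errors.push(`budgets.${key} must be a positive integer`);
      }
    }
  }
  return { ok: errors.length === 0, errors };
}
```

A caller can then surface the whole `errors` array in one response, so a plan author fixes a malformed signature in a single round trip.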
Add deterministic evaluator helper that consumes:
task signature;
current EvalContext;
recent tool call summary;
elapsed time/tool count.
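The inputs listed above could feed a pure function along these lines. This is a sketch under assumptions: `ProgressSnapshot` stands in for the real EvalContext plus the tool call summary, the status and reason names other than "success" are invented for illustration, and the loop guard is assumed to trip when more than `limit` of the last `window` tool calls are observation calls.

```typescript
// Hypothetical status shape; only "success" is named in this issue.
type TaskStatus =
  | { status: "success" }
  | { status: "in_progress" }
  | { status: "stopped"; reason: "loop_guard" | "budget_exhausted" };

interface LoopGuard {
  kind: "max_observation_calls";
  limit: number;
  window: number;
}

interface SignatureLimits {
  loopGuards: LoopGuard[];
  budgets: { maxToolCalls: number; maxWallMs: number };
}

// Stand-in for EvalContext and the recent tool call summary (assumed shape).
interface ProgressSnapshot {
  successHolds: boolean; // outcome of evaluating the success contract
  recentTools: string[]; // tool names, oldest first
  toolCalls: number;
  elapsedMs: number;
}

// Tools treated as observation-only; taken from the Validation B description.
const OBSERVATION_TOOLS = new Set(["read_page", "screenshot"]);

function evaluateProgress(sig: SignatureLimits, snap: ProgressSnapshot): TaskStatus {
  if (snap.successHolds) return { status: "success" };

  const { maxToolCalls, maxWallMs } = sig.budgets;
  if (snap.toolCalls >= maxToolCalls || snap.elapsedMs >= maxWallMs) {
    return { status: "stopped", reason: "budget_exhausted" };
  }

  for (const guard of sig.loopGuards) {
    // slice(-0) would return the whole array, so a zero window inspects nothing.
    const windowed = guard.window > 0 ? snap.recentTools.slice(-guard.window) : [];
    const observations = windowed.filter((tool) => OBSERVATION_TOOLS.has(tool)).length;
    if (observations > guard.limit) {
      return { status: "stopped", reason: "loop_guard" };
    }
  }
  return { status: "in_progress" };
}
```

Because the function reads only its arguments, the same snapshot always yields the same status, which is what makes the result usable as response metadata.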
Evaluator returns structured status and performs a preflight check before executing a signature-bound compiled plan. If a planned step uses a tool outside allowedTools, the plan must not execute and must return failure with a disallowed-tool reason:
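The preflight rule can be sketched as follows. `PlanStep`, `preflightPlan`, `runSignatureBoundPlan`, and the "disallowed_tool" reason string are hypothetical names chosen for illustration; the real compiled plan shape in OpenChrome may differ. What the sketch shows is that the whole plan is checked first, so no step runs when any step uses a tool outside allowedTools.

```typescript
// Hypothetical minimal plan step; real compiled plans carry more fields.
interface PlanStep {
  tool: string;
}

type PreflightResult =
  | { status: "ok" }
  | { status: "failure"; reason: "disallowed_tool"; tool: string; stepIndex: number };

// Scans every planned step before anything executes.
function preflightPlan(
  allowedTools: readonly string[],
  steps: readonly PlanStep[],
): PreflightResult {
  const allowed = new Set(allowedTools);
  const offending = steps.find((step) => !allowed.has(step.tool));
  if (offending === undefined) return { status: "ok" };
  return {
    status: "failure",
    reason: "disallowed_tool",
    tool: offending.tool,
    stepIndex: steps.indexOf(offending),
  };
}

// Executes the plan only if the preflight check passes.
function runSignatureBoundPlan(
  allowedTools: readonly string[],
  steps: readonly PlanStep[],
  executeStep: (step: PlanStep) => void,
): PreflightResult {
  const preflight = preflightPlan(allowedTools, steps);
  if (preflight.status === "failure") return preflight; // nothing has run yet
  for (const step of steps) executeStep(step);
  return preflight;
}
```

Checking the full plan up front, rather than step by step during execution, avoids leaving the page in a half-completed state when a later step would have been rejected.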
execute_plan may accept an optional signature ID/object and include status in result metadata.
workflow_status may surface signature progress if a workflow was initialized with one.
Acceptance criteria
BrowserTaskSignature schema is implemented and exported from a stable module.
execute_plan or workflow status is backward-compatible: no signature means no behavior change except absent metadata.
Required OpenChrome real-validation after implementation
Use local fixtures and a built OpenChrome server.
Setup
Validation A: success boundary
Create/use a fixture signature equivalent to:
{
  "version": 1,
  "id": "fixture.search.success",
  "description": "Search form reaches result state",
  "inputs": { "query": { "type": "string", "required": true, "redaction": "none" } },
  "allowedTools": ["navigate", "find", "interact", "read_page"],
  "success": { "kind": "dom_text", "selector": "#result", "contains": "Searched: cats" },
  "loopGuards": [{ "kind": "max_observation_calls", "limit": 2, "window": 4 }],
  "budgets": { "maxToolCalls": 8, "maxWallMs": 30000 }
}
Run execute_plan or the implemented signature-aware workflow against http://localhost:9995/search.html.
# Exact tool payload may differ; PR must include the final payload used.
# The result must expose taskSignature.status == "success".
jq -e '.result.content[0].text | fromjson | .taskSignature.status == "success"' /tmp/task-signature-success-response.json
Pass: signature status becomes success once the DOM assertion is true.
Validation B: loop guard catches wandering
Run a fixture sequence that performs repeated read_page/screenshot observations without progress.
Pass: loop guard returns a deterministic stop status before the global timeout.
Validation C: allowed tool boundary
Attempt to run a plan/signature that uses a tool outside allowedTools, e.g. javascript_tool when not allowed.
Pass: response identifies the disallowed tool and the disallowed step is not executed.
Validation D: backward compatibility
Run the same execute_plan or workflow without a task signature.
Pass: existing behavior remains unchanged when no signature is supplied.
Cleanup
Non-goals
Do not infer signatures from natural language in the server.
Do not replace the existing src/contracts/** assertions.
Do not add new irreversible-action policy beyond existing hooks; this issue only enforces the declared allowedTools boundary for signature-bound execution.
Self-review checklist for implementer
Is each status deterministic from tool calls and contract evaluators?
Does this reduce ambiguity for agents without adding server-side AI decisions?
Are no-signature paths byte/shape-compatible with existing clients?
Are secret-marked inputs excluded from logs and reports?
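For the self-review question about secret-marked inputs, a redaction helper might look like the sketch below. It assumes a "secret" value for the `redaction` field (the fixture only shows "none"), and `redactInputsForLog` is a hypothetical name. Inputs that the signature does not declare are redacted as well, so the helper fails closed.

```typescript
// "secret" is an assumed redaction value; the fixture only shows "none".
type Redaction = "none" | "secret";

interface InputSpec {
  type: string;
  required: boolean;
  redaction: Redaction;
}

const REDACTED = "[REDACTED]";

// Returns a copy of the input values that is safe to write to logs and reports.
function redactInputsForLog(
  specs: Record<string, InputSpec>,
  values: Record<string, unknown>,
): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const name of Object.keys(values)) {
    const spec = specs[name];
    // Only inputs explicitly declared with redaction "none" pass through.
    const visible = spec !== undefined && spec.redaction === "none";
    safe[name] = visible ? values[name] : REDACTED;
  }
  return safe;
}
```

Routing every log and report path through one helper of this kind makes the checklist item testable with a single fixture.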
Related DSPy-inspired harness hardening set
These issues are intentionally scoped to OpenChrome-native deterministic harnessing. They must not introduce DSPy/Python runtime dependencies or server-side LLM decisions.
Curated scope, overlap handling, and verification checklist
Scope classification
BrowserTaskSignature, a typed description of allowed task inputs, action space, expected evidence, and stop conditions.
feat/1049-task-signature). Amend that PR rather than duplicating work.
Overlap and conflict resolution
Implementation checklist
BrowserTaskSignature types/schema with explicit fields for task identity, allowed tools/actions, required evidence, success/failure boundaries, and optional progress requirements.
Success criteria
Post-merge OpenChrome live verification checklist