Why
The Bytebot analysis highlighted a simple but important long-running-task discipline: for repetitive work, track total items, completed items, failed items, cursor, and explicit stop conditions. OpenChrome already detects low-level stalling via ProgressTracker and workflow stale updates, but it can still let an LLM prematurely declare success after processing only a subset of a requested list. This issue adds an opt-in bulk progress contract that reduces wandering and premature completion for crawl, pagination, multi-site, and workflow tasks.
What
Add a contract object and helper APIs that TaskRun/workflow/batch tools can use to track repetitive progress:
expected item count or unknown-count cursor
completed item ids
failed item ids and reasons
stop condition
minimum completion threshold
finalization guard that rejects completion if criteria are unmet
This is a generic progress guard only; it must not add scheduling, browser control, worker orchestration, or another workflow engine.
Proposed contract
export interface BulkProgressContract {
  contract_id: string;
  run_id?: string;
  scope: 'task_run' | 'workflow' | 'batch' | 'crawl';
  expected_total?: number;
  min_completed?: number;
  stop_condition: string; // e.g. "no next page", "processed all input urls"
  item_key: string; // e.g. "url", "row_id", "profile_id"
  cursor?: string;
  completed: string[];
  failed: Array<{ item: string; reason: string; retryable?: boolean }>;
  last_progress_at: number;
  created_at: number;
  updated_at: number;
}

export interface CompletionGuardResult {
  allowed: boolean;
  reason?: string;
  missing_count?: number;
  failed_count?: number;
  suggested_next_action?: string;
}
Implementation notes
Store the contract as part of TaskRun metadata when run_id exists, or as a standalone JSON file under ~/.openchrome/progress-contracts/.
Provide internal helper functions for recordCompleted, recordFailed, updateCursor, and checkCompletionGuard.
Wire initially into TaskRun completion and one representative long-running tool path (crawl or workflow_collect) to keep scope bounded.
HintEngine consumes CompletionGuardResult from TaskRun completion attempts and emits a warning when completion is attempted too early.
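The helper functions named in the notes above could look roughly like the following. This is a minimal sketch, not the module's actual code: it assumes the helpers are pure functions that return an updated copy of the contract, that completing an item is idempotent and clears an earlier failure for the same item, and that only completions and cursor moves bump last_progress_at. The explicit `now` parameter is a hypothetical addition to make the helpers testable.

```typescript
// Contract shape copied from the proposal above.
interface BulkProgressContract {
  contract_id: string;
  run_id?: string;
  scope: 'task_run' | 'workflow' | 'batch' | 'crawl';
  expected_total?: number;
  min_completed?: number;
  stop_condition: string;
  item_key: string;
  cursor?: string;
  completed: string[];
  failed: Array<{ item: string; reason: string; retryable?: boolean }>;
  last_progress_at: number;
  created_at: number;
  updated_at: number;
}

type FailedItem = BulkProgressContract['failed'][number];

// Assumption: recording the same item twice is a no-op, and a later
// success removes any earlier failure entry for that item.
function recordCompleted(
  c: BulkProgressContract,
  item: string,
  now: number = Date.now(),
): BulkProgressContract {
  if (c.completed.indexOf(item) !== -1) return c;
  return {
    ...c,
    completed: [...c.completed, item],
    failed: c.failed.filter((f) => f.item !== item),
    last_progress_at: now,
    updated_at: now,
  };
}

// Assumption: a completed item cannot later be marked failed, and a repeated
// failure replaces the earlier entry. A failure does not count as progress.
function recordFailed(
  c: BulkProgressContract,
  item: string,
  reason: string,
  retryable?: boolean,
  now: number = Date.now(),
): BulkProgressContract {
  if (c.completed.indexOf(item) !== -1) return c;
  const entry: FailedItem =
    retryable === undefined ? { item, reason } : { item, reason, retryable };
  return {
    ...c,
    failed: [...c.failed.filter((f) => f.item !== item), entry],
    updated_at: now,
  };
}

// Moving the cursor counts as progress in this sketch.
function updateCursor(
  c: BulkProgressContract,
  cursor: string,
  now: number = Date.now(),
): BulkProgressContract {
  return { ...c, cursor, last_progress_at: now, updated_at: now };
}
```

Returning a new object on each call keeps the helpers easy to unit test and leaves persistence (TaskRun metadata or a JSON file) to the caller.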
Acceptance criteria
BulkProgressContract type and storage/helper module implemented with unit tests.
Completion guard blocks completion when expected_total is known and completed.length + failed.length < expected_total.
Completion guard blocks completion when min_completed is unmet.
Unknown-total mode allows completion only when a non-empty stop_condition is explicitly marked satisfied.
Failed items are retained in the final result and do not count as completed.
Bounded storage: completed/failed arrays are capped with truncation metadata for very large runs.
TaskRun integration: oc_task_run_complete returns a typed guard failure if the active bulk contract is incomplete, unless force:true and a reason are supplied.
HintEngine integration: an attempted premature completion produces a warning-level hint with the missing count and suggested next action.
Regression: normal short tasks without a bulk contract are unaffected.
npm run build && npm test green.
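The guard rules in the acceptance criteria above can be sketched as a single pure function. This is an illustration under stated assumptions, not the final implementation: the contract as proposed has no field that marks the stop condition as satisfied, so the sketch takes a hypothetical `stopConditionSatisfied` argument, and the reason and suggested-action strings are placeholders.

```typescript
interface CompletionGuardResult {
  allowed: boolean;
  reason?: string;
  missing_count?: number;
  failed_count?: number;
  suggested_next_action?: string;
}

// Only the fields the guard reads; a full BulkProgressContract
// satisfies this type structurally.
interface GuardableContract {
  expected_total?: number;
  min_completed?: number;
  stop_condition: string;
  completed: string[];
  failed: Array<{ item: string; reason: string; retryable?: boolean }>;
}

function checkCompletionGuard(
  c: GuardableContract,
  stopConditionSatisfied: boolean = false, // hypothetical: how "marked satisfied" is passed in
): CompletionGuardResult {
  const failed_count = c.failed.length;

  if (c.expected_total !== undefined) {
    // Known total: every item must be accounted for as completed or failed.
    const processed = c.completed.length + c.failed.length;
    if (processed < c.expected_total) {
      return {
        allowed: false,
        reason: 'items remain unprocessed',
        missing_count: c.expected_total - processed,
        failed_count,
        suggested_next_action: 'process the remaining items, then retry completion',
      };
    }
  } else if (c.stop_condition.trim() === '' || !stopConditionSatisfied) {
    // Unknown total: a non-empty stop condition must be explicitly satisfied.
    return {
      allowed: false,
      reason: 'stop condition not marked satisfied',
      failed_count,
      suggested_next_action: 'continue until the stop condition holds',
    };
  }

  // Failed items never count toward the minimum completion threshold.
  if (c.min_completed !== undefined && c.completed.length < c.min_completed) {
    return {
      allowed: false,
      reason: 'minimum completion threshold unmet',
      missing_count: c.min_completed - c.completed.length,
      failed_count,
      suggested_next_action: 'retry failed items or complete more items',
    };
  }

  return { allowed: true, failed_count };
}
```

With expected_total: 3 and one completed item, this sketch returns allowed: false with missing_count: 2, which matches the rejection described in the verification steps.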
Real verification after merge using OpenChrome
Start a TaskRun with the goal "Visit three URLs and collect their titles" and a bulk contract with expected_total: 3, item_key: 'url', and the three URLs: https://example.com, https://news.ycombinator.com/, https://www.iana.org/domains/reserved.
Visit only the first URL using navigate and read_page.
Record one completed item through oc_task_run_update or the new progress helper.
Attempt oc_task_run_complete.
Verify completion is rejected with missing_count: 2 and a suggested next action.
Visit the remaining two URLs, record both completed items, and call oc_task_run_complete again.
Verify completion succeeds and the final result includes all three completed item ids.
Repeat with one URL deliberately failed; verify completion succeeds only when the failed item is recorded with a reason and either the success criteria allow partial completion or force: true is supplied with a reason.
Confirm oc_task_run_get after restart still shows completed/failed items and cursor.
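The reject, retry, and force paths exercised by the steps above can be sketched as one decision function. Everything here is hypothetical: `resolveCompletion`, the `CompletionOutcome` shape, and the status names are illustrative only and do not describe what oc_task_run_complete actually returns.

```typescript
interface CompletionGuardResult {
  allowed: boolean;
  reason?: string;
  missing_count?: number;
  failed_count?: number;
  suggested_next_action?: string;
}

// Hypothetical result type for a completion attempt.
type CompletionOutcome =
  | { status: 'completed' }
  | { status: 'forced'; reason: string; guard: CompletionGuardResult }
  | { status: 'guard_failed'; guard: CompletionGuardResult };

function resolveCompletion(
  guard: CompletionGuardResult,
  opts: { force?: boolean; reason?: string } = {},
): CompletionOutcome {
  if (guard.allowed) return { status: 'completed' };

  // force: true alone is not enough; a non-empty reason must accompany it.
  const reason = (opts.reason ?? '').trim();
  if (opts.force === true && reason !== '') {
    return { status: 'forced', reason, guard };
  }

  // Typed guard failure: the caller gets the counts and the suggested next action.
  return { status: 'guard_failed', guard };
}
```

Keeping the guard result attached to the forced outcome is a design choice in this sketch, so that an override still records what was incomplete at the time.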
Out of scope
Automatic discovery of all items on arbitrary pages.
LLM-based evaluation of whether an item is complete.
Changing default behavior for tasks that do not opt into a bulk contract.
Dependencies
Can be implemented standalone for workflow/crawl if TaskRun is not yet merged.
Success definition
Merge is successful when OpenChrome can prevent an opt-in repetitive task from being marked complete before its declared item-level progress criteria are satisfied, with machine-readable recovery guidance for the LLM.
Curated scope, overlap handling, and verification checklist
Primary deliverable: a bulk progress contract that tracks total/completed/failed/skipped/cursor state and prevents premature completion when required items remain.
Implementation checklist
Define an opt-in bulk progress contract with expected total or item list, current cursor, completed count, failed/skipped count, and explicit completion condition.
Integrate with TaskRun/outcome-contract completion so success cannot be reported while required items remain incomplete.
Provide clear diagnostics showing which counts/items prevent completion.
Add tests for all-complete, partial-complete blocked, failed-item policy, cursor resume, and malformed contract input.
Keep output compact and avoid dumping full item lists unless explicitly requested or necessary for diagnostics.
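For the malformed-input case in the checklist above, a validator along these lines is one option. It is a sketch with assumed rules: `validateContractInput` is a hypothetical name, and requiring a non-empty stop_condition, non-negative integer counts, and min_completed no greater than expected_total are assumptions rather than requirements stated in this issue.

```typescript
const SCOPES: string[] = ['task_run', 'workflow', 'batch', 'crawl'];

function isNonNegativeInteger(v: unknown): v is number {
  return typeof v === 'number' && isFinite(v) && Math.floor(v) === v && v >= 0;
}

// Returns a list of human-readable problems; an empty list means the input is usable.
function validateContractInput(input: unknown): string[] {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return ['contract must be an object'];
  }
  const c = input as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of ['contract_id', 'stop_condition', 'item_key']) {
    const v = c[key];
    if (typeof v !== 'string' || v.trim() === '') {
      errors.push(`${key} must be a non-empty string`);
    }
  }

  const scope = c['scope'];
  if (typeof scope !== 'string' || SCOPES.indexOf(scope) === -1) {
    errors.push(`scope must be one of ${SCOPES.join(', ')}`);
  }

  const total = c['expected_total'];
  const min = c['min_completed'];
  if (total !== undefined && !isNonNegativeInteger(total)) {
    errors.push('expected_total must be a non-negative integer');
  }
  if (min !== undefined && !isNonNegativeInteger(min)) {
    errors.push('min_completed must be a non-negative integer');
  }
  // Assumed rule: a threshold above the total could never be met.
  if (isNonNegativeInteger(total) && isNonNegativeInteger(min) && min > total) {
    errors.push('min_completed must not exceed expected_total');
  }
  return errors;
}
```

Collecting every problem instead of throwing on the first one keeps the diagnostic compact while still telling the caller everything that needs fixing.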
Success criteria
A task with expected_total=3 cannot pass after processing only 1 or 2 required items.
A fully completed task can pass with concise evidence of counts and stop condition.
Failure/skipped policy is explicit and tested rather than inferred from prompt wording.
Existing non-bulk tasks continue to behave as before unless they opt into the contract.
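The "concise evidence of counts" mentioned above could be produced by a summary helper that reports counts and caps any item lists. This is a sketch only: `capItems`, `summarizeProgress`, the `CappedList` truncation metadata, and the default cap of 5 are hypothetical, and keeping the most recent items is an assumed policy.

```typescript
// Hypothetical truncation metadata for a capped list.
interface CappedList {
  items: string[];
  truncated: boolean;
  dropped_count: number;
}

// Keeps at most `cap` of the most recent items and records how many were dropped.
function capItems(items: string[], cap: number): CappedList {
  const limit = Math.max(0, Math.floor(cap));
  const kept = items.slice(Math.max(0, items.length - limit));
  const dropped_count = items.length - kept.length;
  return { items: kept, truncated: dropped_count > 0, dropped_count };
}

interface ProgressSummary {
  expected_total: number | null; // null when the total is unknown
  completed_count: number;
  failed_count: number;
  stop_condition: string;
  recent_completed: CappedList;
  recent_failed: CappedList;
}

function summarizeProgress(
  c: {
    expected_total?: number;
    stop_condition: string;
    completed: string[];
    failed: Array<{ item: string; reason: string; retryable?: boolean }>;
  },
  cap: number = 5,
): ProgressSummary {
  return {
    expected_total: c.expected_total ?? null,
    completed_count: c.completed.length,
    failed_count: c.failed.length,
    stop_condition: c.stop_condition,
    recent_completed: capItems(c.completed, cap),
    recent_failed: capItems(c.failed.map((f) => f.item), cap),
  };
}
```

The counts always reflect the full run, so a reader can still see totals even when the item lists are truncated.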
Post-merge OpenChrome live verification checklist
Run a local fixture task with expected_total=3 and intentionally complete only one item; verify completion is blocked with a clear diagnostic.
Complete all three items and verify the task can finish with counts recorded.
Restart/resume from a mid-list cursor if supported by the implementation and verify counts remain consistent.
Record the command/tool calls, blocked diagnostic, final success evidence, and any TaskRun artifact path in merge verification notes.
Scope classification
feat/1041-bulk-progress-contract). Amend that PR first.
Overlap and conflict resolution
BrowserTaskSignature may describe bulk progress requirements, but this issue implements the runtime progress guard.