@seanmcguire12 seanmcguire12 commented Dec 2, 2025

why

  • async functions invoked by act, extract, and observe all continued to run even after the timeout was reached

what changed

  • this PR introduces a time-remaining check that runs between each major I/O operation inside each of the handlers
  • this ensures that user-defined timeouts are actually respected inside act, extract, and observe
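
A minimal sketch of how such a time-remaining check could look, assuming a guard that is created once per call and then invoked between steps. The names `createTimeoutGuardSketch` and `SketchTimeoutError`, and the injectable `now` clock, are illustrative assumptions and not the PR's actual code; the no-op behaviour for a missing or non-positive timeout follows the file analysis further down in this thread.

```typescript
// Illustrative error type; the PR's real timeout errors live in sdkErrors.ts and may differ.
class SketchTimeoutError extends Error {
  constructor(operation: string, timeoutMs: number) {
    super(`${operation} timed out after ${timeoutMs}ms`);
    this.name = "SketchTimeoutError";
  }
}

type TimeoutGuard = () => void;

// `now` is an assumption added here so the sketch can be exercised with a fake clock.
function createTimeoutGuardSketch(
  operation: string,
  timeoutMs?: number,
  now: () => number = Date.now,
): TimeoutGuard {
  // No timeout configured (undefined or <= 0): the guard never throws.
  if (timeoutMs === undefined || timeoutMs <= 0) return () => {};

  const startedAt = now();
  return () => {
    // Throw as soon as the elapsed time reaches the budget.
    if (now() - startedAt >= timeoutMs) {
      throw new SketchTimeoutError(operation, timeoutMs);
    }
  };
}
```

A handler would call the returned function before each awaited step, so that a timeout surfaces at the next checkpoint instead of letting later steps start. A guard of this kind does not interrupt a step that is already in flight.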

test plan

  • added tests to confirm that internal async functions do not continue running after the timeout is reached

Summary by cubic

Fixes act, extract, and observe to truly honor the timeout parameter with step-wise guards that abort early and return clear errors. Deterministic actions now use the same guard path in v3.

  • Bug Fixes
    • Added createTimeoutGuard and specific ActTimeoutError, ExtractTimeoutError, and ObserveTimeoutError (exported).
    • Replaced Promise.race with per-step checks across snapshot capture, LLM inference, action execution, and self-heal retries.
    • Enforced per-step timeouts in ActHandler.takeDeterministicAction; metrics unchanged.
    • Wired v3 deterministic actions to pass a timeout guard; shadow DOM and unsupported actions behavior unchanged.

Written for commit d6bbfb8. Summary will update automatically on new commits.

changeset-bot bot commented Dec 2, 2025

🦋 Changeset detected

Latest commit: d6bbfb8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |

@seanmcguire12 seanmcguire12 changed the base branch from main to seanmcguire/stg-1020-refactor-acthandlerts December 2, 2025 00:35
@seanmcguire12

@greptileai

greptile-apps bot commented Dec 2, 2025

Greptile Overview

Greptile Summary

Replaces Promise.race-based timeouts with step-wise timeout guards across act(), extract(), and observe() handlers to prevent downstream operations from continuing after timeout.

Key improvements:

  • Introduces createTimeoutGuard utility that returns a reusable checker function tracking elapsed time via Date.now()
  • Guards placed before each major async operation (snapshot capture, LLM inference, action execution)
  • Specific error types (ActTimeoutError, ExtractTimeoutError, ObserveTimeoutError) provide clear timeout context
  • Self-heal and two-step flows properly guarded at each checkpoint
  • Comprehensive test coverage validates timeout behavior at various stages
  • Error types properly exported in public API

The refactor addresses a limitation of the previous approach: Promise.race only stops waiting on the slower promise and does not cancel it, so background operations kept running after the timeout, which could cause unexpected state changes or resource consumption.
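
To illustrate the behaviour described here, the following self-contained sketch races a multi-step operation against a timer. `slowOperation`, `raceWithTimeout`, and `demo` are hypothetical names made up for this example and are not taken from the Stagehand code base.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical multi-step operation; `log` records which steps actually ran.
async function slowOperation(log: string[]): Promise<string> {
  log.push("step 1 started");
  await sleep(50);
  log.push("step 2 started"); // still runs after the caller has given up
  await sleep(50);
  log.push("finished");
  return "done";
}

// Classic Promise.race timeout: rejects the caller, but never cancels `work`.
async function raceWithTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  try {
    await raceWithTimeout(slowOperation(log), 20);
  } catch {
    log.push("caller saw timeout");
  }
  await sleep(150); // give the abandoned work time to keep going
  return log;
}
```

In this sketch the log ends up containing "step 2 started" and "finished" after "caller saw timeout", which is the kind of downstream activity that a check between steps is meant to stop.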

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Score reflects thorough implementation with excellent test coverage, clean refactoring that simplifies control flow, proper error handling at all async boundaries, and correct integration across all three handler types. The timeout guard implementation is straightforward and follows best practices.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| packages/core/lib/v3/handlers/handlerUtils/timeoutGuard.ts | 5/5 | New timeout guard utility using Date.now() for precise timeout tracking; returns no-op function when timeout is undefined or ≤0; clean and straightforward implementation |
| packages/core/lib/v3/types/public/sdkErrors.ts | 5/5 | Added TimeoutError base class and three specialized timeout errors (ActTimeoutError, ExtractTimeoutError, ObserveTimeoutError); all properly extend base classes with correct naming and message formatting |
| packages/core/lib/v3/handlers/actHandler.ts | 4/5 | Replaced Promise.race timeout with step-wise guards throughout act() flow including two-step and self-heal paths; properly catches and re-throws ActTimeoutError; timeout guard passed to takeDeterministicAction as optional parameter |
| packages/core/lib/v3/handlers/extractHandler.ts | 5/5 | Replaced Promise.race timeout with guards before snapshot and LLM inference; clean removal of nested doExtract function; timeout guards placed at all critical async operations |
| packages/core/lib/v3/handlers/observeHandler.ts | 5/5 | Replaced Promise.race timeout with guards before snapshot and LLM call; removed nested doObserve function for flatter control flow; guards prevent downstream processing after timeout |

Sequence Diagram

sequenceDiagram
    participant User
    participant ActHandler
    participant TimeoutGuard
    participant Snapshot
    participant LLM
    participant Action
    
    User->>ActHandler: act(instruction, timeout=5000ms)
    ActHandler->>TimeoutGuard: createTimeoutGuard(5000ms)
    TimeoutGuard-->>ActHandler: ensureTimeRemaining()
    
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [check 1]
    Note over TimeoutGuard: elapsed < 5000ms ✓
    
    ActHandler->>ActHandler: waitForDomNetworkQuiet()
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [check 2]
    Note over TimeoutGuard: elapsed < 5000ms ✓
    
    ActHandler->>Snapshot: captureHybridSnapshot()
    Snapshot-->>ActHandler: combinedTree, xpathMap
    
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [check 3]
    Note over TimeoutGuard: elapsed < 5000ms ✓
    
    ActHandler->>LLM: getActionFromLLM()
    LLM-->>ActHandler: action
    
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [check 4]
    Note over TimeoutGuard: elapsed >= 5000ms ✗
    
    TimeoutGuard-->>ActHandler: throw ActTimeoutError
    ActHandler-->>User: ActTimeoutError("act() timed out after 5000ms")
    
    Note over User,Action: Alternative: Two-Step Flow
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [before step 2 snapshot]
    ActHandler->>Snapshot: captureHybridSnapshot() [step 2]
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [before step 2 LLM]
    ActHandler->>LLM: getActionFromLLM() [step 2]
    
    Note over User,Action: Alternative: Self-Heal Flow
    ActHandler->>Action: performUnderstudyMethod()
    Action-->>ActHandler: Error
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [before retry snapshot]
    ActHandler->>Snapshot: captureHybridSnapshot() [retry]
    ActHandler->>TimeoutGuard: ensureTimeRemaining() [before retry LLM]
    ActHandler->>LLM: getActionFromLLM() [retry]
@greptile-apps greptile-apps bot left a comment

6 files reviewed, no comments

Comment on lines +198 to +209
v3Logger({
  category: "extraction",
  message: completed
    ? "Extraction completed successfully"
    : "Extraction incomplete after processing all data",
  level: 1,
  auxiliary: {
    prompt_tokens: { value: String(prompt_tokens), type: "string" },
    completion_tokens: { value: String(completion_tokens), type: "string" },
    inference_time_ms: {
      value: String(inference_time_ms),
      type: "string",
Collaborator

We should move this to after the URLs are injected, to avoid confusing logs whenever users extract with URLs.

@miguelg719 miguelg719 marked this pull request as ready for review December 4, 2025 22:52
@greptile-apps greptile-apps bot left a comment

9 files reviewed, no comments

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 9 files

Prompt for AI agents (all 2 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="packages/core/tests/timeout-handlers.test.ts">

<violation number="1" location="packages/core/tests/timeout-handlers.test.ts:181">
P2: `expect.fail()` is not available in Vitest's default `expect` API. If the expected error isn't thrown, this will fail with 'expect.fail is not a function' rather than a clear test failure message. Use `throw new Error()` or a failing assertion instead.</violation>

<violation number="2" location="packages/core/tests/timeout-handlers.test.ts:238">
P3: Typo in comment: 'performUndertudy' should be 'performUnderstudy'.</violation>
</file>
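
Whether or not `expect.fail()` is available in a given setup, the "throw a plain error" alternative suggested above can be sketched without any test-runner API. `captureRejection` is a hypothetical helper written for this illustration and is not part of the PR's test file.

```typescript
// Hypothetical helper: resolves with the rejection reason, or throws if nothing was thrown.
async function captureRejection(run: () => Promise<unknown>): Promise<unknown> {
  try {
    await run();
  } catch (err) {
    return err; // expected path: hand the error back so the test can assert on it
  }
  // Reached only when `run` resolved. A plain throw fails the test in any runner.
  throw new Error("expected the operation to reject, but it resolved");
}

// Example use inside a test body, with a stand-in operation that always times out.
async function exampleTestBody(): Promise<string> {
  const err = await captureRejection(async () => {
    throw new Error("act() timed out after 5ms");
  });
  return err instanceof Error ? err.message : "unexpected non-Error rejection";
}
```

Because the failing `throw` sits outside the `try` block, it cannot be swallowed by the same `catch` that collects the expected error.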


@seanmcguire12 seanmcguire12 force-pushed the seanmcguire/stg-1020-refactor-acthandlerts branch from 8d8e569 to 08d8454 Compare December 5, 2025 00:35
@seanmcguire12 seanmcguire12 force-pushed the seanmcguire/stg-1013-bug-not-aborting-promises-downstream branch from 83dbace to 6d4c8e3 Compare December 5, 2025 00:36
@seanmcguire12 seanmcguire12 changed the title from "fix: act/extract/observe not respecting timeout param" to "fix: act, extract, and observe not respecting timeout param" Dec 5, 2025
@seanmcguire12 seanmcguire12 mentioned this pull request Dec 5, 2025
8 tasks
miguelg719 pushed a commit that referenced this pull request Dec 5, 2025
# why
- to clean up the actHandler before #1330 

---
## Summary by cubic
Refactors actHandler to centralize LLM action parsing and execution,
reduce duplication, and improve metrics reporting. Behavior stays the
same, with clearer naming and more reliable two-step and fallback flows.

## Why:
- Reduce duplicated LLM calls and normalization logic.
- Improve readability and maintainability.
- Ensure consistent metrics and variable substitution.
- Make the self-heal/fallback path more robust.

## What:
- Renamed actFromObserveResult to takeDeterministicAction and updated
all call sites (ActCache, AgentCache, v3).
- Added getActionFromLLM for inference, metrics, normalization, and
variable substitution.
- Added recordActMetrics to centralize ACT metrics reporting.
- Extracted normalizeActInferenceElement and
substituteVariablesInArguments helpers.
- Simplified two-step act flow and fallback retry using shared helpers.
- Kept existing behavior (selector normalization, variable substitution,
retries).

## Test Plan:
- [ ] Run unit tests for actHandler to confirm no regressions.
- [ ] Verify single-step actions execute as before.
- [ ] Verify two-step flow triggers when LLM returns twoStep and
executes the second action.
- [ ] Confirm fallback self-heal path updates selector and retries
successfully.
- [ ] Check metrics are recorded once per inference call in both steps
and fallback.
- [ ] Validate variable substitution replaces %key% tokens in action
arguments.
- [ ] Exercise AgentCache and ActCache paths to ensure
takeDeterministicAction works end-to-end.
- [ ] Build passes and type checks for all renamed method references.
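
As a rough illustration of the `%key%` substitution mentioned in this test plan, the sketch below replaces tokens in a list of string arguments. `substituteVariablesSketch` is a hypothetical function; the real `substituteVariablesInArguments` helper may have a different signature, and leaving unknown tokens untouched is an assumption made here.

```typescript
// Hypothetical sketch of %key% substitution over action arguments.
function substituteVariablesSketch(
  args: string[],
  variables: Record<string, string>,
): string[] {
  return args.map((arg) =>
    arg.replace(/%([^%]+)%/g, (token: string, key: string) =>
      // Assumption: tokens without a matching variable are left as-is.
      Object.prototype.hasOwnProperty.call(variables, key)
        ? variables[key]
        : token,
    ),
  );
}

const substituted = substituteVariablesSketch(
  ["%username%", "hello %name%!", "%missing%"],
  { username: "alice", name: "Bob" },
);
// substituted is ["alice", "hello Bob!", "%missing%"]
```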

<sup>Written for commit 08d8454.
Summary will update automatically on new commits.</sup>

@seanmcguire12 seanmcguire12 changed the base branch from seanmcguire/stg-1020-refactor-acthandlerts to main December 5, 2025 03:39
@seanmcguire12 seanmcguire12 force-pushed the seanmcguire/stg-1013-bug-not-aborting-promises-downstream branch from c2541f3 to d6bbfb8 Compare December 5, 2025 03:41
@seanmcguire12 seanmcguire12 merged commit d382084 into main Dec 5, 2025
28 of 29 checks passed
3 participants