@pirate pirate commented Dec 4, 2025

Stack Position

main
   └── pr/1-event-bus-infrastructure
         └── pr/2-api-schemas-openapi
               └── pr/3-session-store-p2p-server
                     └── pr/4-test-infrastructure  ← THIS PR

What

This moves the Fastify server implementation into its own package.

There are also a few cleanup changes that remove `disableAPI` from old test files.

## What
Adds a comprehensive test harness for running integration tests against both
local P2P servers and the remote Stagehand cloud API.

## Why
Enables testing the full request flow through the HTTP API layer, ensuring
the P2P server correctly handles session management, streaming, and all
Stagehand operations (act, extract, observe, agentExecute, navigate).

## How

### Global Test Setup
- `global-setup.stagehand.ts`: Vitest global setup that starts a local
  Stagehand server before tests and tears it down after
- Configurable via STAGEHAND_TEST_TARGET (local/remote) env var
- Injects baseUrl into test context via Vitest's provide() mechanism
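
The setup flow described above can be sketched roughly as follows. This is a minimal, dependency-injected approximation and not the PR's actual `global-setup.stagehand.ts`: `ProvideFn`, `LocalServer`, `startLocalServer`, the `STAGEHAND_BASE_URL` key, the `STAGEHAND_REMOTE_URL` variable, and the default of `local` are all assumptions made for illustration. In a real Vitest global setup, `provide` would come from the context object that Vitest passes to the setup function.

```typescript
// Hypothetical types: in Vitest, `provide` comes from the global-setup context.
type TestTarget = "local" | "remote";
type ProvideFn = (key: "STAGEHAND_BASE_URL", value: string) => void;
type Teardown = () => Promise<void>;

interface LocalServer {
  url: string;
  close: () => Promise<void>;
}

// Read STAGEHAND_TEST_TARGET; defaulting to "local" is an assumption.
function resolveTestTarget(env: Record<string, string | undefined>): TestTarget {
  const raw = (env.STAGEHAND_TEST_TARGET ?? "local").toLowerCase();
  if (raw === "local" || raw === "remote") return raw;
  throw new Error(`Unsupported STAGEHAND_TEST_TARGET: ${raw}`);
}

// Start (or point at) a server, publish its base URL, and return a teardown.
async function globalSetup(
  env: Record<string, string | undefined>,
  provide: ProvideFn,
  startLocalServer: () => Promise<LocalServer>,
): Promise<Teardown> {
  const target = resolveTestTarget(env);

  if (target === "remote") {
    const remoteUrl = env.STAGEHAND_REMOTE_URL; // assumed variable name
    if (!remoteUrl) {
      throw new Error("STAGEHAND_REMOTE_URL is required for remote runs");
    }
    provide("STAGEHAND_BASE_URL", remoteUrl);
    return async () => {}; // nothing local to tear down
  }

  const server = await startLocalServer();
  provide("STAGEHAND_BASE_URL", server.url);
  return () => server.close();
}
```

Passing `env` and `startLocalServer` as parameters keeps the sketch testable without a browser or a real server; an actual setup file would more likely read `process.env` directly and start the server itself.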

### Integration Tests
- `integration/integration.test.ts`: Full integration test suite covering:
  - Session creation and termination
  - Navigation via /navigate endpoint
  - Data extraction via /extract endpoint
  - Action observation and replay via /observe and /act
  - Agent execution via /agentExecute
  - Metrics retrieval via /replay
- `integration/support/stagehandClient.ts`: Test harness factory for creating
  Stagehand instances configured for the active test target

### Test Configuration
- `vitest.config.ts`: Updated to load env from tests/ dir and use global setup
- `vitest.provided-context.d.ts`: TypeScript types for injected test context
- Removed `disableAPI` references from existing tests (no longer needed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

changeset-bot bot commented Dec 4, 2025

⚠️ No Changeset found

Latest commit: 09a20df

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@pirate pirate changed the title test: add integration test infrastructure for P2P server [CANONICALIZATION 4/4] test: add integration test infrastructure for P2P server Dec 4, 2025

greptile-apps bot commented Dec 4, 2025

Greptile Overview

Greptile Summary

Adds comprehensive integration test infrastructure for P2P server testing with support for both local and remote test targets.

Key Changes

  • Global setup (global-setup.stagehand.ts) manages local P2P server lifecycle during test runs
  • Integration test suite (integration/integration.test.ts) covers session management, navigation, extraction, observation, action replay, and agent execution
  • Test harness factory (integration/support/stagehandClient.ts) configures Stagehand clients for different test targets
  • Vitest configuration updated with environment loading and global setup hooks
  • Removed deprecated disableAPI option from existing test files

Critical Issues Found

  • stagehandClient.ts: Contains two critical bugs that will cause runtime failures:
    1. Invalid model name openai/gpt-5-mini (should be openai/gpt-4o-mini)
    2. Missing return statement for remote target - function returns undefined for remote tests
  • These bugs will prevent remote target tests from running successfully

Confidence Score: 1/5

  • This PR has critical bugs that will cause runtime failures when tests are executed
  • Score reflects two critical logic errors in stagehandClient.ts: invalid model name and missing return statement for remote target. These will cause immediate runtime errors when integration tests run, especially for remote targets. While the overall architecture is sound, these bugs must be fixed before merging.
  • Critical attention needed for packages/core/tests/integration/support/stagehandClient.ts due to runtime-breaking bugs

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| `packages/core/tests/integration/support/stagehandClient.ts` | 1/5 | Critical bugs: invalid model name `gpt-5-mini` and missing return for remote target, causing runtime errors |
| `packages/core/tests/integration/integration.test.ts` | 2/5 | Integration test suite with runtime errors due to `stagehandClient` bugs, plus a minor indentation inconsistency |
| `packages/core/tests/global-setup.stagehand.ts` | 5/5 | Vitest global setup for managing local/remote test servers; properly handles teardown |

Sequence Diagram

sequenceDiagram
    participant Test as Integration Test
    participant Setup as Global Setup
    participant Client as Test Harness
    participant Server as P2P Server
    participant Stagehand as Stagehand Instance
    participant Browser as Browser Session

    Note over Setup,Server: Test Initialization (beforeAll)
    Setup->>Setup: Check STAGEHAND_TEST_TARGET env
    alt target === "local"
        Setup->>Stagehand: new Stagehand({env: "LOCAL"})
        Setup->>Stagehand: init()
        Setup->>Server: createServer({host, port})
        Setup->>Server: listen()
        Setup->>Setup: provide("STAGEHAND_BASE_URL", localUrl)
    else target === "remote"
        Setup->>Setup: provide("STAGEHAND_BASE_URL", remoteUrl)
    end

    Note over Test,Browser: Test Execution
    Test->>Client: createStagehandHarness(target)
    Client->>Client: inject("STAGEHAND_BASE_URL")
    Client->>Client: set process.env.STAGEHAND_API_URL
    Client->>Stagehand: new Stagehand({env: "BROWSERBASE"})
    Test->>Stagehand: init()
    Stagehand->>Server: POST /v1/start
    Server->>Browser: Create session
    Browser-->>Server: Session ID
    Server-->>Stagehand: Session details
    Stagehand-->>Test: Ready

    Note over Test,Browser: Test Operations
    Test->>Stagehand: page.goto(url)
    Stagehand->>Server: POST /v1/navigate
    Server->>Browser: Navigate
    Browser-->>Server: Response
    Server-->>Stagehand: Navigation result
    Stagehand-->>Test: Success

    Test->>Stagehand: extract(prompt, schema)
    Stagehand->>Server: POST /v1/extract
    Server->>Browser: Get page content
    Server->>Server: LLM processing
    Server-->>Stagehand: Extracted data
    Stagehand-->>Test: Structured result

    Test->>Stagehand: observe(instruction)
    Stagehand->>Server: POST /v1/observe
    Server->>Browser: Find elements
    Server->>Server: LLM analysis
    Server-->>Stagehand: Actions
    Stagehand-->>Test: Action list

    Test->>Stagehand: act(action)
    Stagehand->>Server: POST /v1/act
    Server->>Browser: Execute action
    Browser-->>Server: Result
    Server-->>Stagehand: Execution result
    Stagehand-->>Test: Success

    Note over Test,Browser: Test Cleanup
    Test->>Stagehand: close()
    Stagehand->>Server: POST /v1/end
    Server->>Browser: Terminate session
    Browser-->>Server: Closed
    Server-->>Stagehand: Cleanup complete

    Note over Setup,Server: Global Teardown
    Setup->>Server: server.close()
    Setup->>Stagehand: stagehand.close()
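
The request sequence in the diagram above can be summarized as a small route table. This is only an illustrative sketch: the paths are copied from the diagram, while `ROUTES`, `endpointFor`, `SESSION_FLOW`, and `plannedRequests` are hypothetical names and are not part of the Stagehand client or the P2P server.

```typescript
// Paths as they appear in the sequence diagram; treat them as illustrative.
const ROUTES = {
  start: "/v1/start",
  navigate: "/v1/navigate",
  extract: "/v1/extract",
  observe: "/v1/observe",
  act: "/v1/act",
  end: "/v1/end",
} as const;

type Operation = keyof typeof ROUTES;

// Join an operation's path onto the base URL, tolerating trailing slashes.
function endpointFor(baseUrl: string, op: Operation): string {
  return baseUrl.replace(/\/+$/, "") + ROUTES[op];
}

// The order of calls a single test session makes in the diagram.
const SESSION_FLOW: Operation[] = [
  "start",
  "navigate",
  "extract",
  "observe",
  "act",
  "end",
];

// List the requests one session would issue against a given server.
function plannedRequests(baseUrl: string): string[] {
  return SESSION_FLOW.map((op) => `POST ${endpointFor(baseUrl, op)}`);
}
```

Because the base URL is the only input, the same table would apply whether it points at the local server or the remote API.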

@greptile-apps greptile-apps bot left a comment

9 files reviewed, 5 comments

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 9 files

Prompt for AI agents (1 issue)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="packages/core/tests/integration/support/stagehandClient.ts">

<violation number="1" location="packages/core/tests/integration/support/stagehandClient.ts:47">
P1: createStagehandHarness only returns when the target is local, so remote test runs receive `undefined` and immediately crash. Add logic that instantiates and returns a Stagehand client for the remote target instead of exiting this function without a value.</violation>
</file>
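
One way to rule out the missing-return case flagged above is to give the factory a non-optional return type and handle every target in an exhaustive `switch`. The sketch below is hypothetical: `HarnessOptions`, `buildHarnessOptions`, `assertNever`, and the field names are illustrative and are not taken from `stagehandClient.ts`. A real harness would presumably pass the resulting options on to the Stagehand constructor.

```typescript
type TestTarget = "local" | "remote";

// Hypothetical option shape, for illustration only.
interface HarnessOptions {
  target: TestTarget;
  apiUrl: string;
  description: string;
}

// Fails at compile time if a TestTarget member is left unhandled,
// and at runtime if an unexpected string slips through.
function assertNever(value: never): never {
  throw new Error(`Unhandled test target: ${String(value)}`);
}

// Every branch returns a value, so no target can fall through to `undefined`.
function buildHarnessOptions(target: TestTarget, baseUrl: string): HarnessOptions {
  switch (target) {
    case "local":
      return { target, apiUrl: baseUrl, description: "local P2P server" };
    case "remote":
      return { target, apiUrl: baseUrl, description: "remote cloud API" };
    default:
      return assertNever(target);
  }
}
```

With this shape, adding a third target to `TestTarget` without a matching `case` becomes a type error rather than a crash in the first remote test.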

@monadoid monadoid changed the title [CANONICALIZATION 4/4] test: add integration test infrastructure for P2P server [CANONICALIZATION 4/4] Cleanup - disableAPI fields Dec 4, 2025
@miguelg719 miguelg719 changed the title [CANONICALIZATION 4/4] Cleanup - disableAPI fields [CANONICALIZATION 4/4] Cleanup - disableAPI fields (Resolves STG-1047) Dec 4, 2025
- added pnpm dev command to start server
- moved shared types back into core
- setup turborepo compatibility
@monadoid monadoid changed the title [CANONICALIZATION 4/4] Cleanup - disableAPI fields (Resolves STG-1047) [CANONICALIZATION 4/4] Move server into it's own package (Resolves STG-1047) Dec 5, 2025
@pirate pirate closed this Dec 6, 2025