Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stagehand manual patch and partial eval datapoints #57

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dinmukhamedm
Copy link
Member

@dinmukhamedm dinmukhamedm commented Mar 17, 2025

Important

Refactor to use LaminarClient for API interactions, add Stagehand manual instrumentation, and update version and lint workflow.

  • Client Refactor:
    • Introduces LaminarClient in src/client.ts for API interactions, handling pipeline runs, semantic search, evaluation initialization, and datapoint management.
    • Refactors Laminar class in laminar.ts to use LaminarClient for API calls.
    • Updates Evaluation class in evaluations.ts to use LaminarClient for evaluation-related API calls.
  • Instrumentation:
    • Adds StagehandInstrumentation in stagehand.ts for manual instrumentation of Stagehand methods init and close.
    • Updates sdk/tracing/index.ts to support manual instrumentation of Stagehand.
  • Miscellaneous:
    • Updates version in package.json from 0.4.42 to 0.4.43.
    • Modifies .github/workflows/check-lint.yml to comment out non-working steps and enable the workflow.

This description was created by Ellipsis for 5150d0a. It will automatically update as commits are pushed.

@dinmukhamedm dinmukhamedm marked this pull request as ready for review March 18, 2025 08:05
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 5150d0a in 1 minute and 57 seconds

More details
  • Looked at 840 lines of code in 11 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 drafted comments based on config settings.
1. src/browser/stagehand.ts:180
  • Draft comment:
    Check that 'this.stagehandPage' exists before using it in patchStagehandInit. It seems unclear where 'stagehandPage' is assigned.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/client.ts:17
  • Draft comment:
    LaminarClient.initialize method correctly sets API key and baseUrl; ensure that all consumers call it before any client methods are used.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50%
    None
3. src/evaluations.ts:369
  • Draft comment:
    The use of newUUID for datapoint 'id' is now added; ensure that the new UUID function returns a valid StringUUID consistent with the rest of the system.
  • Reason this comment was not posted:
    Confidence changes required: 40% <= threshold 50%
    None
4. src/laminar.ts:143
  • Draft comment:
    The baseUrl is sanitized by removing trailing slashes; ensure that this behavior does not unintentionally modify expected API URLs.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50%
    None
5. src/sdk/interfaces/initialize-options.interface.ts:112
  • Draft comment:
    Stagehand is now added in instrumentModules; confirm that its usage conforms to StagehandLib's API and integration in the instrumentation system.
  • Reason this comment was not posted:
    Confidence changes required: 40% <= threshold 50%
    None
6. src/types.ts:41
  • Draft comment:
    The EvaluationDatapoint type update makes 'id' mandatory and 'scores' optional; verify that downstream consumers handle these changes correctly.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
7. .github/workflows/check-lint.yml:13
  • Draft comment:
    The commented-out steps (lines 16-19) indicate experimental code that isn’t running. Consider either removing these comments if not needed or adding a more detailed note on when/how they should be re-enabled.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
    The code already has an explanatory comment about why the steps are commented out. The automated comment is essentially asking for what's already there - an explanation of why the code is commented. The comment doesn't point out a specific issue that needs fixing, it's just suggesting to either remove or document something that's already documented.
    The comment could be highlighting a legitimate concern about leaving partially-commented code in a workflow file, which might confuse future maintainers.
    While that's true, the existing inline comment already addresses this by explaining why the steps are commented out. The automated comment doesn't add any actionable value beyond what's already there.
    The comment should be deleted as it's not providing additional actionable value beyond the existing inline documentation, and it's essentially asking for documentation that already exists.
8. src/browser/stagehand.ts:181
  • Draft comment:
    The call to patchStagehandPage uses 'this.stagehandPage', but no such property is defined. It likely should use the existing page (e.g. 'this.page').
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. src/sdk/tracing/index.ts:337
  • Draft comment:
    The function otelSpanIdToUUID is used but not imported. Add the appropriate import (e.g. from '../../utils') to avoid runtime errors.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. src/evaluations.ts:254
  • Draft comment:
    There is a minor typographical issue in the error message for invalid evaluator keys. The string concatenation results in "Invalid evaluator key: "${key}".Keys must only contain letters, digits, hyphens, underscores, or spaces.". A space should be added after the period to read as "...${key}". Keys must only contain...".
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. src/laminar.ts:68
  • Draft comment:
    Typographical error: Please change 'overriden' to 'overridden' in the documentation comment for the 'env' parameter.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. src/sdk/interfaces/initialize-options.interface.ts:44
  • Draft comment:
    Typo found in the comment describing 'disableBatch': 'developement' should be 'development'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_dJLqs6zaJwcITaxu


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant