
add bench helper for skill evals #267

@sentry-junior

We need a bench helper that can evaluate a skill against a structured set of eval fixtures, then run another variant of the skill against the same set and compare the results.

  • support skill-scoped eval fixtures in a structured layout
  • likely support positive and negative finding directories per skill
  • enable testing and comparison of a skill variant against the eval set
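
To make the requirements above concrete, here is a minimal sketch of one possible shape for the helper. Everything in it is an assumption for discussion, not an existing API: the `<root>/<skill>/positive` and `<root>/<skill>/negative` layout, the names `Fixture`, `BenchResult`, `load_fixtures`, `run_bench`, and `compare`, and the idea that a skill variant can be modeled as a callable that takes fixture text and returns whether it reported a finding.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Hypothetical: a skill variant is any callable that takes fixture text and
# returns True when it reports a finding.
SkillFn = Callable[[str], bool]


@dataclass(frozen=True)
class Fixture:
    name: str
    text: str
    expect_finding: bool  # True for positive/ fixtures, False for negative/


@dataclass(frozen=True)
class BenchResult:
    passed: int
    total: int
    failures: tuple[str, ...]

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0


def load_fixtures(root: Path, skill: str) -> list[Fixture]:
    """Load fixtures from <root>/<skill>/positive and <root>/<skill>/negative.

    The directory layout is an assumed convention, not an existing one.
    """
    fixtures: list[Fixture] = []
    for kind, expect in (("positive", True), ("negative", False)):
        directory = root / skill / kind
        if not directory.is_dir():
            continue  # a skill may have only one kind of fixture
        for path in sorted(directory.iterdir()):
            if path.is_file():
                text = path.read_text(encoding="utf-8")
                fixtures.append(Fixture(f"{kind}/{path.name}", text, expect))
    return fixtures


def run_bench(skill_fn: SkillFn, fixtures: list[Fixture]) -> BenchResult:
    """Run one skill variant over every fixture and record mismatches."""
    failures = [f.name for f in fixtures if skill_fn(f.text) != f.expect_finding]
    return BenchResult(len(fixtures) - len(failures), len(fixtures), tuple(failures))


def compare(baseline: SkillFn, variant: SkillFn, fixtures: list[Fixture]) -> dict[str, BenchResult]:
    """Evaluate two variants against the same fixture set."""
    return {
        "baseline": run_bench(baseline, fixtures),
        "variant": run_bench(variant, fixtures),
    }
```

Loading fixtures once and passing the same list to both runs keeps the comparison apples-to-apples. A real implementation would likely need richer expectations than a boolean (for example, expected finding locations or severities), which this sketch deliberately leaves out.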

Action taken on behalf of David Cramer.
