Round 2 rename pass: eliminate remaining Benchmark/Test/Evaluation vocabulary from domain layer #139

Draft
Copilot wants to merge 6 commits into main from copilot/rename-benchmark-to-eval
Copilot AI commented Mar 17, 2026

This follows the initial BenchmarkSpec→EvalSpec / TestCase→TaskSpec rename with a second pass covering all remaining legacy identifiers. JSON wire tags are preserved, with the single exception noted below the rename table; otherwise only Go identifiers change.

Renames

| Old | New |
| --- | --- |
| GraderKind / GraderKindX | GraderType / GraderTypeX |
| GraderConfig.Kind | GraderConfig.Type (json tag "kind" → "type") |
| BenchmarkConfig / NewBenchmarkConfig | RunConfig / NewRunConfig |
| EvaluationOutcome | EvalOutcome |
| EvalOutcome.BenchName | EvalOutcome.EvalName |
| TestRunner / NewTestRunner | EvalRunner / NewEvalRunner |
| RunBenchmark | RunEval |
| TestOutcome / TestOutcomes | TaskOutcome / TaskOutcomes |
| TestStats | TaskStats |
| TestExpectation | TaskExpectation |
| OutcomeSetup / OutcomeDigest | EvalSetup / EvalDigest |
| MeasurementDef | Metric |
| RunResult.Validations / AllValidationsPassed | RunResult.GraderScores / AllGradersPassed |
| EventBenchmarkStart/Complete/Stopped | EventEvalStart/Complete/Stopped |
| EventTestStart/Complete/Cached | EventTaskStart/Complete/Cached |

The GraderConfig.Type json tag is the one structural exception: it changes from "kind" to "type" to align with the existing yaml tag. All other JSON output field names are unchanged to preserve the results schema contract.
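As a rough illustration of the distinction between a Go identifier rename and a wire tag change, here is a minimal sketch. Only the names GraderType, GraderConfig.Type, EvalOutcome.EvalName, and the "kind"/"type" tag values come from this PR; the Name field, the "bench_name" tag value, the string base type, and the encode helper are assumptions made up for the example and may not match the real structs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GraderType replaces GraderKind. The string base type is an assumption.
type GraderType string

// GraderConfig is a hypothetical, trimmed-down shape of the real struct.
type GraderConfig struct {
	Name string `json:"name" yaml:"name"` // hypothetical field, for illustration only
	// Before this PR the field was Kind with json tag "kind".
	// The json tag now matches the yaml tag, so the wire key changes too.
	Type GraderType `json:"type" yaml:"type"`
}

// EvalOutcome shows the common case: the Go field is renamed
// (BenchName -> EvalName) while the json tag stays as it was.
// The tag value "bench_name" is an assumed placeholder.
type EvalOutcome struct {
	EvalName string `json:"bench_name"`
}

// encode is a hypothetical helper that marshals v to a JSON string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// The wire key follows the struct tag, not the Go field name.
	fmt.Println(encode(GraderConfig{Name: "example", Type: "example-type"}))
	fmt.Println(encode(EvalOutcome{EvalName: "demo"}))
}
```

Under these assumptions, consumers of the results JSON would see no difference for EvalOutcome, while anything that reads GraderConfig as JSON would need to look for "type" instead of "kind".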

Copilot AI and others added 6 commits March 18, 2026 13:48
- BenchmarkSpec → EvalSpec (+ LoadBenchmarkSpec → LoadEvalSpec)
- type Config → type EvalConfig in internal/models/spec.go
  (field name .Config on EvalSpec and BenchmarkConfig in config package unchanged)
- TestCase → TaskSpec (+ LoadTestCase → LoadTaskSpec)
- TestStimulus → TaskInputs
- ValidatorInline → Grader; field Validators []ValidatorInline → Graders []Grader;
  json tag updated from validators,omitempty to graders,omitempty
- ValidatorInline.Kind → Grader.Type (GraderConfig.Kind unchanged)
- CreateTestCaseFromCopilotLog → CreateTaskSpecFromCopilotLog (+ Options type)

All 41 packages build and test cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Outcome, RunConfig, etc.

Co-authored-by: richardpark-msft <51494936+richardpark-msft@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@spboyer spboyer force-pushed the copilot/rename-benchmark-to-eval branch from 44f9d3f to c1c8f75 Compare March 18, 2026 17:49
