Conversation

@ceferisbarov
Contributor

Summary

This PR adds the GSM8K benchmark.

What are you adding?

  • Bug fix (non-breaking change which fixes an issue)
  • New benchmark/evaluation
  • New model provider
  • CLI enhancement
  • Performance improvement
  • Documentation update
  • API/SDK feature
  • Integration (CI/CD, tools)
  • Export/import functionality
  • Code refactoring
  • Breaking change
  • Other

Changes Made

  • New evaluation file src/openbench/evals/gsm8k.py
  • New dataset loader src/openbench/datasets/gsm8k.py
  • Registered the benchmark metadata in src/openbench/config.py
  • Imported the task in src/openbench/_registry.py
  • Mentioned the new benchmark in the README
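
For readers unfamiliar with the dataset: GSM8K reference solutions end with a final line of the form `#### <number>`, so a scorer usually has to pull that number out before comparing it with the model's answer. The sketch below illustrates that step only. It is not the code in this PR, and the helper names `extract_reference_answer` and `is_correct` are hypothetical; the actual scorer in `src/openbench/evals/gsm8k.py` may work differently.

```python
import re
from typing import Optional

# GSM8K reference solutions end with a line of the form "#### <number>".
_FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_reference_answer(solution: str) -> Optional[str]:
    """Return the final answer from a GSM8K solution, or None if absent.

    Hypothetical helper for illustration; thousands separators are removed.
    """
    match = _FINAL_ANSWER.search(solution)
    if match is None:
        return None
    return match.group(1).replace(",", "")


def is_correct(model_answer: str, solution: str) -> bool:
    """Compare a model's numeric answer with the reference numerically."""
    reference = extract_reference_answer(solution)
    if reference is None:
        return False
    try:
        return float(model_answer.replace(",", "").strip()) == float(reference)
    except ValueError:
        # The model answer was not a plain number (e.g. "seventy-two").
        return False
```

Comparing as floats rather than strings means that `72` and `72.0` count as the same answer; a stricter scorer could compare normalized strings instead.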

Testing

  • I have run the existing test suite (pytest)
  • I have added tests for my changes
  • I have tested with multiple model providers (if applicable)
  • I have run pre-commit hooks (pre-commit run --all-files)

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (if applicable)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

Closes #143

Additional Context

@ceferisbarov ceferisbarov marked this pull request as ready for review September 3, 2025 04:02
@ceferisbarov ceferisbarov force-pushed the feat/add-gsm8k-benchmark branch from 465c3e0 to 1dab379 Compare September 5, 2025 17:45
@ceferisbarov ceferisbarov force-pushed the feat/add-gsm8k-benchmark branch from 1dab379 to aa395a6 Compare September 5, 2025 17:46
@github-actions
Contributor

github-actions bot commented Nov 2, 2025

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 2, 2025
@AarushSah AarushSah requested a review from nmayorga7 as a code owner November 5, 2025 23:40
@github-actions github-actions bot removed the Stale label Nov 6, 2025
@github-actions
Contributor

github-actions bot commented Dec 7, 2025

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 7, 2025
Labels

Development

Successfully merging this pull request may close these issues.

[Feature]: Add GSM8K benchmark

3 participants