@kauabh kauabh commented Aug 25, 2025

CHRF Score Metric Using sacrebleu

Issue Link / Problem Description

This PR introduces a new ChrfScore metric, based on sacrebleu.corpus_chrf, for evaluating the similarity between a generated response and a reference.
CHRF is a character n-gram F-score. Because it matches character sequences rather than whole words, it tends to be better suited than word-level metrics such as BLEU to morphologically rich languages.
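
To make the character-level F-score idea concrete, here is a minimal pure-Python sketch. It is a simplified illustration, not sacrebleu's implementation: the names `char_ngrams` and `simple_chrf` are hypothetical, and the defaults (character n-grams up to order 6, beta = 2, whitespace ignored) are assumptions chosen to resemble the common CHRF configuration. Scores from this sketch will not match sacrebleu exactly.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces (an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified character n-gram F-score in the range 0.0 to 1.0."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One side is too short to have n-grams of this order; skip it.
            continue
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Identical strings score 1.0 and strings with no shared characters score 0.0; partial overlaps such as an inflected form of the same word land in between, which is why the metric is more forgiving of morphological variation than exact word matching.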

Changes Made

Added ChrfScore class to implement character F-score (CHRF) metric using sacrebleu.corpus_chrf.

Testing

How to Test

  • Manual testing steps:
    1. Install sacrebleu if not already installed: pip install sacrebleu
    2. Import and instantiate ChrfScore
    3. Pass a SingleTurnSample object with reference and response
    4. Run the metric and verify that the output is a float between 0.0 and 1.0
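
The manual steps above could be sketched roughly as follows. This does not use the ChrfScore class itself, whose exact interface is not shown in this description. Instead, `Sample` is a hypothetical stand-in for SingleTurnSample with only the two fields named above, and `chrf_for_sample` and `is_valid_score` are illustrative helper names. sacrebleu is imported lazily so the range check can be used on its own.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical stand-in for SingleTurnSample (reference and response only)."""
    reference: str
    response: str


def chrf_for_sample(sample):
    """Score one sample with sacrebleu, normalized from 0-100 to 0-1."""
    # Imported here so the rest of the sketch works without sacrebleu installed.
    from sacrebleu import corpus_chrf

    return corpus_chrf([sample.response], [[sample.reference]]).score / 100


def is_valid_score(value):
    """Step 4: the result should be a float between 0.0 and 1.0."""
    return isinstance(value, float) and 0.0 <= value <= 1.0
```

With sacrebleu installed, `is_valid_score(chrf_for_sample(Sample(reference="The cat is sitting on the mat.", response="The cat is on the mat.")))` would be expected to return True.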

References

  • sacrebleu documentation:

Screenshots/Examples (if applicable)

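from sacrebleu import corpus_chrf

hypotheses = ["The cat is on the mat."]
# One reference stream, holding one reference per hypothesis.
references = [["The cat is sitting on the mat."]]

# sacrebleu reports CHRF on a 0-100 scale; divide by 100 to normalize to 0-1.
score = corpus_chrf(hypotheses, references).score / 100
print(score)  # e.g., 0.67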

kauabh added 4 commits August 25, 2025 12:21
Added _chrf_score
Added ChrfScore
Added CHRF docs
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Aug 25, 2025