[bugfix] Fix mbpp evaluater class missing logger #115
@@ -225,7 +225,7 @@ def __init__(self, metric: str = 'MBPP') -> None:
                 DSET_CODES.INVALID_MBPP_METRIC,
                 f"MBPP evaluator metric must be 'MBPP' or 'MBPPPlus', got '{self.metric}'"
             )
-        super.__init__()
+        super().__init__()

     def score(self, predictions, references):
         if len(predictions) != len(references):
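The one-character fix in this hunk matters because `super` without parentheses is the built-in type itself, so `super.__init__()` never reaches the parent initializer. The sketch below illustrates this with hypothetical stand-in classes (`BaseEvaluator`, `BrokenEvaluator`, `FixedEvaluator` and `try_build` are not from the repository), and it assumes the parent `__init__` is what attaches the logger, as the PR title suggests.

```python
import logging


class BaseEvaluator:
    """Hypothetical stand-in for the real base class."""

    def __init__(self) -> None:
        # Assumption: the base initializer is what attaches the logger.
        self.logger = logging.getLogger(type(self).__name__)


class BrokenEvaluator(BaseEvaluator):
    def __init__(self) -> None:
        # `super` here is the built-in type, not a bound proxy object, so
        # this call never reaches BaseEvaluator.__init__ and raises TypeError.
        super.__init__()


class FixedEvaluator(BaseEvaluator):
    def __init__(self) -> None:
        # `super()` returns a proxy bound to this instance, so the base
        # initializer runs and sets `self.logger`.
        super().__init__()


def try_build(cls):
    """Return (instance, None) on success or (None, error) on TypeError."""
    try:
        return cls(), None
    except TypeError as exc:
        return None, exc


if __name__ == "__main__":
    print(try_build(BrokenEvaluator))
    print(try_build(FixedEvaluator)[0].logger.name)
```

With the parenthesized call, the instance ends up with a `logger` attribute. With the bare `super`, construction fails before any parent setup runs.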
@@ -397,13 +397,13 @@ def _execution(programs, timeout):
                 exec(programs, exec_globals)
             key.append('pass')
         except TimeOutException:
-            logger.debug(f"Program execution timeout for index {index}")
+            logger.debug(f"Program execution timeout for task_id {task_id}")
             key.append('timeout')
         except AssertionError as e:
-            logger.debug(f"Program assertion failed for index {index}: {e}")
+            logger.debug(f"Program assertion failed for task_id {task_id}: {e}")
             key.append('wrong_answer')
         except BaseException as e:
-            logger.debug(f"Program execution failed for index {index}: {e}")
+            logger.debug(f"Program execution failed for task_id {task_id}: {e}")
             key.append('failed')

     manager = multiprocessing.Manager()
@@ -428,10 +428,11 @@ class MBPPPassKEvaluator(MBPPEvaluator):
-        k(Tuple[int]): Choices of Pass@k. Defaults to (1, 10, 100)
+        k(Tuple[int]): Choices of Pass@k. Defaults to (1, 10, 100).
+        metric (str): Name of the evaluation metric. Defaults to 'MBPP'.
Copilot
AI
Jan 19, 2026
The new metric parameter lacks test coverage. Consider adding test cases to verify that the metric parameter is correctly passed to the parent MBPPEvaluator class and that both 'MBPP' and 'MBPPPlus' values work correctly with MBPPPassKEvaluator.
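A rough sketch of what such coverage might look like is given here. Both classes are simplified stand-ins written for illustration, not the repository's implementation. The `ValueError` is an assumption that replaces the project's own error type (the one raised with `DSET_CODES.INVALID_MBPP_METRIC`), and the constructor signature of `MBPPPassKEvaluator` is inferred from the docstring in the diff.

```python
from typing import Tuple


class MBPPEvaluator:
    """Simplified stand-in: only the metric validation is modelled."""

    VALID_METRICS = ('MBPP', 'MBPPPlus')

    def __init__(self, metric: str = 'MBPP') -> None:
        self.metric = metric
        if self.metric not in self.VALID_METRICS:
            # Assumption: ValueError stands in for the project's own error.
            raise ValueError(
                "MBPP evaluator metric must be 'MBPP' or 'MBPPPlus', "
                f"got '{self.metric}'"
            )


class MBPPPassKEvaluator(MBPPEvaluator):
    """Simplified stand-in; the signature is inferred from the docstring."""

    def __init__(self, k: Tuple[int, ...] = (1, 10, 100),
                 metric: str = 'MBPP') -> None:
        self.k = tuple(k)
        # Forward the metric so the parent validates and stores it.
        super().__init__(metric=metric)


def check_metric_is_forwarded() -> bool:
    """Both supported metrics should reach the parent unchanged."""
    for metric in ('MBPP', 'MBPPPlus'):
        evaluator = MBPPPassKEvaluator(k=(1, 10), metric=metric)
        if evaluator.metric != metric or evaluator.k != (1, 10):
            return False
    return True


def check_invalid_metric_is_rejected() -> bool:
    """An unsupported metric should fail in the parent initializer."""
    try:
        MBPPPassKEvaluator(metric='HumanEval')
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    print(check_metric_is_forwarded(), check_invalid_metric_is_rejected())
```

In the real test suite the two check functions would be written against the project's actual classes and exception type, for example as `pytest` test functions.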
This workflow configuration change appears unrelated to the bugfix described in the PR title and description. The PR is specifically about fixing syntax errors in the MBPPPass@kEvaluator class, but this change modifies CI/CD runner configuration. Consider removing this change or creating a separate PR for infrastructure updates.