[Task Submission] Frequency based mathematics (frequency_based_mathematics)#1

Open
kazemnejad wants to merge 3 commits into GenBench:main from kazemnejad:sample_submission
@kazemnejad
Collaborator

Frequency based mathematics (example task)

This task quantifies generalisation by comparing a model's accuracy with the frequency of the relevant terms in its pretraining data.

Copied from last year's submission sample.

Authors

  • Dieuwke Hupkes dieuwkehupkes@gmail.com

Implementation

This task reimplements the evaluation function.

Usage

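from genbench import load_task
from genbench.api import PreparationStrategy

# Load the task
task = load_task("frequency_based_mathematics")
ds = task.get_prepared_datasets(
    PreparationStrategy.PROMPT_BASED_TESTING,
    shot_list=[0])[0]

# Load your pretraining frequencies and model predictions
# (both are dicts here, keyed by frequency type and prediction type)
pretraining_freqs = ...
preds = ...

# Evaluate every combination of predictions and frequencies,
# keeping each result instead of overwriting the previous one
scores = {}
for pred_type, pred in preds.items():
    for freq_type, pretraining_freq in pretraining_freqs.items():
        scores[(pred_type, freq_type)] = task.evaluate_predictions(
            predictions=pred,
            gold=ds,
            term_freqs=pretraining_freq
        )

print(f'Scores: {scores}')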
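
To make the idea of relating accuracy to term frequency concrete, here is a minimal standalone sketch. It is not the task's evaluation function: the helper `accuracy_by_frequency_bucket`, the dict-based input format, the single frequency threshold, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_frequency_bucket(predictions, gold, term_freqs, boundaries):
    """Hypothetical helper: group examples by term frequency and score each group.

    predictions, gold: dicts mapping an example id to an answer string.
    term_freqs: dict mapping the same ids to a pretraining count (assumed format).
    boundaries: ascending frequency thresholds that separate the buckets.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for example_id, answer in gold.items():
        freq = term_freqs[example_id]
        # Bucket index = number of thresholds the frequency meets or exceeds.
        bucket = sum(freq >= b for b in boundaries)
        totals[bucket] += 1
        hits[bucket] += int(predictions.get(example_id) == answer)
    return {bucket: hits[bucket] / totals[bucket] for bucket in sorted(totals)}


# Toy data, invented purely for illustration.
gold = {"a": "48", "b": "72", "c": "391", "d": "1024"}
predictions = {"a": "48", "b": "72", "c": "390", "d": "1024"}
term_freqs = {"a": 90_000, "b": 50_000, "c": 120, "d": 800}

per_bucket = accuracy_by_frequency_bucket(
    predictions, gold, term_freqs, boundaries=[1_000]
)
# Accuracy gap between the frequent-term bucket (1) and the rare-term bucket (0).
gap = per_bucket[1] - per_bucket[0]
print(per_bucket, gap)
```

A large gap between the frequent-term and rare-term buckets would suggest that performance tracks pretraining frequency rather than a general ability to do the arithmetic. The scores returned by `task.evaluate_predictions` may be defined differently.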

Checklist:

  • I and my co-authors agree that, if this PR is merged, the code will be available under the same license as the genbench_cbt repository.
  • Prior to submitting, I have run the GenBench CBT test suite using the genbench-cli test-task tool.
  • I have read the description of what should be in the doc.md of my task, and have added the required arguments.
  • I have submitted or will submit an accompanying paper to the GenBench workshop.
