Weak-to-strong generalization

Figure: our setup and how it relates to superhuman AI alignment.

This repository extends OpenAI's implementation of the weak-to-strong learning setup for binary classification tasks, described in their paper on weak-to-strong generalization.

Running the Script

The project's main script is train_weak_to_strong.py. It can be run from the command line with:

python train_weak_to_strong.py

The script accepts several command-line arguments to customize the training process. For example:

python train_weak_to_strong.py --batch_size 32 --max_ctx 512 --ds_name "sciq" --loss "logconf" --n_docs 1000 --n_test_docs 100 --weak_model_size "gpt2-medium" --strong_model_size "gpt2-large" --seed 42
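The --loss "logconf" option selects the auxiliary-confidence loss from the weak-to-strong generalization paper, which mixes cross-entropy against the weak supervisor's labels with cross-entropy against the strong model's own hardened predictions. The sketch below illustrates the idea only; the function name and the fixed alpha weight are assumptions, and the repository's actual implementation may differ (for example, by scheduling the weight over training).

import torch
import torch.nn.functional as F

def logconf_loss(strong_logits, weak_soft_labels, alpha=0.5):
    # strong_logits: (batch, 2) logits from the strong student model.
    # weak_soft_labels: (batch, 2) probabilities produced by the weak supervisor.
    # alpha: weight on the self-confidence term (hypothetical fixed value; the
    # real implementation may ramp this up over training).
    log_probs = F.log_softmax(strong_logits, dim=-1)
    # Cross-entropy against the weak supervisor's soft labels.
    weak_ce = -(weak_soft_labels * log_probs).sum(dim=-1)
    # Cross-entropy against the strong model's own hardened (argmax) predictions,
    # which lets the student confidently disagree with noisy weak labels.
    hardened = F.one_hot(strong_logits.argmax(dim=-1), num_classes=2).float()
    self_ce = -(hardened * log_probs).sum(dim=-1)
    return ((1 - alpha) * weak_ce + alpha * self_ce).mean()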

Planned experiments in weak-to-strong generalization:

  • add Pythia configs (see the sketch below)
  • try a chess-puzzle experiment
  • plot scaling laws
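As a rough illustration of the first item, new model sizes could be registered by adding config entries for EleutherAI's Pythia checkpoints alongside the GPT-2 sizes used in the example command above. The ModelConfig class and its fields below are assumptions for the sketch and may not match the names used in train_weak_to_strong.py.

from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Assumed shape of a model-config entry; the actual class in the repo may differ.
    name: str              # Hugging Face model identifier
    default_lr: float      # learning rate used when none is given on the CLI
    eval_batch_size: int   # batch size used during evaluation

# Hypothetical Pythia entries spanning several scales, so weak/strong pairs
# can be swept the same way as the GPT-2 sizes.
PYTHIA_CONFIGS = [
    ModelConfig("EleutherAI/pythia-160m", default_lr=2e-5, eval_batch_size=32),
    ModelConfig("EleutherAI/pythia-410m", default_lr=1e-5, eval_batch_size=32),
    ModelConfig("EleutherAI/pythia-1b", default_lr=1e-5, eval_batch_size=16),
]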
