Activity

. revert wmdp4, and smaller n for a better plot

filyp pushed 1 commit to main • 37b77b8…7797686 • 2 days ago

narrow down unlearning_rate ranges, and standardize a bit

filyp pushed 1 commit to main • b2a7a25…37b77b8 • 3 days ago

for wmdp, tune retaining_rate

filyp pushed 1 commit to main • 1c3f991…b2a7a25 • 3 days ago

wmdp4 -> wmdp5

filyp pushed 1 commit to main • 13372f9…1c3f991 • 3 days ago

redo plots for the wmdp with controlled MMLU and both temp 1

filyp pushed 1 commit to main • aa33636…13372f9 • 5 days ago

. fix wmdp wandb name

filyp pushed 1 commit to main • a549a81…aa33636 • 9 days ago

wider wdmp unl rate search

filyp pushed 1 commit to main • 7e9d420…a549a81 • 9 days ago

use temp=1, create new baseline, allow 2% drop in accuracy due to som…

filyp pushed 1 commit to main • 93ef0a8…7e9d420 • 9 days ago

use temp=1, to reduce noise

filyp pushed 1 commit to main • fe811da…93ef0a8 • 9 days ago

. add some info

filyp pushed 1 commit to main • 3297d5f…fe811da • 9 days ago

use full wmdp dataset

filyp pushed 2 commits to main • b89399e…3297d5f • 9 days ago

add mmlu eval

filyp pushed 2 commits to main • 42bdadb…b89399e • 9 days ago

. fix loop definition

filyp pushed 1 commit to main • c229c3b…42bdadb • 16 days ago

simplify loop definition slightly

filyp pushed 1 commit to main • 7f406c4…c229c3b • 16 days ago

reproducing instructions

filyp pushed 1 commit to main • 513eee0…7f406c4 • on Feb 17

instructions to reproduce

filyp pushed 1 commit to main • 69c14e5…513eee0 • on Feb 17

Create LICENSE

filyp pushed 1 commit to main • c0a7754…69c14e5 • on Feb 15

.actually fix target modules plot

filyp pushed 1 commit to main • 17e2aea…c0a7754 • on Feb 14

.fix target plot basilines and trimmed text

filyp pushed 1 commit to main • d242975…17e2aea • on Feb 14

.typo

filyp pushed 1 commit to main • c7b30e3…d242975 • on Feb 14

fix cruelty baselines

filyp pushed 1 commit to main • 855d7a8…c7b30e3 • on Feb 14

. SIU -> MUDMAN

filyp pushed 1 commit to main • 3e727a9…855d7a8 • on Feb 14

use % in wmdp

filyp pushed 1 commit to main • 1ea6c53…3e727a9 • on Feb 13

invert wmdp plot

filyp pushed 1 commit to main • 41684df…1ea6c53 • on Feb 12

.wmdp optuna plots

filyp pushed 1 commit to main • e4eea0c…41684df • on Feb 9

.nicer target modules plot

filyp pushed 1 commit to main • b314bae…e4eea0c • on Feb 8

wmdp plot

filyp pushed 1 commit to main • 6d2d55f…b314bae • on Feb 8

improve 2x3 plots and optuna plots

filyp pushed 1 commit to main • 3494597…6d2d55f • on Feb 8

2x3 plot, not 3x2

filyp pushed 1 commit to main • a3b18f4…3494597 • on Feb 8

compare to neg entropy ablations

filyp pushed 1 commit to main • c193c98…a3b18f4 • on Feb 8