Issue 39 support othellogpt in SAELens #317

Conversation
Thanks @decandido and @zhenningdavidliu! This has some overlap with #294. Specifically, I'd like to use the seqpos slicing instead of the start and end pos offsets. Other than that, this looks good, I think! Requests:

Once that's done I'll accept :)
chanind left a comment
If I understand this correctly, these offset parameters only make sense in cases where the model is given sequences of a fixed length, and context_size is set to that length, e.g. OthelloGPT?
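A minimal sketch of that fixed-length situation (all shapes and offset values below are illustrative assumptions, not values from the PR):

    import torch

    # Every sequence has the same length, so absolute position offsets pick out
    # the same game positions in every batch row.
    n_batches, n_context, d_in = 4, 59, 512   # illustrative OthelloGPT-like shapes
    start_pos_offset, end_pos_offset = 5, 5   # illustrative offsets

    acts = torch.randn(n_batches, n_context, d_in)
    trimmed = acts[:, start_pos_offset : n_context - end_pos_offset]
    print(trimmed.shape)  # torch.Size([4, 49, 512])

With variable-length text batches, the same offsets would land on different tokens in each sequence, which is presumably why they only make sense when context_size matches a fixed sequence length.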
    stacked_activations = torch.zeros((n_batches, n_context, 1, self.d_in))
    # For some models, we might want to exclude some positions from the sequence to train on
    training_context_slice = list(
        range(self.start_pos_offset, n_context - self.end_pos_offset)
    )  # continuation of the truncated line, inferred from the review comment below
It's inefficient to create a list of indices like this. It's better to create an actual slice object, like training_context_slice = slice(self.start_pos_offset, n_context - self.end_pos_offset). Alternatively, you can probably just do something like the following directly when selecting the specific indices:
    stacked_activations[:, :, 0] = layerwise_activations[self.hook_name][
        :, start_pos:end_pos, self.hook_head_index
    ]
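To make that suggestion concrete, a minimal sketch (names reused from the snippet above; the shapes and offsets are illustrative assumptions, not values from the PR):

    import torch

    layerwise = torch.randn(8, 59, 512)  # assumed (batch, pos, d_model) shape
    start_pos_offset, end_pos_offset = 5, 5
    n_context = layerwise.shape[1]

    idx_list = list(range(start_pos_offset, n_context - end_pos_offset))
    idx_slice = slice(start_pos_offset, n_context - end_pos_offset)

    # Same values either way, but the list triggers advanced indexing (a copy
    # built through an index tensor), while the slice is basic indexing (a view).
    assert torch.equal(layerwise[:, idx_list], layerwise[:, idx_slice])

The two selections are equal, so switching to the slice object changes nothing semantically while avoiding the extra allocation.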
Hey David, thanks for your thorough review of our PR! As Joseph mentioned in his comment above, we've rebased on Callum's PR and added a few tests of our own. These problems are not present in the rebased code, but your comments are really helpful to @zhenningdavidliu and me! 🙏
Thanks @jbloomAus for reviewing our PR! We've rebased on #294 and removed the start and end pos offsets. We also added a short script to train SAEs on OthelloGPT.
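For reference, a training call after that rebase might look roughly like the sketch below. This is a hedged guess, not the actual script: the seqpos_slice field name comes from the seqpos slicing mentioned in #294, and the model name, hook point, and slice bounds are illustrative assumptions.

    from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

    cfg = LanguageModelSAERunnerConfig(
        model_name="othello-gpt",              # assumed TransformerLens model id
        hook_name="blocks.6.hook_resid_post",  # illustrative hook point
        d_in=512,                              # OthelloGPT residual stream width
        context_size=59,                       # fixed game-transcript length
        seqpos_slice=(5, 54),                  # assumed replacement for the pos offsets
    )
    sae = SAETrainingRunner(cfg).run()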
Hey, thanks for all your awesome work here! Could I possibly bump this PR? Just because it would also be super useful for the final bits of the ARENA material to run smoothly (-:
chanind left a comment
LGTM! Will leave it to Joseph to merge
Thanks! (I'm actually also realizing that the context length issue wasn't part of this PR, it was an earlier one, and that was the main thing making things a bit clunky, so it's less important than I thought - but it would still be great!)
Description
Adding support for OthelloGPT. Adding logic to train on sequences with a start and end position offset, similar to #294. Adding tests for the added logic and to benchmark the OthelloGPT SAETrainingRunner.
Fixes #39
Checklist:
You have tested formatting, typing and unit tests (acceptance tests not currently in use)
make check-ci to check format and linting. (You can run make format to format code if needed.)