Add reorder cache for beam search #526

Open: wants to merge 4 commits into base: main
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased
- Added _reorder_cache function to enable beam search when generating with HF

### Changed

11 changes: 9 additions & 2 deletions hf_olmo/modeling_olmo.py
@@ -133,8 +133,15 @@ def prepare_inputs_for_generation(
    # def get_position_embeddings(self) -> Union[nn.Embedding, Tuple[nn.Embedding]]:
    #     pass
    #
    # def _reorder_cache(self, past_key_values, beam_idx):
    #     pass

    def _reorder_cache(self, past_key_values: Tuple[Tuple[torch.Tensor]], beam_idx: torch.Tensor
    ) -> Tuple[Tuple[torch.Tensor]]:
        reordered_past = ()
        for layer_past in past_key_values:
            reordered_past += (
                tuple(past_state.index_select(0, beam_idx.to(past_state.device)) for past_state in layer_past),
            )
        return reordered_past

    def get_input_embeddings(self) -> torch.nn.Module:
        return self.model.transformer.wte
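
For readers unfamiliar with what the new `_reorder_cache` does, here is a minimal, dependency-free sketch of the same reordering. It is not the PR's code: nested Python lists stand in for `torch.Tensor`, and the helper names `select_rows` and `reorder_cache` are hypothetical. The assumption is that axis 0 of every cached state is the beam axis, so `index_select(0, beam_idx)` amounts to picking rows in the order given by `beam_idx`.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a tensor: a list whose first axis is the beam axis.
BeamRows = List[List[float]]


def select_rows(state: BeamRows, beam_idx: Sequence[int]) -> BeamRows:
    """Mimic tensor.index_select(0, beam_idx): pick rows along axis 0."""
    return [state[i] for i in beam_idx]


def reorder_cache(
    past_key_values: Sequence[Sequence[BeamRows]], beam_idx: Sequence[int]
) -> Tuple[Tuple[BeamRows, ...], ...]:
    """Reorder every cached state in every layer to follow the chosen beams."""
    return tuple(
        tuple(select_rows(past_state, beam_idx) for past_state in layer_past)
        for layer_past in past_key_values
    )


# One layer, three beams; each row is that beam's cached state.
keys = [[0.0], [1.0], [2.0]]
values = [[10.0], [11.0], [12.0]]
past = ((keys, values),)

# Suppose beam search keeps beam 2 twice and beam 0 once.
reordered = reorder_cache(past, beam_idx=[2, 2, 0])
print(reordered[0][0])  # [[2.0], [2.0], [0.0]]
print(reordered[0][1])  # [[12.0], [12.0], [10.0]]
```

The same index list is applied to every state in every layer, which keeps each beam's keys and values aligned with the hypothesis it now continues.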
1 change: 1 addition & 0 deletions scripts/add_code_eval.py
@@ -1,6 +1,7 @@
"""
Script to create perplexity eval datasets for code.
"""

import os

import pandas as pd
1 change: 1 addition & 0 deletions scripts/init_config.py
@@ -1,6 +1,7 @@
"""
Run this to initialize a new training config to a file.
"""

import logging
import sys
from pathlib import Path
1 change: 1 addition & 0 deletions scripts/show_model_size.py
@@ -7,6 +7,7 @@
python scripts/show_model_size.py train_config.yaml
```
"""

import logging
import sys
