

@sedrick-keh-tri

This follows what OLMo does with its HF integration.

This lets us work with HF without having to create new classes in the upstream transformers repo. The integration reads directly from this repo, so we also don't need to worry about keeping it in sync when the OpenLM codebase is updated in the future.
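
Under the hood (following OLMo's hf_olmo approach), the import presumably works by registering the OpenLM config and model classes with HF's Auto* factories. Below is a minimal sketch of what open_lm_hf/__init__.py could contain; the module and class names are illustrative, not necessarily the ones in this PR:

from transformers import AutoConfig, AutoModelForCausalLM

from .configuration_openlm import OpenLMConfig      # hypothetical PretrainedConfig subclass
from .modeling_openlm import OpenLMForCausalLM      # hypothetical PreTrainedModel wrapper

# Register with the Auto* factories so AutoModelForCausalLM.from_pretrained(...)
# can resolve checkpoints whose config declares this model type.
AutoConfig.register("openlm", OpenLMConfig)
AutoModelForCausalLM.register(OpenLMConfig, OpenLMForCausalLM)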

Usage is exactly the same as standard HF usage, except for the additional import on the first line.

from open_lm_hf import *  # makes the OpenLM model classes visible to HF's Auto* loaders

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tri-ml/openlm-7b-300b")
model = AutoModelForCausalLM.from_pretrained("tri-ml/openlm-7b-300b")

inputs = tokenizer("Hi, nice to meet you.", return_tensors="pt")
out = model.to("cuda").generate(inputs["input_ids"].to("cuda"))
print(tokenizer.decode(out[0]))

Some things not implemented yet:

  • Some extra HF functions, e.g. resize_token_embeddings.
  • The HF forward output (CausalLMOutputWithPast) usually includes the full hidden states, but OpenLM's forward doesn't return them, so hidden_states is left as None for now.
  • There's also an `if labels is not None:` block that was copied from OLMo and hasn't been tested (see the sketch after this list).
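
For concreteness, here is a minimal sketch of what the HF-facing forward could look like, written as a free function over an open_lm model rather than the actual class in this PR. It assumes open_lm's forward returns (logits, final_hidden, past_key_values), and the labels branch mirrors the untested OLMo-derived loss code:

from torch.nn import CrossEntropyLoss
from transformers.modeling_outputs import CausalLMOutputWithPast

def openlm_hf_forward(open_lm_model, input_ids, labels=None, past_key_values=None):
    # Assumption: open_lm's forward returns (logits, final_hidden, past_key_values);
    # check the real signature before relying on this.
    logits, _, past_key_values = open_lm_model(input_ids, past_key_values=past_key_values)

    loss = None
    if labels is not None:
        # Standard causal-LM loss: shift so that tokens < n predict token n
        # (mirrors the untested OLMo-derived block mentioned above).
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        loss = CrossEntropyLoss()(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
        )

    return CausalLMOutputWithPast(
        loss=loss,
        logits=logits,
        past_key_values=past_key_values,
        hidden_states=None,  # OpenLM's forward doesn't return per-layer hidden states
    )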

@sedrick-keh-tri

The HF model repo's config.json should look something like this:

{
  "dim": 4096,
  "n_layers": 32, 
  "n_heads": 32, 
  "vocab_size": 50432,
  "norm_eps": 1e-5,
  "seq_len": 2048,
  "weight_tying": false,
  "apply_qk_norm": true,
  "norm_type": "gain_only_lp_layer_norm",
  "positional_embedding_type": "rotary",
  "ffn_type": "swiglu"
}

These correspond to attributes in the Params class.
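
As a rough illustration of that mapping, assuming Params lives in open_lm.model and accepts these fields as constructor keywords (the helper below is hypothetical, not code from this PR):

import inspect
import json

from open_lm.model import Params  # assumed location of the Params class

def params_from_hf_config(config_path="config.json"):
    # Keep only keys that Params actually accepts, so extra HF bookkeeping fields
    # are ignored. Fields that need conversion (e.g. a norm_type string mapping
    # to a norm-layer class) would still need special handling.
    with open(config_path) as f:
        cfg = json.load(f)
    accepted = set(inspect.signature(Params).parameters)
    return Params(**{k: v for k, v in cfg.items() if k in accepted})

params = params_from_hf_config()
print(params.dim, params.n_layers, params.seq_len)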
