I am trying to use the HF model "mlfoundations/dclm-7b-it" for inference, using the code below:
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("mlfoundations/dclm-7b-it")
model = AutoModelForCausalLM.from_pretrained("mlfoundations/dclm-7b-it")
inputs = tokenizer("Hello", return_tensors="pt")
gen_kwargs = {"max_new_tokens": 500, "temperature": 0}
output = model.generate(inputs["input_ids"], **gen_kwargs)
I see this warning when loading the model:
Some weights of OpenLMForCausalLM were not initialized from the model checkpoint at mlfoundations/dclm-7b-it and are newly initialized: [...]
Generation then fails with a NotImplementedError:
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 3, 32, 128) (torch.float32)
key : shape=(1, 3, 32, 128) (torch.float32)
value : shape=(1, 3, 32, 128) (torch.float32)
attn_bias : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p : 0.0
I have also tried model = AutoModel.from_pretrained("mlfoundations/dclm-7b-it"), but this model class also fails with ValueError: Unrecognized configuration class.
Which model class should I use here?
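For context, one workaround I am considering — under the assumption that the NotImplementedError comes from xformers having no float32 memory-efficient-attention kernel on my setup (its CUDA kernels generally expect fp16/bf16) — is to pick the dtype and device accordingly before loading. A minimal sketch:

```python
import torch
from transformers import AutoModelForCausalLM  # used in the sketched call below

# Assumption: xformers' memory_efficient_attention has fp16/bf16 CUDA
# kernels but often no float32 kernel, which would explain the
# NotImplementedError above.
use_cuda = torch.cuda.is_available()
dtype = torch.float16 if use_cuda else torch.float32
device = "cuda" if use_cuda else "cpu"

# The actual load, kept as a sketch (it downloads ~7B parameters):
# model = AutoModelForCausalLM.from_pretrained(
#     "mlfoundations/dclm-7b-it", torch_dtype=dtype
# ).to(device)
```

This is only a guess at the cause, not a confirmed fix — hence the original question about which model class (and dtype/device combination) is intended.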