NotImplementedError running HF model "mlfoundations/dclm-7b-it" for inference  #303

@neginraoof

Description

I am trying to run inference with the HF model "mlfoundations/dclm-7b-it" using the code below:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mlfoundations/dclm-7b-it")
model = AutoModelForCausalLM.from_pretrained("mlfoundations/dclm-7b-it")
inputs = tokenizer("Hello!", return_tensors="pt")  # example prompt
gen_kwargs = {"max_new_tokens": 500, "temperature": 0}
output = model.generate(inputs["input_ids"], **gen_kwargs)

I see this warning when loading the model:
Some weights of OpenLMForCausalLM were not initialized from the model checkpoint at mlfoundations/dclm-7b-it and are newly initialized: [...]

And generation then fails with a NotImplementedError:

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 3, 32, 128) (torch.float32)
     key         : shape=(1, 3, 32, 128) (torch.float32)
     value       : shape=(1, 3, 32, 128) (torch.float32)
     attn_bias   : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
     p           : 0.0
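
For what it's worth, xformers' fused attention ops generally require CUDA tensors, so the float32 CPU tensors shown in the trace have no matching operator to dispatch to. Below is a minimal workaround sketch, assuming a CUDA GPU is available and that OpenLM's attention path accepts half-precision inputs (neither is confirmed by the report above):

import torch
from transformers import AutoModelForCausalLM

# Assumption: a CUDA device is present. Loading the model in fp16 on GPU
# gives memory_efficient_attention_forward inputs it has a kernel for.
model = AutoModelForCausalLM.from_pretrained(
    "mlfoundations/dclm-7b-it",
    torch_dtype=torch.float16,
).to("cuda")
output = model.generate(inputs["input_ids"].to("cuda"), **gen_kwargs)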

I have also tried model = AutoModel.from_pretrained("mlfoundations/dclm-7b-it"), but that model class fails too, with ValueError: Unrecognized configuration class.

Which model class should I use here?
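
One way to check which class a checkpoint expects is to read its config: the architectures field of a Hugging Face config lists the model classes the checkpoint was exported with. A small sketch (that the field names OpenLMForCausalLM is an inference from the loading warning above, not confirmed):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mlfoundations/dclm-7b-it")
# "architectures" names the class the checkpoint was saved with; based on
# the warning above, this is expected to include "OpenLMForCausalLM".
print(config.architectures)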
