Architecture Requests for Mamba #1030

hg0428 · 2024-10-10T14:06:32Z

I would like support for the following architectures:

Mamba
MambaByte
Mamba-2
Mamba-hybrid (mamba + transformer)
Mamba-2-hybrid (mamba2 + transformer)

These architectures are becoming quite common now and are supported by most major LLM libraries.

awni · 2024-10-10T14:35:07Z

We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).

As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

hg0428 · 2024-10-10T14:38:01Z

> We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).
>
> As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

Mamba: https://huggingface.co/tiiuae/falcon-mamba-7b
Mamba-2: https://huggingface.co/state-spaces/mamba2-2.7b
MambaByte: https://huggingface.co/JunxiongWang/MambaByte_Books
Mamba-Hybrid: https://huggingface.co/Zyphra/Zamba-7B-v1
Mamba2-Hybrid: https://huggingface.co/Zyphra/Zamba2-2.7B-instruct

hg0428 · 2024-10-15T12:15:54Z

> We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).
>
> As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

Zamba2 7b was just released. It is one of the best models of its size, outperforming Llama3.2 11b and Mistral 7b on almost every benchmark.
It is a Mamba2-hybrid model.
https://www.zyphra.com/post/zamba2-7b

hg0428 changed the title ~~Architecture Requests: Mamba, Mamba-2,MambaByte, Mamba-hybrid, Mamba-2-hybrid~~ Architecture Requests for Mamba Oct 10, 2024

awni transferred this issue from ml-explore/mlx Oct 10, 2024

hg0428 mentioned this issue Nov 10, 2024 in "Support for AI21 Jamba-1.5 #1097" (Open)
