Architecture Requests for Mamba #1030

Open
hg0428 opened this issue Oct 10, 2024 · 3 comments

hg0428 commented Oct 10, 2024

I would like support for the following architectures:

  • Mamba
  • MambaByte
  • Mamba-2
  • Mamba-hybrid (mamba + transformer)
  • Mamba-2-hybrid (mamba2 + transformer)

These architectures are becoming quite common now and are supported by most major LLM libraries.

@hg0428 hg0428 changed the title Architecture Requests: Mamba, Mamba-2,MambaByte, Mamba-hybrid, Mamba-2-hybrid Architecture Requests for Mamba Oct 10, 2024
@awni awni transferred this issue from ml-explore/mlx Oct 10, 2024
awni (Member) commented Oct 10, 2024

We have Mamba in MLX LM already, and there is a PR for Mamba 2 (#1009).

As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

hg0428 (Author) commented Oct 10, 2024

> We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).
>
> As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

Mamba: https://huggingface.co/tiiuae/falcon-mamba-7b
Mamba-2: https://huggingface.co/state-spaces/mamba2-2.7b
MambaByte: https://huggingface.co/JunxiongWang/MambaByte_Books
Mamba-Hybrid: https://huggingface.co/Zyphra/Zamba-7B-v1
Mamba2-Hybrid: https://huggingface.co/Zyphra/Zamba2-2.7B-instruct
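
For anyone triaging repos like the ones above, here is a minimal sketch of how one might check which architecture a repo declares. It assumes the repo ships a transformers-style `config.json` with a `model_type` field, which not every repo does. The `SUPPORTED` set and the `architecture_status` helper are hypothetical names for illustration and are not part of MLX LM; the only entry taken from this thread is that Mamba is already supported.

```python
import json
from typing import FrozenSet, Tuple

# Hypothetical allow-list for illustration only. Per this thread, plain Mamba
# is already supported in MLX LM; everything else is left out on purpose.
SUPPORTED: FrozenSet[str] = frozenset({"mamba"})


def architecture_status(
    config_text: str, supported: FrozenSet[str] = SUPPORTED
) -> Tuple[str, bool]:
    """Return (model_type, is_supported) given the text of a config.json.

    Falls back to "unknown" when the config has no `model_type` field.
    """
    config = json.loads(config_text)
    model_type = config.get("model_type", "unknown")
    return model_type, model_type in supported


if __name__ == "__main__":
    # Made-up config snippets, not copied from any of the repos listed above.
    print(architecture_status('{"model_type": "mamba", "hidden_size": 768}'))
    print(architecture_status('{"model_type": "mamba2"}'))
    print(architecture_status('{"d_model": 2560, "n_layer": 64}'))
```

Running a check like this against each repo's downloaded `config.json` gives a quick list of which model types would still need a new implementation.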

hg0428 (Author) commented Oct 15, 2024

> We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).
>
> As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

Zamba2 7b was just released. One of the best models of its size, it outperforms Llama3.2 11b and Mistral 7b in almost every benchmark.
It is a Mamba2-hybrid model.
https://www.zyphra.com/post/zamba2-7b
