I tried madlad400, but there is a problem with the output if it is float16 #980

Open
otmb opened this issue Sep 7, 2024 · 5 comments

@otmb

otmb commented Sep 7, 2024

Hi,

I tried madlad400. With the default dtype the translation is fine, but with float16 the output is nothing but <unk> tokens.

$ python convert.py --model google/madlad400-3b-mt
$ python t5.py --model google/madlad400-3b-mt --prompt "<2ja>A tasty apple"

[INFO] Generating with T5...
Input:  <2ja>A tasty apple
リンゴの味
Time: 20.28 seconds, tokens/s: 0.30

$ python convert.py --model google/madlad400-3b-mt --dtype float16
$ python t5.py --model google/madlad400-3b-mt --dtype float16 --prompt "<2ja>A tasty apple"

[INFO] Generating with T5...
Input:  <2ja>A tasty apple
<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>
Time: 25.66 seconds, tokens/s: 3.90

Thanks.

@awni
Member

awni commented Sep 7, 2024

Indeed, the T5 models typically don't work well in fp16. They probably need some kind of activation clipping or rescaling to fix this. mx.bfloat16 should work, though.
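To make the fp16 versus bfloat16 difference concrete, here is a minimal NumPy sketch of the dynamic-range gap. It is an illustration only, not code from the T5 example: the activation values are hypothetical, and `to_bfloat16` is a helper written for this sketch that emulates bfloat16 by truncating a float32 to its top 16 bits (real hardware typically rounds instead).

```python
import numpy as np

def to_bfloat16(x):
    """Emulate bfloat16 by zeroing the low 16 bits of each float32 (truncation)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

# Hypothetical activation values; the last one exceeds the float16 range.
activations = np.array([1.0, 65504.0, 1.0e5], dtype=np.float32)

with np.errstate(over="ignore"):
    fp16 = activations.astype(np.float16)  # 8-bit-narrower exponent than float32
bf16 = to_bfloat16(activations)            # same exponent width as float32

print(np.finfo(np.float16).max)  # largest finite float16 value
print(fp16)                      # the last entry overflows to inf
print(bf16)                      # all entries stay finite, with less precision

# Clipping to the float16 range before the cast is one possible workaround.
clipped = np.clip(activations, -65504.0, 65504.0).astype(np.float16)
print(clipped)
```

Once a value becomes inf, later operations can turn it into NaN, which is one plausible way a model ends up emitting only `<unk>`. Whether that is what happens in this particular model has not been verified here.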

@otmb
Author

otmb commented Sep 7, 2024

Thanks.
It worked fine with bfloat16.

$ python convert.py --model google/madlad400-3b-mt --dtype float16
$ python t5.py --model google/madlad400-3b-mt --dtype bfloat16 --prompt "<2ja>A tasty apple"

[INFO] Generating with T5...
Input:  <2ja>A tasty apple
リンゴの味
Time: 18.48 seconds, tokens/s: 0.32

My machine is low on memory, so it is swapping, which is why it's slow. Haha.

The file is still large, though, so it would be nice if the model could also be used in int8.

$ ls -lah google-madlad400-3b-mt.npz
6.5G  google-madlad400-3b-mt.npz
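As a rough illustration of what int8 storage would buy, here is a generic NumPy sketch of symmetric per-tensor int8 quantization. It is not how the T5 example or MLX handles quantization; `quantize_int8` and `dequantize_int8` are hypothetical helpers, and the random matrix is a stand-in for a real weight tensor.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization so that w is approximately scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # stand-in weight matrix

q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

print(q.nbytes / w.astype(np.float16).nbytes)  # 0.5: half the size of a 16-bit copy
print(float(np.abs(w - w_hat).max()))          # worst-case rounding error
```

Under this scheme the weights take one byte each instead of two, at the cost of a rounding error of up to about half the scale per weight. Practical schemes usually quantize in small groups with separate scales to keep that error lower.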

Thanks.

@HydrogenBombaklot

Would you mind uploading your MLX conversion of madlad-400 to HF?

@otmb
Author

otmb commented Oct 16, 2024

Please don't worry about it

@HydrogenBombaklot

No, I mean I want to use it, and it would be easier if it were on HF.
