Hi @FranxYao, if I use your code to evaluate the Llama-3.2-3B model, specifically:
```python
scaling_factor = 10  # hardcode
reset_rope(self.model_to_test, model_max_train_len=81920, scaling_factor=scaling_factor)
```
It throws the following error:
```
AttributeError: 'LlamaRotaryEmbedding' object has no attribute '_set_cos_sin_cache'
```
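For reference, the attribute seems to be missing because newer transformers versions compute the RoPE cos/sin on the fly inside `LlamaRotaryEmbedding.forward` rather than keeping a precomputed cache, so there is nothing to reset. Below is a minimal, version-tolerant sketch of guarding that call; the argument names come from the snippet above, while the `maybe_reset_rope` helper and the layer traversal are assumptions about the model internals and are not equivalent to the repo's actual `reset_rope` logic:

```python
import torch

def maybe_reset_rope(model, model_max_train_len, scaling_factor):
    # Sketch only: attribute paths assume the older LlamaAttention layout where
    # each decoder layer owns a `rotary_emb` with a `_set_cos_sin_cache` method.
    for layer in model.model.layers:
        rotary = getattr(layer.self_attn, "rotary_emb", None)
        if rotary is None or not hasattr(rotary, "_set_cos_sin_cache"):
            # Newer transformers build cos/sin on the fly in forward(), so there
            # is no cache to rebuild; RoPE scaling comes from config.rope_scaling.
            continue
        rotary._set_cos_sin_cache(
            seq_len=model_max_train_len * scaling_factor,
            device=rotary.inv_freq.device,
            dtype=torch.get_default_dtype(),
        )
```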
So if I comment this part out, then I get the following results:

![Llama-3.2-3B needle-in-a-haystack results](https://private-user-images.githubusercontent.com/22093105/384556021-7c620ee8-9acb-44b6-8ead-15e100f160a3.png)
This is unexpected, as the Llama-3.2-3B model claims to support a context length of up to 128K. Do you also get this error, or how do you handle it?
What would be the correct way to evaluate the Llama-3.2-3B model?
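One way to double-check the advertised 128K window is to inspect the checkpoint's own config (a minimal sketch; `<path>/Llama-3.2-3B` is the placeholder path from the download step below):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained('<path>/Llama-3.2-3B')
print(cfg.max_position_embeddings)          # advertised context window of the checkpoint
print(getattr(cfg, 'rope_scaling', None))   # Llama 3.2 ships its own RoPE scaling entry
```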
I downloaded the Llama model using:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id='meta-llama/Llama-3.2-3B',
    local_dir='<path>/Llama-3.2-3B',
    repo_type='model',
    local_dir_use_symlinks=False,
)
```
And the run command is:
```bash
(
python -u needle_in_haystack.py --s_len 0 --e_len 128000 \
    --model_provider LLaMA \
    --model_path <path>/Llama-3.2-3B
) 2>&1 | tee logs/Llama-3_2-3B.log
```