
Error in Llama-3.2-3B needle evaluation #18

Open
prakamya-mishra opened this issue Nov 9, 2024 · 0 comments
Hi @FranxYao, if I use your code for evaluating the Llama-3.2-3B model, specifically:

scaling_factor = 10 # hardcode
reset_rope(self.model_to_test, model_max_train_len=81920, scaling_factor=scaling_factor)

It throws the following error:

AttributeError: 'LlamaRotaryEmbedding' object has no attribute '_set_cos_sin_cache'

So if I comment this part out then I get the following results:
[results image: Llama-3.2-3B]

This is unexpected, as the Llama-3.2-3B model claims to support a context length of up to 128K. Do you also get this error, or how do you handle it?

What should be the correct way to evaluate the Llama-3.2-3B model?
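For context, here is a workaround I considered but have not verified. If I understand the transformers refactor correctly, in newer releases (roughly >= 4.43) `LlamaRotaryEmbedding` no longer exposes `_set_cos_sin_cache`, and RoPE scaling is instead read from `config.rope_scaling`. Would overriding the config be the right replacement for `reset_rope`? A sketch (the `linear` rope type and factor of 10 mirror the hardcoded `scaling_factor` above; untested):

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_path = '<path>/Llama-3.2-3B'

# Newer transformers read RoPE scaling from the model config rather than
# exposing _set_cos_sin_cache on the rotary-embedding module, so the
# override would go through the config instead of reset_rope().
config = AutoConfig.from_pretrained(model_path)
config.rope_scaling = {"rope_type": "linear", "factor": 10.0}

model = AutoModelForCausalLM.from_pretrained(model_path, config=config)
```

But I am not sure this is equivalent to what `reset_rope` does with `model_max_train_len=81920`.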

I downloaded the llama model using:

from huggingface_hub import snapshot_download

snapshot_download(repo_id='meta-llama/Llama-3.2-3B',
                  local_dir='<path>/Llama-3.2-3B',
                  repo_type='model',
                  local_dir_use_symlinks=False)

And run command is:

(
python -u needle_in_haystack.py --s_len 0 --e_len 128000 \
    --model_provider LLaMA \
    --model_path <path>/Llama-3.2-3B
) 2>&1 | tee logs/Llama-3_2-3B.log
