attention_adapter.params.grad is None #28
Comments
The code should run without problems on a single GPU with the default settings; the issue may be caused by using pipeline parallelism.
I am using a single GPU. When I first ran it, the program raised an error. I then found that the loss computed by `loss = F.cross_entropy(output['logits'], label)` has `requires_grad=False`.
Perhaps `torch.no_grad`/`inference_mode` is set somewhere. You could also try setting `params.requires_grad=True` at the point where `params` is multiplied with the attention scores. Another possibility: if you are using flash attention, it can cause this problem (the current code does not support it; the latest flash attention seems to support multiplying in a mask and running backward, so you can adapt it yourself if needed).
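The symptom and fix described above can be sketched in a minimal, self-contained example. This is not the repository's actual code: `params` is a hypothetical stand-in for `attention_adapter.params`, and `scores` stands in for the attention logits it gets multiplied into.

```python
import torch
import torch.nn.functional as F

# Stand-in for attention_adapter.params; a freshly created tensor
# has requires_grad=False by default.
params = torch.ones(1)
scores = torch.randn(2, 4)
label = torch.tensor([1, 3])

# Symptom: with params detached from autograd (same effect as running
# the forward pass under torch.no_grad()/inference_mode, or as flash
# attention bypassing the params multiplication), the loss carries no
# graph, so backward() can never populate params.grad.
loss = F.cross_entropy(scores * params, label)
print(loss.requires_grad)  # False

# Fix: make params a leaf tensor that requires grad *before* the forward.
params.requires_grad_(True)
loss = F.cross_entropy(scores * params, label)
loss.backward()
print(params.grad is not None)  # True
```

The key point is that `requires_grad` must be set before the tensor participates in the forward computation; flipping it afterwards does not retroactively build the graph.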
It was a flash attention problem; I created a new environment and that solved it. Thanks!
Could you share the configuration of the new environment? I am running into a similar problem now.
I configured it following requirements.txt, with some slight changes:
Hello, author! When running the source file attention_attr.py, I get the following error:
File "/aiarena/gpfs/label-words-are-anchors/attention_attr.py", line 144, in
saliency = attentionermanger.grad(use_abs=True)[i]
File "/aiarena/gpfs/label-words-are-anchors/icl/analysis/attentioner_for_attribution.py", line 104, in grad
grads.append(self.grad_process(attention_adapter.params.grad,*args,**kwargs))
AttributeError: 'NoneType' object has no attribute 'grad'
What could be causing this?