
attention_adapter.params.grad is None #28

Open
CoverZhao opened this issue Aug 22, 2024 · 6 comments

@CoverZhao

Hello, author! When I run the source file attention_attr.py, I get the following error:
File "/aiarena/gpfs/label-words-are-anchors/attention_attr.py", line 144, in
saliency = attentionermanger.grad(use_abs=True)[i]
File "/aiarena/gpfs/label-words-are-anchors/icl/analysis/attentioner_for_attribution.py", line 104, in grad
grads.append(self.grad_process(attention_adapter.params.grad,*args,**kwargs))
AttributeError: 'NoneType' object has no attribute 'grad'
What could be causing this?
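
For reference, params.grad is only populated when params requires grad and the backward pass actually reaches it. A minimal diagnostic sketch, assuming only the attention_adapter.params name from the traceback above; the helper itself is hypothetical and not part of the repository:

import torch

def check_adapter(attention_adapter):
    # Hypothetical debugging helper: inspect the tensor named in the traceback.
    p = attention_adapter.params
    if p is None:
        print("params was never created during the forward pass")
        return
    print("requires_grad:", p.requires_grad)  # must be True for a grad to accumulate
    print("is_leaf:", p.is_leaf)              # non-leaf tensors drop .grad unless retain_grad() is called
    print("grad is None:", p.grad is None)    # stays None until backward reaches this tensor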

@leanwang326
Collaborator

The code should run without problems on a single GPU with the default settings; the error may come from using pipeline parallelism.

@CoverZhao
Author

The code should run without problems on a single GPU with the default settings; the error may come from using pipeline parallelism.

I am using a single GPU. When I first ran the code directly, it raised this error:
Traceback (most recent call last):
File "/aiarena/gpfs/label-words-are-anchors/attention_attr.py", line 143, in
loss.backward()
File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/torch/autograd/init.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Then I found that the loss computed with loss = F.cross_entropy(output['logits'], label) has requires_grad set to False.
So I added a line loss.requires_grad = True, and on the next run I got the attention_adapter.params.grad is None error described above.
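
This behaviour can be reproduced in isolation: forcing requires_grad=True on a loss that has no grad_fn only turns the loss into a fresh leaf tensor, so backward() never reaches the model or the adapter params. A minimal sketch, using torch.no_grad() as a stand-in for whatever prevented the graph from being recorded in the real run:

import torch

w = torch.randn(3, requires_grad=True)   # stands in for attention_adapter.params
with torch.no_grad():                    # no graph is recorded here
    loss = (w * 2).sum()

print(loss.grad_fn)                      # None: the graph was never built
loss.requires_grad = True                # the "fix" described above
loss.backward()                          # runs, but only touches the leaf loss itself
print(w.grad)                            # still None, matching the reported error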

@leanwang326
Collaborator

Maybe torch.no_grad/inference_mode is set somewhere. Or try setting params.requires_grad=True at the point where params is multiplied with the attention weights. Or, if you are using flash_attention, that could be the problem (the current code does not support it; the latest flash attention seems to support multiplying by a mask and the backward pass, so you could adapt it yourself if you need it).
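
Two of these suggestions can be written down concretely. A hedged sketch rather than the repository's own code: the checkpoint name is a placeholder, and attn_implementation is a from_pretrained argument available in recent transformers releases (4.36+), which forces the standard attention path instead of flash attention / SDPA:

import torch
from transformers import AutoModelForCausalLM

# If this assert fails where the forward pass runs, a surrounding
# torch.no_grad() / torch.inference_mode() is the culprit.
assert torch.is_grad_enabled()

# Force the eager attention implementation so the attention weights are
# materialized and the adapter's multiplication stays inside autograd.
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",                       # placeholder checkpoint, not the one from the paper's scripts
    attn_implementation="eager",
)

# When the adapter's params are created, make them require grad so the
# multiplication with the attention weights is recorded (shape illustrative only).
params = torch.ones(1, model.config.num_attention_heads, 1, 1, requires_grad=True)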

@CoverZhao
Author

Maybe torch.no_grad/inference_mode is set somewhere. Or try setting params.requires_grad=True at the point where params is multiplied with the attention weights. Or, if you are using flash_attention, that could be the problem (the current code does not support it; the latest flash attention seems to support multiplying by a mask and the backward pass, so you could adapt it yourself if you need it).

It was the flash attention problem. I set up a new environment and that solved it, thanks!

@lilhongxy

Could you share the configuration of the new environment? I am running into a similar problem now.

@CoverZhao
Author

Could you share the configuration of the new environment? I am running into a similar problem now.

I followed the configuration in requirements.txt, with a few small changes:
datasets
ipython==8.11.0
matplotlib==3.7.1
numpy
seaborn==0.12.2
tqdm==4.65.0
transformers==4.37.0
Then I installed torch directly from the official website.
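
For anyone reproducing the fix, a small sanity check of the fresh environment; the assumption, following the thread above, is simply that flash-attn must not be picked up:

import importlib.util
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
# The fix in this thread amounts to flash-attn not being installed, so the
# model falls back to the ordinary attention implementation.
print("flash_attn installed:", importlib.util.find_spec("flash_attn") is not None)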
