Describe the bug
I get an error when running a forward pass through the SD3 transformer (SD3Transformer2DModel) with xFormers attention enabled.
Line 208 of attention.py unpacks two return values from self.attn(...), but only one value is available to unpack:
  File "/home/p2p/pytorch/mc3/envs/marigold/lib/python3.10/site-packages/diffusers/src/diffusers/models/attention.py", line 208, in forward
    attn_output, context_attn_output = self.attn(
ValueError: not enough values to unpack (expected 2, got 1)
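To illustrate what I think is happening, here is a minimal sketch in plain Python with no diffusers or torch involved. The functions `joint_attention`, `single_output_attention`, and `unpack_two` are hypothetical stand-ins, not library code. The assumption, which I have not verified against the library source, is that the attention call returns a single batched output instead of a pair, so unpacking iterates over the batch dimension (size 1 here) and finds only one item.

```python
# Hypothetical stand-ins for illustration only; none of these are diffusers functions.

def joint_attention(hidden, context):
    """Returns two outputs, which is what the caller at attention.py line 208 expects."""
    return hidden, context


def single_output_attention(hidden, context):
    """Returns one batched output; the outer list mimics a tensor with batch size 1."""
    return [hidden]


def unpack_two(attn, hidden, context):
    """Mirrors the failing call pattern: unpack two values from a single call."""
    attn_output, context_attn_output = attn(hidden, context)
    return attn_output, context_attn_output


hidden, context = [0.1, 0.2], [0.3, 0.4]

# Works: the callable returns a pair.
ok = unpack_two(joint_attention, hidden, context)

# Fails: unpacking iterates the length-1 outer container and finds one item.
try:
    unpack_two(single_output_attention, hidden, context)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If that assumption holds, it would also explain why the message says "got 1": the count would track the batch size rather than the number of return values.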
Reproduction
https://colab.research.google.com/drive/1CkgjIaaClKUk4ZC-g_RR578BfFE24gps#scrollTo=1xcDHPHd56WH
import torch
from diffusers.models.transformers import SD3Transformer2DModel

device = torch.device("cuda")

# Random stand-ins for the latents, timestep, and text embeddings
cat_latents = torch.randn(1, 16, 128, 128).to(device)
timesteps = torch.tensor([453.9749]).to(device)
prompt_embeds = torch.randn(1, 154, 4096).to(device)
pooled_prompt_embeds = torch.randn(1, 2048).to(device)

# Model built from the default config, with xFormers attention enabled
model = SD3Transformer2DModel().to(device)
model.enable_xformers_memory_efficient_attention()

# This forward pass fails with the ValueError reported in this issue
model_pred = model(
    hidden_states=cat_latents,
    timestep=timesteps,
    encoder_hidden_states=prompt_embeds,
    pooled_projections=pooled_prompt_embeds,
    return_dict=False,
)[0]
Logs
Traceback (most recent call last):
  File "/home/p2p/src/trainer/trainer.py", line 347, in train
    model_pred = self.model.transformer(
  File "/home/p2p/pytorch/mc3/envs/marigold/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/p2p/pytorch/mc3/envs/marigold/lib/python3.10/site-packages/diffusers/src/diffusers/models/transformers/transformer_sd3.py", line 347, in forward
    encoder_hidden_states, hidden_states = block(
  File "/home/p2p/pytorch/mc3/envs/marigold/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/p2p/pytorch/mc3/envs/marigold/lib/python3.10/site-packages/diffusers/src/diffusers/models/attention.py", line 208, in forward
    attn_output, context_attn_output = self.attn(
ValueError: not enough values to unpack (expected 2, got 1)
System Info
- 🤗 Diffusers version: 0.32.0.dev0
- Platform: Linux-6.1.0-26-amd64-x86_64-with-glibc2.36
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.26.2
- Transformers version: 4.46.1
- Accelerate version: 0.34.2
- PEFT version: 0.13.2
- Bitsandbytes version: not installed
- Safetensors version: 0.4.3
- xFormers version: 0.0.21
- Accelerator: NVIDIA RTX A6000, 49140 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?: