This new version of diffusers causes problems when doing inference with Flux 1.D and Wan2.2 Text To Image (explanation below) #12878

@ukaprch

Describe the bug

A few days ago I upgraded Torch (CUDA) from v2.7.1 to v2.8.0 and also updated diffusers from v0.35.0 to v0.36.0. Because of the Torch upgrade, and for compatibility with the new diffusers release, I updated these additional components as well:
llama-stack
transformers
huggingface-hub
tokenizers
xformers (for torch 2.8)
triton-windows-3.5 (for torch 2.8)
sageattention-2.2.0 (for torch 2.8)
flash_attn-2.8.2 (for torch 2.8)

Other packages may have been pulled in or updated as dependencies, as is usually the case when upgrading; the version-check sketch below is one way to enumerate what is actually installed.
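
For completeness, here is a minimal sketch (not part of my original notes) that dumps the installed versions of the packages I touched, so the before/after environments can be compared exactly. The distribution names listed are my best guess at the PyPI names and may need adjusting:

```python
# Print installed versions of the packages involved in the upgrade.
# The distribution names below are assumptions (PyPI names) and may differ.
from importlib.metadata import version, PackageNotFoundError

packages = [
    "torch", "diffusers", "transformers", "huggingface_hub", "tokenizers",
    "xformers", "triton-windows", "sageattention", "flash-attn", "optimum-quanto",
]

for name in packages:
    try:
        print(f"{name}=={version(name)}")
    except PackageNotFoundError:
        print(f"{name}: not installed")
```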

I didn't change any of my inference code. I encountered two major problems with the new diffusers library, which may or may not be attributable to diffusers alone:

1. When running Flux 1.D text-to-image, the model/transformer seemed to ignore parts of the prompt. When running Wan 2.2 T2I it was even worse: the model/transformer ignored the entire prompt. So something in the environment changed after the updates.

2. When running Wan 2.2 T2I with Sage attention, I get a black image. I traced this back to the transformer's attention processing. After a few iterations, Sage attention starts producing NaNs in the query tensor, which eventually cause the black image. The K and V tensors don't seem affected, only Q. Strange. (See the NaN-tracing sketch after this list.)
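
To trace where the NaNs first show up, I used something along these lines (a rough sketch; the `.to_q` / `.to_k` / `.to_v` module names are based on the diffusers Wan transformer layout and may differ between versions):

```python
# Register forward hooks on the Q/K/V projection layers of the transformer and
# report any NaNs they produce. The module name suffixes are assumptions.
import torch

def register_nan_hooks(transformer):
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            # to_q / to_k / to_v are plain nn.Linear layers, so `output` is a tensor.
            if torch.isnan(output).any():
                print(f"NaN in {name}: {int(torch.isnan(output).sum())} elements")
        return hook

    for name, module in transformer.named_modules():
        # Only watch the Q/K/V projections inside each attention block.
        if name.endswith((".to_q", ".to_k", ".to_v")):
            handles.append(module.register_forward_hook(make_hook(name)))
    return handles

# Usage: handles = register_nan_hooks(pipe.transformer), run inference, then
# remove the hooks with: for h in handles: h.remove()
```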

As I said, I didn't change any of my inference code, only the environment as described above. Has anyone else experienced this or a similar problem?

Posting code or logs doesn't help here, because they show nothing related to the problem: no error is emitted, I just get a bad or wrong image.

Reproduction

N/A. Note that I use the diffusers library locally on my PC, not Fooocus, A1111, or any other alternative front end.
I should add that, due to VRAM limitations, I'm using Quanto-quantized (INT8) versions of the transformers and T5 encoders for both Flux 1.D and Wan 2.2, and I have had no previous problems doing so with the diffusers library. A sketch of that setup is below.
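
Roughly, the setup looks like the following (illustrative only, not my exact script; the model id, prompt, and parameters are placeholders, and it quantizes/freezes with optimum-quanto in place rather than loading pre-quantized weights, just to keep the example short):

```python
# Illustrative sketch: INT8-quantize the Flux transformer and T5 text encoder with
# optimum-quanto to fit within 24 GB of VRAM, then run a text-to-image generation.
import torch
from diffusers import FluxPipeline
from optimum.quanto import freeze, qint8, quantize

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",   # assumed model id
    torch_dtype=torch.bfloat16,
)

# Weight-only INT8 quantization of the two largest components.
quantize(pipe.transformer, weights=qint8)
freeze(pipe.transformer)
quantize(pipe.text_encoder_2, weights=qint8)   # the T5 encoder
freeze(pipe.text_encoder_2)

pipe.enable_model_cpu_offload()

image = pipe(
    "a red bicycle leaning against a yellow wall, overcast light",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux_test.png")
```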

In the interim, I have reverted to the previous releases, diffusers 0.35.0 and transformers 4.55.4 (keeping the upgraded Torch 2.8), and am not experiencing the problems described here.
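
If anyone wants to reproduce the rollback, pinning the two packages back is what I did (Torch 2.8.0 stays in place):

```
pip install "diffusers==0.35.0" "transformers==4.55.4"
```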

Logs

System Info

System: Windows 10; Python 3.12.5
GPU: RTX 4090
Diffusers: v0.36.0 (current)
Transformers: v5.0.0 (current)
Torch: v2.8.0 (CUDA)
CUDA Toolkit: v12.8

Who can help?

No response
