Add Aiter Attention Backend Support on AMD GPUs #10511
Conversation
@Kosinkadink @comfyanonymous Hi, can you help review this PR? It would make this backend widely available on AMD GPUs, thanks very much!
I gave this a quick test and the integration doesn't seem to break anything, but unfortunately I wasn't able to get the aiter library working on my system, so I can't say how useful this is; its JIT compilation failed with errors that seemed to involve assembly code (maybe it requires a newer CPU than I have, or it doesn't support the 7900 XTX? I don't know). If this is merged, a bit of documentation about what it is and what kinds of setups are supported wouldn't hurt. EDIT: it looks like consumer GPUs are just not supported, so this is a pretty niche thing.
comfy/ldm/modules/attention.py
Outdated
```python
    AITER_ATTENTION_IS_AVAILABLE = True
except ImportError:
    if model_management.aiter_attention_enabled():
        logging.error(f"\n\nTo use the `--use-aiter-attention` feature, the `aiter` package must be installed first.\ncommand:\n\t{sys.executable} -m pip install aiter")
```
"pip install aiter" doesn't install the right aiter.
Thanks, I have changed it. Since AITER doesn't provide a wheel package yet, we now point users to the AITER repo instead; hopefully they can install it smoothly!
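For reference, a from-source install would presumably look something like the following (the exact steps are an assumption; the AITER repo's README is authoritative):

```sh
# Assumed steps; consult the AITER README for the authoritative instructions.
git clone --recursive https://github.com/ROCm/aiter.git
cd aiter
pip install .
```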
Getting this when trying the neta yume workflow in the ComfyUI templates:
comfy/ldm/modules/attention.py
Outdated
```python
    return out


def aiter_flash_attn_wrapper(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
```
why are you doing this? Just put aiter.flash_attn_func in the attention_aiter function directly.
> why are you doing this? Just put aiter.flash_attn_func in the attention_aiter function directly.

DONE
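The direct call presumably ends up looking roughly like this (a sketch under the assumption that `aiter.flash_attn_func` follows the flash-attn convention of `(batch, seq_len, n_heads, head_dim)` inputs; the reshape logic is illustrative, not ComfyUI's actual code):

```python
import torch
import aiter  # assumed importable; see the AITER repo


def attention_aiter(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                    heads: int) -> torch.Tensor:
    # Sketch only: assumes aiter.flash_attn_func mirrors the flash-attn
    # convention and takes (batch, seq_len, n_heads, head_dim) tensors.
    b, _, dim = q.shape
    dim_head = dim // heads
    q, k, v = (t.view(t.shape[0], -1, heads, dim_head) for t in (q, k, v))
    out = aiter.flash_attn_func(q, k, v)
    return out.reshape(b, -1, heads * dim_head)
```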
Yes, AITER is mainly for server GPUs like the MI300X/MI355, but I think consumer GPUs will be considered in the near future. I will clarify these requirements in code comments; thanks for trying it!
Force-pushed from 788c6b0 to e00688e

Overview
This PR adds support for the AITER attention backend to ComfyUI, providing an alternative high-performance attention implementation that can significantly improve inference speed on AMD GPUs (such as the MI300X and MI355).
Usage
To enable AITER attention, start ComfyUI with the `--use-aiter-attention` flag:
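For example, from a ComfyUI checkout (assuming the standard `main.py` entry point):

```sh
python main.py --use-aiter-attention
```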
Performance Improvements
Tested on the Qwen-Image model (KSampler, the main time-consuming part):
- Before: 1.27 iter/s
- After: 1.48 iter/s
- Speedup: ~16.5% improvement