
Conversation

@k50112113 commented Sep 29, 2025

Previously, VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE would be disabled whenever VLLM_ROCM_USE_AITER_MHA was turned on.

This PR allows VLLM_ROCM_USE_AITER_MHA and VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE to be enabled at the same time.

This change affects Llama and GPT-OSS.
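
For context, here is a minimal sketch of how both flags might be enabled together when serving a model with vLLM. The model name, sampling settings, and the use of the umbrella VLLM_ROCM_USE_AITER flag are illustrative assumptions, not part of this PR:

```python
import os

# Umbrella ROCm AITER switch; typically needs to be on for the
# AITER-specific flags below to take effect (assumption, not from this PR).
os.environ["VLLM_ROCM_USE_AITER"] = "1"

# With this PR, both flags can be enabled simultaneously.
# Previously the fused-RoPE flag was forced off when AITER MHA was on.
os.environ["VLLM_ROCM_USE_AITER_MHA"] = "1"
os.environ["VLLM_ROCM_USE_AITER_TRITON_FUSED_ROPE_ZEROS_KV_CACHE"] = "1"

# Import vLLM after setting the environment so the flags are picked up.
from vllm import LLM, SamplingParams

# Hypothetical model choice; the PR notes Llama and GPT-OSS are affected.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```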

@dllehr-amd (Collaborator) left a comment


Approved!

@dllehr-amd merged commit 2b4cb8a into 355_wip on Oct 22, 2025
3 of 5 checks passed
