Issue
The official RunPod template (y5cejece4j, image runpod/parameter-golf:latest) does not include flash_attn_interface (Flash Attention 3). Attempting to import it fails:

```python
>>> from flash_attn_interface import flash_attn_func
ModuleNotFoundError: No module named 'flash_attn_interface'
```

The standard `flash_attn` package (FA2) is also missing.
Impact
Multiple top submissions (#198, #254, #265) use FA3 with a bare import and no try/except fallback:

```python
from flash_attn_interface import flash_attn_func as flash_attn_3_func
```

These scripts would crash on the official RunPod template. This creates confusion about the evaluation environment:
- The README says "All Python dependencies are already pre-installed in the image" and "Evaluation will be in the RunPod environment with all packages installed"
- But FA3 is not installed, even though it is the attention backend used by every top submission
Environment
- Image: runpod/parameter-golf:latest
- Template: y5cejece4j
- PyTorch: 2.9.1+cu128
- CUDA: 12.8
Verified by inspecting Docker Hub layer history — no flash-attn install step exists.
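The same check can be reproduced from inside a running container without triggering an ImportError; this is a minimal sketch using only the standard library:

```python
import importlib.util

# Report whether each flash-attention package is importable in this
# environment, without actually importing it.
for mod in ("flash_attn", "flash_attn_interface"):
    spec = importlib.util.find_spec(mod)
    print(f"{mod}: {'installed' if spec else 'MISSING'}")
```

On the template described above, both lines should print `MISSING`.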
Questions
- Does the official evaluation environment differ from the RunPod template?
- Should flash_attn_interface be added to the image?
- Should submissions include a fallback to F.scaled_dot_product_attention for compatibility?
Workaround
For participants: use a try/except fallback to PyTorch SDPA:
```python
import torch.nn.functional as F

try:
    from flash_attn_interface import flash_attn_func as flash_attn_3_func
except ImportError:
    def flash_attn_3_func(q, k, v, causal=True):
        # FA3 takes (batch, seqlen, heads, headdim); SDPA expects
        # (batch, heads, seqlen, headdim), so transpose in and out.
        q, k, v = (x.transpose(1, 2) for x in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v, is_causal=causal, enable_gqa=True)
        return out.transpose(1, 2)
```

This works on PyTorch 2.6+ but is ~10% slower than FA3 on H100.
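A quick sanity check of the fallback's tensor layout, with arbitrary shapes and the `enable_gqa` flag omitted so it also runs on PyTorch < 2.5 (`sdpa_fallback` is just a local copy of the fallback above):

```python
import torch
import torch.nn.functional as F

def sdpa_fallback(q, k, v, causal=True):
    # Transpose to SDPA's (batch, heads, seqlen, headdim) layout and back.
    q, k, v = (x.transpose(1, 2) for x in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, is_causal=causal)
    return out.transpose(1, 2)

# FA3-style input layout: (batch, seqlen, heads, headdim)
q = k = v = torch.randn(2, 128, 8, 64)
out = sdpa_fallback(q, k, v)
assert out.shape == (2, 128, 8, 64)
```

The output shape matches the input layout, so a submission can swap the fallback in without touching any surrounding code.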