Issue
The official RunPod template (y5cejece4j, image runpod/parameter-golf:latest) does not include flash_attn_interface (Flash Attention 3). Attempting to import it fails:

```python
>>> from flash_attn_interface import flash_attn_func
ModuleNotFoundError: No module named 'flash_attn_interface'
```

The standard `flash_attn` package (FA2) is also missing.
Impact
Multiple top submissions (#198, #254, #265) use FA3 with a bare import and no try/except fallback:

```python
from flash_attn_interface import flash_attn_func as flash_attn_3_func
```

These scripts would crash on the official RunPod template. This creates confusion about the evaluation environment:
- The README says "All Python dependencies are already pre-installed in the image" and "Evaluation will be in the RunPod environment with all packages installed"
- But FA3 is not installed, even though it is the attention backend used by every top submission
Environment
- Image: runpod/parameter-golf:latest
- Template: y5cejece4j
- PyTorch: 2.9.1+cu128
- CUDA: 12.8
Verified by inspecting Docker Hub layer history — no flash-attn install step exists.
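The same check can be reproduced from inside a running container without triggering an ImportError; this is a minimal sketch using only the standard library:

```python
import importlib.util

# Report whether each flash-attention package is importable in this
# environment, without actually importing it.
for mod in ("flash_attn", "flash_attn_interface"):
    spec = importlib.util.find_spec(mod)
    print(f"{mod}: {'installed' if spec else 'MISSING'}")
```

On the template described above, both lines should print `MISSING`.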
Questions
- Does the official evaluation environment differ from the RunPod template?
- Should flash_attn_interface be added to the image?
- Should submissions include a fallback to F.scaled_dot_product_attention for compatibility?
Workaround
For participants: use a try/except fallback to PyTorch SDPA:
```python
import torch.nn.functional as F

try:
    from flash_attn_interface import flash_attn_func as flash_attn_3_func
except ImportError:
    def flash_attn_3_func(q, k, v, causal=True):
        # FA3 takes (batch, seqlen, heads, headdim); SDPA expects
        # (batch, heads, seqlen, headdim), so transpose in and out.
        q, k, v = (x.transpose(1, 2) for x in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v, is_causal=causal, enable_gqa=True)
        return out.transpose(1, 2)
```

This works on PyTorch 2.6+ but is ~10% slower than FA3 on H100.
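A quick sanity check of the fallback's tensor layout, with arbitrary shapes and the `enable_gqa` flag omitted so it also runs on PyTorch < 2.5 (`sdpa_fallback` is just a local copy of the fallback above):

```python
import torch
import torch.nn.functional as F

def sdpa_fallback(q, k, v, causal=True):
    # Transpose to SDPA's (batch, heads, seqlen, headdim) layout and back.
    q, k, v = (x.transpose(1, 2) for x in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, is_causal=causal)
    return out.transpose(1, 2)

# FA3-style input layout: (batch, seqlen, heads, headdim)
q = k = v = torch.randn(2, 128, 8, 64)
out = sdpa_fallback(q, k, v)
assert out.shape == (2, 128, 8, 64)
```

The output shape matches the input layout, so a submission can swap the fallback in without touching any surrounding code.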