Pull requests: Dao-AILab/flash-attention
Open pull requests:
[Fix] Use CUDA 12.4 in the build system so that the current build system stops failing (#1326, opened Nov 10, 2024 by EthanSteinberg)
Promote wheels as alternative to pip install flash-attn (#1297, opened Oct 25, 2024 by simonw)
fix: in newer versions of triton, tl.dot should take as input only q … (#1288, opened Oct 21, 2024 by EdouardYvinec)
test_flash_attn.py is actually in the parent directory (#1167, opened Aug 21, 2024 by ArtificialZeng)
Add support for qk hidden dim different from v hidden dim (#1166, opened Aug 20, 2024 by smallscientist1)
Fix: bwd may need to first allocate cuda mem for rng_state (#1077, opened Jul 20, 2024 by jundaf2)
[Draft] support qk head_dim different from vo head_dim (#980, opened Jun 6, 2024 by defei-coder)
Add local version identifier to package metadata for pre-built wheels (#856, opened Feb 28, 2024 by yundai424)
Animations for Flash Attention, Flash Attention 2, and Standard Attention (#736, opened Dec 24, 2023 by LuisAVasquez)
feat(attention): add Bi-Directional MLM attention model (#721, opened Dec 12, 2023 by TamirFriedman-RecoLabs, draft)