Pull requests: Dao-AILab/flash-attention
Open pull requests:
[Fix] Use CUDA 12.4 in the build system so that the current build system stops failing (#1326, opened Nov 10, 2024 by EthanSteinberg)
Promote wheels as alternative to pip install flash-attn (#1297, opened Oct 25, 2024 by simonw)
fix: in newer versions of triton, tl.dot should take as input only q … (#1288, opened Oct 21, 2024 by EdouardYvinec)
test_flash_attn.py is actually in the parent directory (#1167, opened Aug 21, 2024 by ArtificialZeng)
Add support for qk hidden dim different from v hidden dim (#1166, opened Aug 20, 2024 by smallscientist1)
Fix: bwd may need to first allocate cuda mem for rng_state (#1077, opened Jul 20, 2024 by jundaf2)
[Draft] support qk head_dim different from vo head_dim (#980, opened Jun 6, 2024 by defei-coder)
Add local version identifier to package metadata for pre-built wheels (#856, opened Feb 28, 2024 by yundai424)
Animations for Flash Attention, Flash Attention 2, and Standard Attention (#736, opened Dec 24, 2023 by LuisAVasquez)
feat(attention): add Bi-Directional MLM attention model (#721, opened Dec 12, 2023 by TamirFriedman-RecoLabs, draft)