
Commit 820d45e

improve docs
1 parent 16b9fe1 commit 820d45e

File tree

1 file changed: +2 −1 lines changed


src/attention.jl

Lines changed: 2 additions & 1 deletion

```diff
@@ -22,7 +22,8 @@ See also [`dot_product_attention_scores`](@ref) if you only need the attention s
 - `value`: Value array of size `(v_dim, kv_len, batch_size...)`.
 - `bias`: Either `nothing` or an array broadcastable to size `(kv_len, q_len, nheads, batch_size)`.
   It will be added to the attention scores before applying the softmax. Default `nothing`.
-- `fdrop`: A dropout function or layer to apply on the attention scores. Default `identity` (no dropout).
+- `fdrop`: A dropout function or layer to be applied on the attention scores right after the softmax.
+  Default `identity` (no dropout).
 - `mask`: Either `nothing` or a boolean array broadcastable to size `(kv_len, q_len, nheads, batch_size)`.
   The mask is applied to the attention scores before the softmax.
   Can also be set to `mask=:causal` to apply a causal mask. Default `nothing`.
```
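The keyword arguments documented in this diff can be exercised with a short call. A minimal sketch, assuming the NNlib package (which this file belongs to) is installed; the array sizes are illustrative, not required values:

```julia
using NNlib

q_len, kv_len, batch = 3, 5, 2
qk_dim, v_dim, nheads = 4, 6, 2   # both dims must be divisible by nheads

q = rand(Float32, qk_dim, q_len, batch)
k = rand(Float32, qk_dim, kv_len, batch)
v = rand(Float32, v_dim, kv_len, batch)

# Optional bias, broadcastable to (kv_len, q_len, nheads, batch);
# it is added to the attention scores before the softmax.
bias = zeros(Float32, kv_len, q_len)

x, α = dot_product_attention(q, k, v, bias; nheads, fdrop=identity, mask=nothing)

@assert size(x) == (v_dim, q_len, batch)           # attention output
@assert size(α) == (kv_len, q_len, nheads, batch)  # attention scores
@assert all(sum(α; dims=1) .≈ 1)                   # softmax normalizes over kv_len
```

Per the docstring, `fdrop` runs on `α` right after the softmax, so passing a dropout layer there randomly zeroes attention weights during training, while the default `identity` leaves them untouched.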
