Fewer MulAddMul branches in Diagonal-triangular mul (#1272)
This reduces TTFX (time to first X, here the latency of the first call including compilation) in `Diagonal`-triangular multiplications.
```julia
julia> using Random, LinearAlgebra
julia> A = rand(4,4); U = UpperTriangular(A); D = Diagonal(A);
julia> @time D * U;
0.131110 seconds (239.39 k allocations: 12.162 MiB, 99.95% compilation time) # master
0.102569 seconds (227.44 k allocations: 11.472 MiB, 99.94% compilation time) # this PR
```
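Since nearly all of the reported time is compilation, it can help to time the same call twice in a fresh session and compare. The following is only a minimal sketch of that idea, not the measurement procedure used for the numbers above; the variable names `t_first` and `t_second` are illustrative.

```julia
using LinearAlgebra

A = rand(4, 4)
U = UpperTriangular(A)
D = Diagonal(A)   # uses the diagonal of A

# In a fresh session the first call pays for compiling the method;
# the second call reuses the compiled code.
t_first  = @elapsed D * U
t_second = @elapsed D * U

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The gap between the two timings is a rough proxy for the compilation cost that this change targets.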
If the `Diagonal` is on the right, the TTFX is almost identical, but
allocations go down slightly.
```julia
julia> @time U * D;
0.125025 seconds (221.82 k allocations: 11.252 MiB, 99.95% compilation time) # master
0.127002 seconds (215.59 k allocations: 10.938 MiB, 12.06% gc time, 99.95% compilation time) # this PR
```
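As a quick sanity check that both orderings still give the expected results, one could compare them against dense multiplication. This is a sketch under the same setup as above, not a test taken from the PR; `DU` and `UD` are illustrative names.

```julia
using LinearAlgebra

A = rand(4, 4)
U = UpperTriangular(A)
D = Diagonal(A)

DU = D * U   # Diagonal on the left scales the rows of U
UD = U * D   # Diagonal on the right scales the columns of U

# Compare against plain dense multiplication.
@assert Matrix(DU) ≈ Matrix(D) * Matrix(U)
@assert Matrix(UD) ≈ Matrix(U) * Matrix(D)

# Scaling rows or columns of an upper-triangular matrix keeps it upper triangular.
@assert istriu(DU) && istriu(UD)
```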