Additional intrinsic optimizations? #214
Description
At the moment there's the `llvm_intrinsically_optimized!` macro which, when using the unstable flag, will call an unstable LLVM intrinsic.
However, there are some opportunities for using intrinsics (edit: hardware intrinsics) in stable, and even in core, if we wanted to reach for SSE / SSE2 / etc. when available (detected at compile time).
For example, `libm` defines sqrt with a full software implementation, but if people call it in `std` they get the `sqrtss` instruction, either with some indirection in between (in debug) or without any indirection (in release). Based on this, I think it would be fine to have `libm` also just use the `sqrtss` instruction when available.
Of course this should probably be behind its own feature flag, but I think it would be a reasonable progression to develop in this direction of using stable hardware intrinsics when possible.
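
To make the idea concrete, here is a rough sketch of what compile-time selection could look like for the `f32` case. It is not how `libm` is written today: the function name `sqrtf_hw` and the `software_sqrtf` fallback are hypothetical names for illustration, and the fallback simply calls `std` as a stand-in for the existing software implementation. Only the `core::arch::x86_64` intrinsics and the `cfg(target_feature = "sse")` check are existing, stable Rust.

```rust
/// Hardware path: compiled only when SSE is known to be available at
/// compile time (it is part of the x86_64 baseline).
#[cfg(all(target_arch = "x86_64", target_feature = "sse"))]
pub fn sqrtf_hw(x: f32) -> f32 {
    use core::arch::x86_64::{_mm_cvtss_f32, _mm_set_ss, _mm_sqrt_ss};
    // SAFETY: the cfg above guarantees the target supports SSE.
    // `_mm_sqrt_ss` maps to the `sqrtss` instruction.
    unsafe { _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x))) }
}

/// Fallback path for every other target.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse")))]
pub fn sqrtf_hw(x: f32) -> f32 {
    software_sqrtf(x)
}

/// Hypothetical stand-in for the existing software implementation.
/// It calls `std` only to keep this sketch self-contained.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse")))]
fn software_sqrtf(x: f32) -> f32 {
    x.sqrt()
}
```

In a real change, the hardware path would presumably also be gated behind the feature flag mentioned above, for example by adding a `feature = "..."` condition to the `cfg`.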