
Suggestion for a low-effort way to take advantage of SIMD and other architecture specific tricks LLVM knows about #42432

Closed
@pedrocr


Issue #27731 already tracks the fine work being done to expose SIMD in ways that are explicit to the programmer. If you're able to write your code in those specific ways, big gains can be obtained. However, there is something simpler that can be done to performance-sensitive code first and that sometimes greatly improves its speed: just tell LLVM to take advantage of those instructions. The speedup costs nothing in developer time and can be quite large. I extracted a simple benchmark from one of the computationally expensive functions in rawloader, which multiplies camera RGB values by a matrix to get XYZ:

https://github.com/pedrocr/rustc-math-bench
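For readers who don't want to open the repository, the kind of loop being benchmarked can be sketched roughly as below. This is a minimal illustration, not the code from rustc-math-bench: the function name `rgb_to_xyz`, the 3x3 matrix shape, and the interleaved `f32` buffer layout are all assumptions made for the example.

```rust
// Hypothetical kernel: multiply every interleaved RGB pixel by a 3x3 matrix.
// The name, matrix shape and buffer layout are assumptions for illustration,
// not taken from the benchmark repository.
pub fn rgb_to_xyz(matrix: &[[f32; 3]; 3], input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    // Walk both buffers one pixel (three floats) at a time.
    for (pin, pout) in input.chunks_exact(3).zip(output.chunks_exact_mut(3)) {
        // Each output channel is the dot product of one matrix row with the pixel.
        for (row, out) in matrix.iter().zip(pout.iter_mut()) {
            *out = row[0] * pin[0] + row[1] * pin[1] + row[2] * pin[2];
        }
    }
}

fn main() {
    // With the identity matrix the output must equal the input.
    let identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];
    let input = vec![0.25_f32, 0.5, 0.75, 1.0, 0.0, 0.5];
    let mut output = vec![0.0_f32; input.len()];
    rgb_to_xyz(&identity, &input, &mut output);
    assert_eq!(input, output);
}
```

A plain loop like this has no explicit SIMD in it; whether it gets vectorized, and with which instructions, depends on the target CPU the compiler is told about. With rustc, the counterpart of `-O3 -march=native` is `-C opt-level=3 -C target-cpu=native`.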

I programmed the same multiplication over a 100MP image in both C and Rust. Here are the results. All values are in ms/megapixel, run on an i5-6200U. The runbench script in the repository will compile and run the tests for you with no other interaction.

| Compiler | -O3 | -O3 -march=native |
|---|---|---|
| rustc 1.19.0-nightly (e0cc22b 2017-05-31) | 11.76 | 6.92 (-41%) |
| clang 3.8.0-2ubuntu4 | 13.31 | 5.69 (-57%) |
| gcc 5.4.0 20160609 | 7.77 | 4.70 (-40%) |

So Rust nightly is faster than clang at plain -O3 (but that's probably LLVM 3.8 vs 4.0), and the reduction in runtime from -march=native is quite worthwhile. The problem with doing this, of course, is that the binary is no longer portable to architecture levels lower than mine, and it's not optimized for architecture levels above it either.

My suggestion is to allow the developer to write something like #[makefast] fn(...). Anything annotated like that gets compiled multiple times, once for each architecture level, and at runtime the highest level supported by the machine gets used. Ideally, the call sites would also be patched on program startup (or ELF trickery used) so that the dispatch penalty disappears.
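To make the idea concrete, here is a hand-written sketch of what a `#[makefast]`-style annotation might expand to. `#[makefast]` itself is hypothetical; the sketch assumes the standard library's `#[target_feature]` attribute and `is_x86_feature_detected!` macro on x86, picks AVX2 as the only extra level, and uses made-up function names. It does not patch call sites, so the feature check is paid on every call.

```rust
// Portable baseline. `inline(always)` lets the body be re-compiled inside
// the feature-enabled wrapper below, so LLVM can vectorize it differently.
#[inline(always)]
fn convert_generic(m: &[[f32; 3]; 3], input: &[f32], output: &mut [f32]) {
    for (pin, pout) in input.chunks_exact(3).zip(output.chunks_exact_mut(3)) {
        for (row, out) in m.iter().zip(pout.iter_mut()) {
            *out = row[0] * pin[0] + row[1] * pin[1] + row[2] * pin[2];
        }
    }
}

// Second copy of the same code, compiled with AVX2 allowed.
// Only one extra level is shown; more could be added the same way.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn convert_avx2(m: &[[f32; 3]; 3], input: &[f32], output: &mut [f32]) {
    convert_generic(m, input, output);
}

// Runtime dispatcher: use the AVX2 copy when the CPU supports it,
// otherwise fall back to the portable baseline.
pub fn convert(m: &[[f32; 3]; 3], input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // Safety: AVX2 support was verified at runtime just above.
            return unsafe { convert_avx2(m, input, output) };
        }
    }
    convert_generic(m, input, output);
}

fn main() {
    let m = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 1.0, 1.0]];
    let input = vec![1.0_f32, 2.0, 3.0];
    let mut output = vec![0.0_f32; 3];
    convert(&m, &input, &mut output);
    assert_eq!(output, [2.0, 6.0, 6.0]);
}
```

Writing this by hand for every hot function is exactly the boilerplate the annotation would remove, and resolving the choice once at startup would avoid the per-call branch.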
