Enable SSE2 intrinsics code and Elbrus CPU support #2

Open
makise-homura wants to merge 2 commits into lscsoft:master from makise-homura:e2k

@makise-homura
I can't open pull requests against the main LIGO git repository (https://git.ligo.org/lscsoft/lalsuite), so I'm opening one here. It can certainly be pushed to the main repository, though.

The GCT app contains SSE code written in x86_64 assembler, but vector extensions are also available on some other CPUs, such as the Elbrus-8S and Elbrus-8SV (https://en.wikipedia.org/wiki/Elbrus-8S). So I added an implementation based on compiler intrinsics, plus some specific targets for the eah_HierarchSearchGCT binary that make this implementation buildable without modifying the Makefile. This lets the E@H application be built and run efficiently on platforms that do not support x86_64 assembler, especially Elbrus, where I achieved roughly a 3x performance increase.
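To illustrate the general idea of replacing assembler with compiler intrinsics, here is a minimal sketch. It is not taken from this PR: the function name `add_arrays` and the choice of double-precision addition are assumptions made purely for illustration. It relies on the compiler providing `<emmintrin.h>` and defining `__SSE2__`, and falls back to plain C otherwise.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_add_pd, ... */
#endif

/* Hypothetical helper (not from the PR): out[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(const double *a, const double *b, double *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Process two doubles per iteration using unaligned loads and stores. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif

    /* Scalar loop: handles the remaining tail, or the whole array
       when SSE2 intrinsics are not available. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Because the vector operations are expressed as C function-like intrinsics and not as x86_64 instructions, the same source can in principle be compiled by any compiler that implements the SSE2 intrinsics header for its target.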

Some CPUs other than x86 and x86_64 also have vector extensions.
Optimized calculations on them can be supported through compiler
intrinsics, so this commit adds an intrinsics-based implementation.
This implementation is also explicitly forced for Elbrus (E2K) processors.
eah_HierarchSearchGCT_SSE2: Force SSE2 support; the implementation
(intrinsics / x86_64 assembler) is chosen automatically.
eah_HierarchSearchGCT_SSE2_NOOPT: Force generic C code + ivdep for SSE2.
eah_HierarchSearchGCT_SSE2_INTRIN: Force SSE2 code based on intrinsics.
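
The three targets above amount to a compile-time choice between implementations. A rough sketch of how such a choice could be expressed with the preprocessor follows. The macro names `EXAMPLE_FORCE_NOOPT` and `EXAMPLE_FORCE_INTRIN` and the function `example_impl_name` are hypothetical and do not come from this PR, and the use of `__e2k__` as the Elbrus predefined macro is an assumption.

```c
#include <stddef.h>

#define IMPL_GENERIC 0  /* plain C loops */
#define IMPL_INTRIN  1  /* SSE2 compiler intrinsics */
#define IMPL_ASM     2  /* x86_64 inline assembler */

/* Hypothetical selection logic: explicit overrides first, then auto-detection. */
#if defined(EXAMPLE_FORCE_NOOPT)
#  define EXAMPLE_IMPL IMPL_GENERIC
#elif defined(EXAMPLE_FORCE_INTRIN) || defined(__e2k__)
   /* Intrinsics are forced on request, and (by assumption) on Elbrus. */
#  define EXAMPLE_IMPL IMPL_INTRIN
#elif defined(__x86_64__) && defined(__GNUC__)
#  define EXAMPLE_IMPL IMPL_ASM
#elif defined(__SSE2__)
#  define EXAMPLE_IMPL IMPL_INTRIN
#else
#  define EXAMPLE_IMPL IMPL_GENERIC
#endif

/* Returns a human-readable name for the implementation chosen at compile time. */
const char *example_impl_name(void)
{
#if EXAMPLE_IMPL == IMPL_ASM
    return "x86_64 assembler";
#elif EXAMPLE_IMPL == IMPL_INTRIN
    return "intrinsics";
#else
    return "generic";
#endif
}
```

In this sketch a build target would only need to pass `-DEXAMPLE_FORCE_INTRIN` or `-DEXAMPLE_FORCE_NOOPT` to override the automatic choice; the PR's actual macro names and detection order may differ.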