Enable SSE2 intrinsics code and Elbrus CPU support #2
Open
makise-homura wants to merge 2 commits into lscsoft:master
Vector extensions exist on CPUs other than x86 and x86_64. On those CPUs, optimized calculations can be supported through compiler intrinsics, so this change adds an intrinsics-based implementation. It is also explicitly forced for Elbrus (E2K) processors.
eah_HierarchSearchGCT_SSE2: Force SSE2 support; the implementation (intrinsics / x86_64 assembler) is chosen automatically.
eah_HierarchSearchGCT_SSE2_NOOPT: Force generic C code + ivdep for SSE2.
eah_HierarchSearchGCT_SSE2_INTRIN: Force SSE2 code based on intrinsics.
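The automatic choice described for these targets could be expressed at compile time roughly as follows. This is only a sketch: the `EXAMPLE_*` macro names and both helper functions are hypothetical and not taken from the patch, and `__e2k__` is assumed to be the macro the Elbrus compiler predefines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selection macros; the real patch may use different names. */
#if defined(EXAMPLE_FORCE_NOOPT)
#  define EXAMPLE_IMPL "generic-c"          /* like the _NOOPT target */
#elif defined(EXAMPLE_FORCE_INTRIN) || defined(__e2k__)
#  define EXAMPLE_IMPL "sse2-intrinsics"    /* like _INTRIN, forced on Elbrus (assumed macro) */
#elif defined(__x86_64__) && defined(__GNUC__)
#  define EXAMPLE_IMPL "x86_64-assembler"   /* existing hand-written assembler path */
#elif defined(__SSE2__)
#  define EXAMPLE_IMPL "sse2-intrinsics"    /* SSE2 available but no x86_64 assembler */
#else
#  define EXAMPLE_IMPL "generic-c"
#endif

/* Returns the name of the implementation chosen at compile time. */
const char *example_selected_impl(void)
{
    return EXAMPLE_IMPL;
}

/* Returns 1 if the chosen implementation has the given name, else 0. */
int example_impl_is(const char *name)
{
    assert(name != NULL);
    return strcmp(example_selected_impl(), name) == 0;
}
```

Compiling with `-DEXAMPLE_FORCE_INTRIN` would then select the intrinsics path on any platform, which mirrors what a dedicated build target can do without editing the Makefile.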
I can't make pull requests to the main LIGO git (https://git.ligo.org/lscsoft/lalsuite), so I am making the pull request here. It can certainly be pushed to the main repository, though.
In the GCT app, the SSE code is written in x86_64 assembler, but vector extensions are also available on some other CPUs, such as Elbrus-8S and Elbrus-8SV (https://en.wikipedia.org/wiki/Elbrus-8S). So I added an implementation based on compiler intrinsics, plus some specific targets for the eah_HierarchSearchGCT binary so that this implementation can be built without modifying the Makefile. This lets the E@H application be built and run efficiently on platforms that do not support x86_64 assembler, especially Elbrus, where I achieved about a 3x performance increase.
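To illustrate why intrinsics are portable where inline assembler is not, here is a minimal sketch of an SSE2 loop with a plain C fallback. It is not the GCT kernel: the function name `example_add_doubles` and the `EXAMPLE_HAVE_SSE2` macro are hypothetical, and `__e2k__` is assumed to be the macro the Elbrus compiler predefines.

```c
#include <assert.h>
#include <stddef.h>

/* Use SSE2 intrinsics wherever the compiler provides them. */
#if defined(__SSE2__) || defined(__e2k__)   /* __e2k__: assumed Elbrus macro */
#  include <emmintrin.h>
#  define EXAMPLE_HAVE_SSE2 1
#else
#  define EXAMPLE_HAVE_SSE2 0
#endif

/* Computes out[i] += in[i] for every i in [0, n). */
void example_add_doubles(double *out, const double *in, size_t n)
{
    size_t i = 0;
    assert(n == 0 || (out != NULL && in != NULL));
#if EXAMPLE_HAVE_SSE2
    /* Two doubles per 128-bit register; unaligned loads and stores are used
       so the caller does not have to guarantee 16-byte alignment. */
    for (; i + 2 <= n; i += 2) {
        __m128d a = _mm_loadu_pd(out + i);
        __m128d b = _mm_loadu_pd(in + i);
        _mm_storeu_pd(out + i, _mm_add_pd(a, b));
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is unavailable. */
    for (; i < n; i++) {
        out[i] += in[i];
    }
}
```

The same source compiles for any target whose compiler ships `emmintrin.h`, whereas an x86_64 assembler block only builds for that one architecture.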