Arm64: Implement SVE APIs #99957

Closed
kunalspathak opened this issue Mar 19, 2024 · 12 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI arm-sve Work related to arm64 SVE/SVE2 support

@kunalspathak
Member

kunalspathak commented Mar 19, 2024

Now that the encoding of all SVE instructions is complete (#94549), it is time to expose these instructions through .NET APIs. Here is the categorized list of APIs, with links to the issues where they were approved.

.NET 9 Goal: We aim to complete SVE APIs in .NET 9. SVE2 APIs will be pushed out to .NET 10.

SVE APIs

High Priority SVE APIs

Sve mask (Complete)

Full list

Sve bitwise (Complete)

Full list

Sve bitmanipulate (Complete)

Full list

Sve loads (Complete)

Full list

Sve stores (Complete)

Full list

Sve maths (Complete)

Full list

Sve counting (Complete)

Full list

Low Priority SVE APIs

Sve scatterstores (Complete)

Full list

Sve gatherloads (Complete)

Full list

Sve fp (Complete)

Full list

Sve firstfaulting (Complete)

Full list

SVE2 APIs

Full list

Sve2 scatterstores

  • Scatter16BitNarrowing
  • Scatter16BitWithByteOffsetsNarrowing
  • Scatter32BitNarrowing
  • Scatter32BitWithByteOffsetsNarrowing
  • Scatter8BitNarrowing
  • Scatter8BitWithByteOffsetsNarrowing
  • ScatterNonTemporal

Sve2 maths

  • AbsoluteDifferenceAdd
  • AbsoluteDifferenceAddWideningLower
  • AbsoluteDifferenceAddWideningUpper
  • AbsoluteDifferenceWideningLower
  • AbsoluteDifferenceWideningUpper
  • AddCarryWideningLower
  • AddCarryWideningUpper
  • AddHighNarowingLower
  • AddHighNarowingUpper
  • AddPairwise
  • AddPairwiseWidening
  • AddSaturate
  • AddSaturateWithSignedAddend
  • AddSaturateWithUnsignedAddend
  • AddWideLower
  • AddWideUpper
  • AddWideningLower
  • AddWideningLowerUpper
  • AddWideningUpper
  • DotProductComplex
  • HalvingAdd
  • HalvingSubtract
  • HalvingSubtractReversed
  • MaxNumberPairwise
  • MaxPairwise
  • MinNumberPairwise
  • MinPairwise
  • MultiplyAddBySelectedScalar
  • MultiplyAddWideningLower
  • MultiplyAddWideningUpper
  • MultiplyBySelectedScalar
  • MultiplySubtractBySelectedScalar
  • MultiplySubtractWideningLower
  • MultiplySubtractWideningUpper
  • MultiplyWideningLower
  • MultiplyWideningUpper
  • PolynomialMultiply
  • PolynomialMultiplyWideningLower
  • PolynomialMultiplyWideningUpper
  • RoundingAddHighNarowingLower
  • RoundingAddHighNarowingUpper
  • RoundingHalvingAdd
  • RoundingSubtractHighNarowingLower
  • RoundingSubtractHighNarowingUpper
  • SaturatingAbs
  • SaturatingDoublingMultiplyAddWideningLower
  • SaturatingDoublingMultiplyAddWideningLowerUpper
  • SaturatingDoublingMultiplyAddWideningUpper
  • SaturatingDoublingMultiplyHigh
  • SaturatingDoublingMultiplySubtractWideningLower
  • SaturatingDoublingMultiplySubtractWideningLowerUpper
  • SaturatingDoublingMultiplySubtractWideningUpper
  • SaturatingDoublingMultiplyWideningLower
  • SaturatingDoublingMultiplyWideningUpper
  • SaturatingNegate
  • SaturatingRoundingDoublingMultiplyAddHigh
  • SaturatingRoundingDoublingMultiplyHigh
  • SaturatingRoundingDoublingMultiplySubtractHigh
  • SubtractHighNarowingLower
  • SubtractHighNarowingUpper
  • SubtractSaturate
  • SubtractSaturateReversed
  • SubtractWideLower
  • SubtractWideUpper
  • SubtractWideningLower
  • SubtractWideningLowerUpper
  • SubtractWideningUpper
  • SubtractWideningUpperLower
  • SubtractWithBorrowWideningLower
  • SubtractWithBorrowWideningUpper

Sve2 mask

  • CreateWhileGreaterThanMask
  • CreateWhileGreaterThanOrEqualMask
  • CreateWhileReadAfterWriteMask
  • CreateWhileWriteAfterReadMask
  • Match
  • NoMatch
  • SaturatingExtractNarrowingLower
  • SaturatingExtractNarrowingUpper
  • SaturatingExtractUnsignedNarrowingLower
  • SaturatingExtractUnsignedNarrowingUpper

Sve2 gatherloads

  • GatherVectorByteZeroExtendNonTemporal
  • GatherVectorInt16SignExtendNonTemporal
  • GatherVectorInt16WithByteOffsetsSignExtendNonTemporal
  • GatherVectorInt32SignExtendNonTemporal
  • GatherVectorInt32WithByteOffsetsSignExtendNonTemporal
  • GatherVectorNonTemporal
  • GatherVectorSByteSignExtendNonTemporal
  • GatherVectorUInt16WithByteOffsetsZeroExtendNonTemporal
  • GatherVectorUInt16ZeroExtendNonTemporal
  • GatherVectorUInt32WithByteOffsetsZeroExtendNonTemporal
  • GatherVectorUInt32ZeroExtendNonTemporal

Sve2 fp

  • AddRotateComplex
  • DownConvertNarrowingUpper
  • DownConvertRoundingOdd
  • DownConvertRoundingOddUpper
  • Log2
  • MultiplyAddRotateComplex
  • MultiplyAddRotateComplexBySelectedScalar
  • ReciprocalEstimate
  • ReciprocalSqrtEstimate
  • SaturatingComplexAddRotate
  • SaturatingRoundingDoublingComplexMultiplyAddHighRotate
  • UpConvertWideningUpper

Sve2 counting

  • CountMatchingElements
  • CountMatchingElementsIn128BitSegments

Sve2 bitwise

  • BitwiseClearXor
  • BitwiseSelect
  • BitwiseSelectLeftInverted
  • BitwiseSelectRightInverted
  • ShiftArithmeticRounded
  • ShiftArithmeticRoundedSaturate
  • ShiftArithmeticSaturate
  • ShiftLeftAndInsert
  • ShiftLeftLogicalSaturate
  • ShiftLeftLogicalSaturateUnsigned
  • ShiftLeftLogicalWideningEven
  • ShiftLeftLogicalWideningOdd
  • ShiftLogicalRounded
  • ShiftLogicalRoundedSaturate
  • ShiftRightAndInsert
  • ShiftRightArithmeticAdd
  • ShiftRightArithmeticNarrowingSaturateEven
  • ShiftRightArithmeticNarrowingSaturateOdd
  • ShiftRightArithmeticNarrowingSaturateUnsignedEven
  • ShiftRightArithmeticNarrowingSaturateUnsignedOdd
  • ShiftRightArithmeticRounded
  • ShiftRightArithmeticRoundedAdd
  • ShiftRightArithmeticRoundedNarrowingSaturateEven
  • ShiftRightArithmeticRoundedNarrowingSaturateOdd
  • ShiftRightArithmeticRoundedNarrowingSaturateUnsignedEven
  • ShiftRightArithmeticRoundedNarrowingSaturateUnsignedOdd
  • ShiftRightLogicalAdd
  • ShiftRightLogicalNarrowingEven
  • ShiftRightLogicalNarrowingOdd
  • ShiftRightLogicalRounded
  • ShiftRightLogicalRoundedAdd
  • ShiftRightLogicalRoundedNarrowingEven
  • ShiftRightLogicalRoundedNarrowingOdd
  • ShiftRightLogicalRoundedNarrowingSaturateEven
  • ShiftRightLogicalRoundedNarrowingSaturateOdd
  • Xor
  • XorRotateRight

Sve2 bitmanipulate

  • InterleavingXorLowerUpper
  • InterleavingXorUpperLower
  • MoveWideningLower
  • MoveWideningUpper
  • VectorTableLookup
  • VectorTableLookupExtension

SveBf16

  • Bfloat16DotProduct
  • Bfloat16MatrixMultiplyAccumulate
  • Bfloat16MultiplyAddWideningToSinglePrecisionLower
  • Bfloat16MultiplyAddWideningToSinglePrecisionUpper
  • ConcatenateEvenInt128FromTwoInputs
  • ConcatenateOddInt128FromTwoInputs
  • ConditionalExtractAfterLastActiveElement
  • ConditionalExtractAfterLastActiveElementAndReplicate
  • ConditionalExtractLastActiveElement
  • ConditionalExtractLastActiveElementAndReplicate
  • ConditionalSelect
  • ConvertToBFloat16
  • CreateFalseMaskBFloat16
  • CreateTrueMaskBFloat16
  • CreateWhileReadAfterWriteMask
  • CreateWhileWriteAfterReadMask
  • DotProductBySelectedScalar
  • DownConvertNarrowingUpper
  • DuplicateSelectedScalarToVector
  • ExtractAfterLastScalar
  • ExtractAfterLastVector
  • ExtractLastScalar
  • ExtractLastVector
  • ExtractVector
  • GetActiveElementCount
  • InsertIntoShiftedVector
  • InterleaveEvenInt128FromTwoInputs
  • InterleaveInt128FromHighHalvesOfTwoInputs
  • InterleaveInt128FromLowHalvesOfTwoInputs
  • InterleaveOddInt128FromTwoInputs
  • LoadVector
  • LoadVector128AndReplicateToVector
  • LoadVector256AndReplicateToVector
  • LoadVectorFirstFaulting
  • LoadVectorNonFaulting
  • LoadVectorNonTemporal
  • Load2xVector
  • Load3xVector
  • Load4xVector
  • PopCount
  • ReverseElement
  • Splice
  • Store
  • StoreNonTemporal
  • TransposeEven
  • TransposeOdd
  • UnzipEven
  • UnzipOdd
  • VectorTableLookup
  • VectorTableLookupExtension
  • ZipHigh
  • ZipLow

SveF32mm

  • MatrixMultiplyAccumulate

SveF64mm

  • ConcatenateEvenInt128FromTwoInputs
  • ConcatenateOddInt128FromTwoInputs
  • InterleaveEvenInt128FromTwoInputs
  • InterleaveInt128FromHighHalvesOfTwoInputs
  • InterleaveInt128FromLowHalvesOfTwoInputs
  • InterleaveOddInt128FromTwoInputs
  • LoadVector256AndReplicateToVector
  • MatrixMultiplyAccumulate

SveFp16

  • Abs
  • AbsoluteCompareGreaterThan
  • AbsoluteCompareGreaterThanOrEqual
  • AbsoluteCompareLessThan
  • AbsoluteCompareLessThanOrEqual
  • AbsoluteDifference
  • Add
  • AddAcross
  • AddPairwise
  • AddRotateComplex
  • AddSequentialAcross
  • CompareEqual
  • CompareGreaterThan
  • CompareGreaterThanOrEqual
  • CompareLessThan
  • CompareLessThanOrEqual
  • CompareNotEqualTo
  • CompareUnordered
  • ConcatenateEvenInt128FromTwoInputs
  • ConcatenateOddInt128FromTwoInputs
  • ConditionalExtractAfterLastActiveElement
  • ConditionalExtractAfterLastActiveElementAndReplicate
  • ConditionalExtractLastActiveElement
  • ConditionalExtractLastActiveElementAndReplicate
  • ConditionalSelect
  • ConvertToDouble
  • ConvertToHalf
  • ConvertToInt16
  • ConvertToInt32
  • ConvertToInt64
  • ConvertToSingle
  • ConvertToUInt16
  • ConvertToUInt32
  • ConvertToUInt64
  • CreateFalseMaskHalf
  • CreateTrueMaskHalf
  • CreateWhileReadAfterWriteMask
  • CreateWhileWriteAfterReadMask
  • Divide
  • DownConvertNarrowingUpper
  • DuplicateSelectedScalarToVector
  • ExtractAfterLastScalar
  • ExtractAfterLastVector
  • ExtractLastScalar
  • ExtractLastVector
  • ExtractVector
  • FloatingPointExponentialAccelerator
  • FusedMultiplyAdd
  • FusedMultiplyAddBySelectedScalar
  • FusedMultiplyAddNegated
  • FusedMultiplySubtract
  • FusedMultiplySubtractBySelectedScalar
  • FusedMultiplySubtractNegated
  • GetActiveElementCount
  • InsertIntoShiftedVector
  • InterleaveEvenInt128FromTwoInputs
  • InterleaveInt128FromHighHalvesOfTwoInputs
  • InterleaveInt128FromLowHalvesOfTwoInputs
  • InterleaveOddInt128FromTwoInputs
  • LoadVector
  • LoadVector128AndReplicateToVector
  • LoadVector256AndReplicateToVector
  • LoadVectorFirstFaulting
  • LoadVectorNonFaulting
  • LoadVectorNonTemporal
  • LoadVectorx2
  • LoadVectorx3
  • LoadVectorx4
  • Log2
  • Max
  • MaxAcross
  • MaxNumber
  • MaxNumberAcross
  • MaxNumberPairwise
  • MaxPairwise
  • Min
  • MinAcross
  • MinNumber
  • MinNumberAcross
  • MinNumberPairwise
  • MinPairwise
  • Multiply
  • MultiplyAddRotateComplex
  • MultiplyAddRotateComplexBySelectedScalar
  • MultiplyAddWideningLower
  • MultiplyAddWideningUpper
  • MultiplyBySelectedScalar
  • MultiplyExtended
  • MultiplySubtractWideningLower
  • MultiplySubtractWideningUpper
  • Negate
  • PopCount
  • ReciprocalEstimate
  • ReciprocalExponent
  • ReciprocalSqrtEstimate
  • ReciprocalSqrtStep
  • ReciprocalStep
  • ReverseElement
  • RoundAwayFromZero
  • RoundToNearest
  • RoundToNegativeInfinity
  • RoundToPositiveInfinity
  • RoundToZero
  • Scale
  • Splice
  • Sqrt
  • Store
  • StoreNonTemporal
  • Subtract
  • TransposeEven
  • TransposeOdd
  • TrigonometricMultiplyAddCoefficient
  • TrigonometricSelectCoefficient
  • TrigonometricStartingValue
  • UnzipEven
  • UnzipOdd
  • UpConvertWideningUpper
  • VectorTableLookup
  • VectorTableLookupExtension
  • ZipHigh
  • ZipLow

SveI8mm

  • DotProductSignedUnsigned
  • DotProductUnsignedSigned
  • MatrixMultiplyAccumulate
  • MatrixMultiplyAccumulateUnsignedSigned

Sha3

  • BitwiseClearXor
  • BitwiseRotateLeftBy1AndXor
  • Xor
  • XorRotateRight

Sm4

  • Sm4EncryptionAndDecryption
  • Sm4KeyUpdates

SveAes

  • AesInverseMixColumns
  • AesMixColumns
  • AesSingleRoundDecryption
  • AesSingleRoundEncryption
  • PolynomialMultiplyWideningLower
  • PolynomialMultiplyWideningUpper

SveBitperm

  • GatherLowerBitsFromPositionsSelectedByBitmask
  • GroupBitsToRightOrLeftAsSelectedByBitmask
  • ScatterLowerBitsIntoPositionsSelectedByBitmask

SveSha3

  • BitwiseRotateLeftBy1AndXor

SveSm4

  • Sm4EncryptionAndDecryption
  • Sm4KeyUpdates

Credit to @a74nh for populating the list and for providing some files in https://github.com/a74nh/runtime/tree/api_github/sve_api that will help to implement the APIs.

Contributes to #93095

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Mar 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Mar 19, 2024
@a74nh
Contributor

a74nh commented Mar 19, 2024

Recommendations for how to implement the APIs.
Examples can be found in #100134.

API

  • src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Sve.PlatformNotSupported.cs
  • src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Sve.cs
  • src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs

Copy/paste contents from files in https://github.com/a74nh/runtime/tree/api_github/sve_api/out_cs_api/ . There should be no need to edit these changes. Keep alphabetical ordering.

The same files have been given additional annotations and can be found in https://github.com/a74nh/runtime/tree/api_github/sve_api/out_helper_api . These are for development use only and are not for committing.

HW Intrinsics

  • src/coreclr/jit/hwintrinsiclistarm64sve.h

Copy/paste from https://github.com/a74nh/runtime/blob/api_github/sve_api/out_hwintrinsiclistarm64sve.h
Entries that have multiple instructions for a single type will need fixing via a special code path.
The flags and category columns will probably need fixing manually.
Flags that are not automatically detected:

  • HW_Flag_LowMaskedOperation : The predicate in arg1 must be one of the low predicate registers (p0-p7).
  • HW_Flag_HasRMWSemantics : src1 and dest use the same register.
  • HW_Flag_EmbeddedMaskedOperation : APIs that have only a "predicated" version. These APIs are converted into ConditionalSelect(AllTrue, CALL_API(operands...), Zero) to get the effect of "predicate" registers. E.g. Abs, Divide.
  • HW_Flag_OptionalEmbeddedMaskedOperation : APIs that have both "predicated" and "unpredicated" versions. These APIs can be used standalone, in which case the "unpredicated" version of the instruction is generated. They can also be wrapped in ConditionalSelect in user code, in which case the "predicated" version of the instruction is emitted. E.g. Add, Multiply, etc.
  • HW_Flag_ExplicitMaskedOperation : These APIs take "mask" explicitly as the first argument. E.g. ConditionalSelect
  • HW_Flag_Scalable : All APIs have this flag to identify that they operate on scalable vector length.
  • Any other restrictions on the register number.
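To make the two embedded-mask flags more concrete, here is a minimal Python sketch of the lane-wise semantics only. It is an illustration and not JIT code: the function names and the list-based "vectors" are hypothetical stand-ins, and it assumes ConditionalSelect takes the left lane where the mask lane is nonzero and the right lane otherwise.

```python
def conditional_select(mask, left, right):
    """Per lane: take `left` where the mask lane is nonzero, else `right`."""
    if not (len(mask) == len(left) == len(right)):
        raise ValueError("all vectors must have the same lane count")
    return [l if m else r for m, l, r in zip(mask, left, right)]


def embedded_masked_abs(values):
    """Models ConditionalSelect(AllTrue, Abs(values), Zero).

    This mirrors the wrapping described for HW_Flag_EmbeddedMaskedOperation:
    every lane is active, so the result is simply Abs of every lane.
    """
    all_true = [1] * len(values)
    zero = [0] * len(values)
    return conditional_select(all_true, [abs(v) for v in values], zero)


def user_masked_add(mask, a, b, fallback):
    """Models user code wrapping Add in ConditionalSelect.

    This is the HW_Flag_OptionalEmbeddedMaskedOperation case: inactive
    lanes come from `fallback` instead of the sum.
    """
    return conditional_select(mask, [x + y for x, y in zip(a, b)], fallback)
```

With an all-true mask the wrapper is a no-op on the values, which is why the "predicated only" APIs can still be exposed without a mask argument.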

For any special case where there is no flag, you have options:

  1. Add a new flag. Add code in hwintrinsics to use the flag. There is limited space for new flags, so only do this where there are many instructions that would require it.
  2. If changes need making in codegen, then mark as HW_Flag_SpecialCodeGen and add a new case to CodeGen::genHWIntrinsic().
  3. If changes need making at the import stage then mark as HW_Category_Special and add a new case to Compiler::impSpecialIntrinsic()
  4. Mark as both HW_Flag_SpecialCodeGen and HW_Category_Special

Testing

  • src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_Arm.cs

Copy/paste from https://raw.githubusercontent.com/a74nh/runtime/api_github/sve_api/out_GenerateHWIntrinsicTests_Arm.cs
Rename the template (first column) to a more generic template. We want as few new templates as possible. Existing AdvSimd templates can be copied and then edited to include extra Sve parts.
The ValidateIterResult and NextValueOpN entries will need editing to fit the template.
Use existing entries as a guide.

Linux

Tests can be built using:

rm -fr ./artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/
./src/tests/build.sh checked -test:JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro.csproj

Tests can then be run:

./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.sh

Generated C# files are in artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/

A lot of tests will be run. To make life easier, run the .dll directly and pass it the name of the test (a substring will do). E.g.:

$CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
$CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve

Windows

Tests can be built using:

del /F /S /Q repo\artifacts\tests\coreclr\obj\windows.arm64.Release\Managed\JIT\HardwareIntrinsics\Arm\Sve\
pushd repo\src\tests\
build.cmd Release -test JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_r.csproj /p:TargetArchitecture=arm64
build.cmd Release -test JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro.csproj /p:TargetArchitecture=arm64

Tests can then be run:

pushd repo\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_r
HardwareIntrinsics_Arm_r.cmd

Generated C# files are in artifacts\tests\coreclr\obj\windows.arm64.Release\Managed\JIT\HardwareIntrinsics\Arm\Sve\Sve_ro\Sve_ro\gen\

A lot of tests will be run. To make life easier, run the .dll directly and pass it the name of the test (a substring will do). E.g.:

$CORE_ROOT\corerun .\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro\HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
$CORE_ROOT\corerun .\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro\HardwareIntrinsics_Arm_ro.dll Sve

Altjit

All the testing works as usual using the AltJit* environment variables. The only thing to remember is to also set the environment variable DOTNET_MaxVectorTBitWidth=128, which avoids hitting the assert assert(size == info.compCompHnd->getClassSize(typeHnd));

Stress testing

All the tests should be run using all the various stress modes.
https://github.com/a74nh/runtime/blob/api_github/sve_api/stress_tester.py is used to run your test in the various modes. Pass it the full command line for running your test. Eg:

stress_tester.py $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
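For readers who want a feel for what "running a test in various modes" involves, here is a rough Python sketch. It is not the contents of stress_tester.py: the mode table, the function names and the choice of DOTNET_* settings are assumptions made for illustration, and the real script may use different modes and options.

```python
import os
import subprocess

# Hypothetical mode table. These DOTNET_* settings are examples only and are
# not taken from stress_tester.py.
EXAMPLE_MODES = {
    "baseline": {},
    "jitstress2": {"DOTNET_JitStress": "2"},
    "no_tiering": {"DOTNET_TieredCompilation": "0"},
}


def build_env(base_env, overrides):
    """Return a copy of base_env with the mode's variables applied."""
    env = dict(base_env)
    env.update(overrides)
    return env


def run_in_modes(command, modes=EXAMPLE_MODES, runner=subprocess.run, base_env=None):
    """Run `command` once per mode and return the names of the modes that failed.

    `runner` defaults to subprocess.run and can be swapped for a fake in tests.
    """
    base = dict(os.environ) if base_env is None else base_env
    failures = []
    for name, overrides in modes.items():
        result = runner(command, env=build_env(base, overrides))
        if result.returncode != 0:
            failures.append(name)
    return failures
```

Under these assumptions, `run_in_modes(["corerun", "HardwareIntrinsics_Arm_ro.dll", "Sve_Add_uint"])` would run the same test once per mode and report which modes returned a non-zero exit code.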

Writing Tests

  • Once the .cs files have been created, you can edit them manually and rebuild. Copy the changes back to the template once the test works. This can save time fiddling with template params.
  • Where possible, we want to avoid calling other API calls within a test. This stops dependencies building up in the tests. For bonus points, write your test functions once without additional API calls and once with.
  • When an API call uses a mask (either as input or as return value), the type of that mask is a Vector<T>. This means you can treat it like a normal vector and iterate through it, set values etc. A mask should only contain the values 0 or 1. Within the jit it will be converted to/from a vector of boolean values so that they can be placed in the SVE predicate registers (p0 to p15).
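As a small illustration of the mask convention in the last point, the Python sketch below models a mask as a plain list of lanes and converts it to and from a list of booleans. The function names are hypothetical and the lists only stand in for Vector<T> and a predicate register; the actual conversion inside the jit is not reproduced here.

```python
def is_valid_mask(mask):
    """A mask vector should only contain the values 0 or 1."""
    return all(lane in (0, 1) for lane in mask)


def mask_to_predicate(mask):
    """Convert a 0/1 mask vector into per-lane booleans."""
    if not is_valid_mask(mask):
        raise ValueError("mask lanes must be 0 or 1")
    return [lane == 1 for lane in mask]


def predicate_to_mask(predicate):
    """Convert per-lane booleans back into a 0/1 mask vector."""
    return [1 if active else 0 for active in predicate]
```

A test helper along these lines can check a returned mask lane by lane without calling any other Sve API, which fits the advice above about keeping tests free of extra API dependencies.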

@a74nh
Contributor

a74nh commented Mar 19, 2024

For choosing APIs.

  • For now, only pick APIs that do not have an embedded mask. An API has an embedded mask when the Arm instruction takes a predicate register as arg2 but the mask is not exposed at the API level (for example, most of the Sve maths methods).
    • Support for embedded masks is ongoing.
    • The helper API files indicate which methods have embedded masks with the label "Embedded arg1 mask predicate". Alternatively, see the flag HW_Flag_EmbeddedMaskedOperation in out_hwintrinsiclistarm64sve.h
  • Sve is highest priority, then Sve2, then all of the smaller extensions.
    • I recommend starting with the loads and stores

@kunalspathak kunalspathak added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI User Story A single user-facing feature. Can be grouped under an epic. labels Mar 19, 2024
@vcsjones vcsjones removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Mar 20, 2024
@JulieLeeMSFT JulieLeeMSFT removed the User Story A single user-facing feature. Can be grouped under an epic. label Mar 22, 2024
@JulieLeeMSFT JulieLeeMSFT added this to the 9.0.0 milestone Mar 22, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Mar 22, 2024
@a74nh
Contributor

a74nh commented Mar 25, 2024

As the testing grows, it will become increasingly difficult to test just a single API. This is OK during CI, but painful during development and bug fixing.

I recommend someone writes a patch so that a testname can be passed in as an argument so that only that test will run. Eg:
HardwareIntrinsics_Arm_ro.sh Sve.Add.uint

@kunalspathak
Member Author

I recommend someone writes a patch so that a testname can be passed in as an argument so that only that test will run. Eg: HardwareIntrinsics_Arm_ro.sh Sve.Add.uint

I agree. I have asked @TIHan to come up with a design for this. @TIHan - any update on this?

@TIHan
Contributor

TIHan commented Mar 25, 2024

I have not looked at this yet, but can this week.

@tannergooding
Member

Just noting that such support should already exist if you invoke the underlying dll directly; this may just be something missing from the .sh file.

The exact argument that matches a filter may be a bit different due to it now using the underlying xunit filtering mechanic, but it should largely just work.


https://github.com/dotnet/runtime/blob/main/docs/workflow/testing/coreclr/testing.md#running-individual-tests

You can then see some of the logic that gets setup via https://github.com/dotnet/runtime/blob/main/src/tests/Common/XUnitWrapperGenerator/XUnitWrapperGenerator.cs and the corresponding logic of how the test filtering works here: https://github.com/dotnet/runtime/blob/main/src/tests/Common/XUnitWrapperLibrary/TestFilter.cs

The actual filter is constructed like:

System.Collections.Generic.Dictionary<string, string> testExclusionTable = XUnitWrapperLibrary.TestFilter.LoadTestExclusionTable();
XUnitWrapperLibrary.TestFilter filter = new (args, testExclusionTable);

A given TestExecutor then uses it like:

void TestExecutor1(System.IO.StreamWriter tempLogSw, System.IO.StreamWriter statsCsvSw)
{
    if (filter is null || filter.ShouldRunTest(@"JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble", "_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()"))
    {
        System.TimeSpan testStart = stopwatch.Elapsed;
        try
        {
            summary.ReportStartingTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", System.Console.Out);
            outputRecorder.ResetTestOutput();
            _AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble();
            summary.ReportPassedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", stopwatch.Elapsed - testStart, outputRecorder.GetTestOutput(), System.Console.Out, tempLogSw, statsCsvSw);
        }
        catch (System.Exception ex)
        {
            summary.ReportFailedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", stopwatch.Elapsed - testStart, ex, outputRecorder.GetTestOutput(), System.Console.Out, tempLogSw, statsCsvSw);
        }
    }
    else
    {
        string reason = filter.GetTestExclusionReason("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()");
        summary.ReportSkippedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", System.TimeSpan.Zero, reason, tempLogSw, statsCsvSw);
    }
}

where ShouldRunTest basically just does a stringToSearch.Contains(filter) check at the most basic level
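As a rough model of that behaviour, the Python sketch below applies a plain substring check to a list of test names. It only illustrates the "contains" idea described above: the function names are made up, the case-sensitive comparison is an assumption, and the real TestFilter also handles exclusion tables and other logic that is not shown.

```python
def should_run_test(filter_text, string_to_search):
    """Run every test when no filter is given, else require a substring match."""
    if not filter_text:
        return True
    return filter_text in string_to_search


def select_tests(filter_text, test_names):
    """Return the test names that the filter would let through."""
    return [name for name in test_names if should_run_test(filter_text, name)]


TEST_NAMES = [
    "_Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()",
    "_Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_int()",
    "_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()",
]
```

Under this model, passing `Sve_Add_uint` selects a single test, while passing `Sve` selects both Sve tests and skips the AdvSimd one.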

@kunalspathak
Member Author

Just noting such support should already exist if you invoke the underlying dll directly, this may just be something missing from the .sh file.

If that's the case, can you or @TIHan come up with the exact command line needed to run a particular case? I don't want engineers to have to hack around a test to make it work for every API.

@a74nh
Contributor

a74nh commented Mar 25, 2024

It appears to work:

❯ $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
16:34:55.071 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()
Supported ISAs:
  AdvSimd:   True
  Aes:       True
  ArmBase:   True
  Crc32:     True
  Dp:        True
  Rdm:       True
  Sha1:      True
  Sha256:    True
  Sve:       True

Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
16:34:55.177 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()

I'm happy with this as a solution then!

@a74nh
Contributor

a74nh commented Apr 24, 2024

Updated the implementation instructions with stress testing and how to write the tests.

@kunalspathak
Member Author

Updated the implementation instructions with stress testing and how to write the tests.

Updated for Windows.

@JulieLeeMSFT
Member

Closing this as completed. Will open a new issue for items that will be included in .NET 10.

@github-project-automation github-project-automation bot moved this from Team User Stories to Done in .NET Core CodeGen Aug 14, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Sep 14, 2024