Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mono AOT Benchmarks crash with PlatformNotSupportedException in IndexOfAnyAsciiSearcher.IndexOfAnyCore #106914

Closed
caaavik-msft opened this issue Aug 24, 2024 · 8 comments
Assignees
Milestone

Comments

@caaavik-msft
Copy link
Contributor

caaavik-msft commented Aug 24, 2024

Description

In the dotnet-runtime-perf pipeline when it runs the micro benchmarks on MonoAOT, any benchmarks which calls IndexOfAnyAsciiSearcher.IndexOfAnyCore internally (e.g. System.Buffers.Tests.SearchValuesCharTests.IndexOfAnyExcept) will fail with the following exception:

[2024/08/12 11:13:35][INFO] System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
[2024/08/12 11:13:35][INFO]  ---> System.PlatformNotSupportedException: Operation is not supported on this platform.
[2024/08/12 11:13:35][INFO]    at System.Buffers.IndexOfAnyAsciiSearcher.IndexOfAnyCore[Int32,Negate,Default,IndexOfAnyResultMapper`1](Int16& searchSpace, Int32 searchSpaceLength, AsciiState& state)
[2024/08/12 11:13:35][INFO]    at System.Buffers.Tests.SearchValuesCharTests.IndexOfAnyExcept()
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Autogenerated.Runnable_20.WorkloadActionNoUnroll(Int64 invokeCount)
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Engines.Engine.RunIteration(IterationData data)
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Engines.EngineFactory.Jit(Engine engine, Int32 jitIndex, Int32 invokeCount, Int32 unrollFactor)
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Engines.EngineFactory.CreateReadyToRun(EngineParameters engineParameters)
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Autogenerated.Runnable_20.Run(IHost host, String benchmarkName)
[2024/08/12 11:13:35][INFO]    at System.Reflection.MethodBaseInvoker.InterpretedInvoke_Method(Object obj, IntPtr* args)
[2024/08/12 11:13:35][INFO]    at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
[2024/08/12 11:13:35][INFO]    --- End of inner exception stack trace ---
[2024/08/12 11:13:35][INFO]    at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
[2024/08/12 11:13:35][INFO]    at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[2024/08/12 11:13:35][INFO]    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[2024/08/12 11:13:35][INFO]    at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)
[2024/08/12 11:13:35][INFO]    at BenchmarkDotNet.Autogenerated.UniqueProgramName.AfterAssemblyLoadingAttached(String[] args)

This is consistently reproducible on all the machines in the Ubuntu.2204.Amd64.Tiger.Perf queue which has the following set of hardware instrinsics available according to BDN output: HardwareIntrinsics=SSE4.2,AES,BMI1,BMI2,LZCNT,PCLMUL,POPCNT VectorSize=128.

Reproduction Steps

Follow these instructions in the dotnet/performance repo to run the microbenchmarks with --filter *SearchValuesCharTests.IndexOfAnyExcept* argument set: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md#dotnet-runtime-testing-for-monoaot.

This might not be reproducible depending on the set of intrinsics available on your machine. If needed, we can help give access to a machine from Ubuntu.2204.Amd64.Tiger.Perf to debug this on as it is consistently reproducible there.

Expected behavior

The benchmark executes successfully

Actual behavior

The benchmark fails to run with the previous exception

Regression?

I believe this was introduced in #105867. The issue is seen in the "Performance linux x64 release mono AOT micro_mono perftiger NoJS False False False net9.0" job in the dotnet-runtime-perf pipeline. The issue first appeared in run 20240812.3 which indicates the following commit range: 15e96fa...96eda8d

Known Workarounds

No response

Configuration

This is reproduced when running benchmarks against versions compiled from main.

[2024/08/12 11:12:38][INFO] BenchmarkDotNet v0.13.13-nightly.20240311.145, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
[2024/08/12 11:12:38][INFO] Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
[2024/08/12 11:12:38][INFO] .NET SDK 9.0.100-rc.1.24409.1
[2024/08/12 11:12:38][INFO]   [Host]     : .NET 9.0.0 (9.0.24.40812), X64 RyuJIT AVX2
[2024/08/12 11:12:38][INFO]   Job-BSFCDS : .NET 9.0.0 (42.42.42.42424) using MonoVM, X64 SSE4.2

Other information

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 24, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 24, 2024
@am11 am11 added area-Codegen-AOT-mono and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 24, 2024
@am11
Copy link
Member

am11 commented Aug 24, 2024

cc @BrzVlad

@steveisok
Copy link
Member

/cc @vitek-karas

@vitek-karas vitek-karas added this to the 9.0.0 milestone Aug 30, 2024
@vitek-karas
Copy link
Member

This seems to be a regression in 9 - so we should try to fix it. @BrzVlad could you please take a look, seems to be relatively easy to repro.

@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Aug 30, 2024
@BrzVlad
Copy link
Member

BrzVlad commented Sep 3, 2024

Was able to reproduce locally, outside of performance/bdn workflow, with a simple console app. Issue is not related to recent llvm work. I suspect investigation pointed to my PR because the aot compiler (with llvm enabled) used to crash when compiling SPC.dll, with the tests probably working just fine at runtime, falling back to jit instead.

This does seem to be a regression from .net8, likely triggered indirectly either by changes in SearchValues.cs or recent additions to x64 hardware instrinsics. The cause of the issue is mixing between llvm compiled code and jit compiled code. LLVM backend assumes we have a certain set of supported intrinsics that JIT might be lacking and later crashing. A conservative fix would be to enable additional intrinsics on llvm only if full aot is enabled. I remembered we had some other issues like this in the last year or two and we had some sort of a plan but don't remember what was our conclusion there. @fanyang-mono @lambdageek @matouskozak Do you guys remember additional details?

@matouskozak
Copy link
Member

Was able to reproduce locally, outside of performance/bdn workflow, with a simple console app. Issue is not related to recent llvm work. I suspect investigation pointed to my PR because the aot compiler (with llvm enabled) used to crash when compiling SPC.dll, with the tests probably working just fine at runtime, falling back to jit instead.

This does seem to be a regression from .net8, likely triggered indirectly either by changes in SearchValues.cs or recent additions to x64 hardware instrinsics. The cause of the issue is mixing between llvm compiled code and jit compiled code. LLVM backend assumes we have a certain set of supported intrinsics that JIT might be lacking and later crashing. A conservative fix would be to enable additional intrinsics on llvm only if full aot is enabled. I remembered we had some other issues like this in the last year or two and we had some sort of a plan but don't remember what was our conclusion there. @fanyang-mono @lambdageek @matouskozak Do you guys remember additional details?

We've seen a very similar failure a while back dotnet/performance#4088.

@tannergooding
Copy link
Member

or recent additions to x64 hardware instrinsics

There were no additions to any of the System.Runtime.Intrinsics.x86 ISAs that Mono supports between .NET 8 and .NET 9. A few were added to Avx512* and Avx10v1 is net new, but neither of those are ones that Mono supports as of yet (Mono doesn't support AVX or later, that is anything needing the VEX or EVEX encoding).

I don't, at a glance, see any changes to IndexOfAnyCore or supporting functions that might be causing a new path to be getting hit for Mono either. Most of the code for the Vector128 code path is using the xplat APIs and we're really only using Sse2.PackUnsignedSaturate for direct intrinsic usage. We also use BitOperations.LeadingZeroCount or BitOperations.TrailingZeroCount in some cases and those can themselves use X86Base.BitScanForward or X86Base.BitScanReverse or in the case of LeadingZeroCount it can also use Lzcnt.LeadingZeroCount. But Mono supports all of these today for both mini-llvm and mini-amd64.

@tannergooding
Copy link
Member

Hmmm, actually while mini-amd64 has support for Sse2.PackUnsignedSaturate, I'm not sure the right mapping between simd-intrinsics.c and mini-amd64.c exists.

Mono defines the ID has part of sse2_methods here:

{SN_PackUnsignedSaturate},

Then imports it when feature == MONO_CPU_X86_SSE2 here:

case SN_PackUnsignedSaturate:
return emit_simd_ins_for_sig (cfg, klass, OP_SSE2_PACKUS, -1, arg0_type, fsig, args);

and while mini-llvm.c then has handling for this here:

case OP_SSE2_PACKUS: {
LLVMValueRef args [2];
args [0] = convert (ctx, lhs, sse_i2_t);
args [1] = convert (ctx, rhs, sse_i2_t);
values [ins->dreg] = call_intrins (ctx, INTRINS_SSE_PACKUSWB, args, dname);
break;
}

mini-amd64.c does not handle OP_SSE2_PACKUS itself, instead it is looking for OP_PACKW_UN here:

case OP_PACKW_UN:
amd64_sse_packuswb_reg_reg (code, ins->sreg1, ins->sreg2);
break;

and I don't see anything that would create this type of op, there's only the mini-ops.h definition of it here:

MINI_OP(OP_PACKW_UN, "packw_un", XREG, XREG, XREG)

@BrzVlad
Copy link
Member

BrzVlad commented Sep 5, 2024

Some more detailed description of the root cause of this issue and potential, non-ideal fix. #107401

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants