-
Notifications
You must be signed in to change notification settings - Fork 214
Implement a change in IL API to use RuntimeHelpers.Await<T>(Task<T>) and similar helpers. #2951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/tests/async/returns.cs
Outdated
AssertEqual("B", strings.B); | ||
AssertEqual("C", strings.C); | ||
AssertEqual("D", strings.D); | ||
// TODO: need to fix this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakobbotsch the change stresses calling via thunks and possibly introduced some scenarios that tests did not cover before. Remarkably, nearly everything works fine!! However, here I saw an assert and turned off one scenario.
Not sure if this is something wrong with IL or something on the JIT side.
(the other disabled case is with thunks for async methods in structs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've hit that before when encoding method/type spec tokens incorrectly. Can you verify that the tokens being encoded when we construct the IL for the variants look fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failure does not happen with optimization enabled, so it is not caused by that.
(i.e. by default this test passes, but with set DOTNET_JitOptimizeAwait=0
I see a failure).
I'll try to take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the async resumption stub in this case needs to box the result, since the result is an obj-containing struct.
We are making a resumption stub for RuntimeHelpers.Await<T>(Task<T>)
when T
is S<string>
, thus the sig reports the result type as S'1<System.__Canon>
and we emit Box instruction with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a general issue with shared generics and nongeneric resumption stub that may need to box the result.
I can hit this in shared-generics.cs
test by adding scenario like:
(regardless of Await optimization)
Async1EntryPoint<S<string>>(typeof(S<string>), new S<string> {t = "ghj" }).Wait();
. . .
struct S<T>
{
public T t;
}
not sure yet how to deal with this.
It does not seem like the fix would be in the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can this be uncommented then?
Can you also open an issue about the unhandled case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logged #3010
It would be nice to start on this work to see how it would look before we make the switch. Note that most of the work will be VM work -- teaching |
The optimization would need to detect the following pattern
and turn it into
For that there should be a way to:
Is this correct? for
for
|
There are a few ways to do this, but maybe the most straightforward will be to do it as a direct IL pattern match at the point where we call runtimelab/src/coreclr/jit/importer.cpp Lines 8952 to 8957 in b077e29
This would be changed to first look ahead for another There are some other details to work out, like properly setting up for opportunistic tailcalls when the
I would skip
|
Yes. I included the arguments in the example to show that they do not need to change. I was thinking of looking back at previous instruction once we see an Await intrinsic, and if previous instruction was a call that we can optimize, replace it with a call to async method. The rest makes sense. Thanks! |
Implemented the JIT optimization as discussed above. |
…benchmarking purposes)
The impact of the optimization is quite noticeable (as expected):
178 ms is definitely an improvement over 1173 ms. |
src/coreclr/vm/jitinterface.cpp
Outdated
if (flags & CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT) | ||
{ | ||
_ASSERTE(!pMD->IsAsync2Method()); | ||
pMD = pMD->GetAsyncOtherVariant(); | ||
pResolvedToken->hMethod = (CORINFO_METHOD_HANDLE)pMD; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that's much simpler than I was expecting.
pResolvedToken
is in-only, so we should make a copy of it here and change that one instead. If necessary you can update it from the callInfo
on the JIT side, but I'm somewhat worried we end up with a token whose fields are internally inconsistent.
Can you make sure we have tests for some of the hard cases? GVMs, interface calls, virtual class calls and constrained calls come to mind. I was expecting shared generics to require more work as well since other fields of the token are used below for those (see ComputeRuntimeLookupForSharedGenericToken
). Can you double check why it works out? Is the method spec/type spec ok to reuse as-is from the token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to ensure the resolved token consistency could be to pass the new flag not to the eeGetCallInfo
, but to the impResolveToken
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the MethodDesc shimming to the level of impResolveToken
. That seems nicer as it allows eeGetCallInfo
to stay unchanged.
@@ -586,6 +586,8 @@ OPT_CONFIG_INTEGER(JitDoIfConversion, "JitDoIfConversion", 1) | |||
OPT_CONFIG_INTEGER(JitDoOptimizeMaskConversions, "JitDoOptimizeMaskConversions", 1) // Perform optimization of mask | |||
// conversions | |||
|
|||
RELEASE_CONFIG_INTEGER(JitOptimizeAwait, "JitOptimizeAwait", 1) // Perform optimization of Await intrinsics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't add a release knob for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this knob for two reasons:
- we claim the optimization is optional, but I found couple cases which would fail without it. I think the scenarios are reachable in actual code, but might be corner cases and we did not have such code in tests. Disabling the optimization allows to do extra "stress" for calling via thunks.
- I wanted to see the impact of the optimizations on benchmarks.
I think both needs are temporary and we will not need the knob in the long run.
I think #2
requires that it is a release knob. Is that correct? Otherwise managed runtime parts will have asserts.
I assumed that is the reason why a bunch of other optimization related knobs are release knobs.
In any case, I`d prefer to log a tracking issue to remove this knob eventually. It seems useful right now, but at some point there will be no need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's have it for now but make sure we remove it before any form of release that would lock its existence in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Jakob Botsch Nielsen <[email protected]>
src/coreclr/jit/importer.cpp
Outdated
_impResolveToken(CORINFO_TOKENKIND_Await); | ||
// consume the extra call | ||
codeAddr += 1 + sizeof(mdToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this properly handle the situation of dotnet/roslyn#76872 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. It probably does not. I will add a test and make sure this works.
src/tests/async/struct.cs
Outdated
Async().Wait(); | ||
// TODO: need to fix this | ||
|
||
// Hits an assert around: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another case that works with Await folding and trips over if optimization is disabled.
Like with the other TODO, this case can be hit regardless of optimization with directed scenario.
I will uncomment and log an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return arg; | ||
} | ||
|
||
// TODO: switch every other scenario to use ValueTask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is blocked by #3010
// generic identity | ||
Assert.Equal(6, await sIdentity(sProp)); | ||
|
||
// await(await ...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this indirectly tests awaiting something that returns T, which just happens to be a task by substitution.
It is ok to use Await in this case now, but will not be optimized into a direct rtAsync call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to document this, I think. Perhaps one day we may even optimize this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this would be interesting to optimize via PGO as some form of guarded deabstraction. I have been thinking about the same thing for delegate calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks!! |
de4719c
into
dotnet:feature/async2-experiment
…and similar helpers. (dotnet#2951) * T RuntimeHelpers.Await<T>(Task<T>) * state machine version of Await and friends * bump roslyn ref * more Await helpers * remove no longer needed test * undo no longer needed metasig * comment * formatting * comment * implements JIT optimization for Await intrinsics * make the JitOptimizeAwait switch RELEASE_CONFIG_INTEGER (for testing/benchmarking purposes) * isIntrinsic * CORINFO_TOKENKIND_Await * revert CORINFO_CALLINFO_RUNTIMEASYNC_VARIANT * undo unnecessary diff * Apply suggestions from code review Co-authored-by: Jakob Botsch Nielsen <[email protected]> * one more unnecessary `return` * uncomment ReturnsStructGC scenario. * uncomment struct testcase * awaiting things that are not async --------- Co-authored-by: Jakob Botsch Nielsen <[email protected]>
This is the actual implementation of what was proposed in dotnet/runtime#110420 and prototyped in #2941
Basically, this changes
await
marker to be just a call via a specialAwait
helper.When user writes inside a runtime async method
C# compiler emits an equivalent of
The
T Await<T>(Task<T> arg)
method is a special intrinsic method that performs asynchronous awaiting of theTask<int>
.NOTE: There is no
sync-over-async
here,Await
can optionally suspend/resume the current stack of calls and when theTask<int>
is complete, unwraps it and returnsint
.Also, the JIT is familiar with the pattern and can further optimize it into call-with-continuation invocation of the runtime-async entry point for
ReturnsTaskOfInt()
.As a result, if
ReturnsTaskOfInt
is another runtime-async method, we skip intermediate promise types (Task/ValueTask) entirely, which is the main reason for the performance edge of runtime async over the classic async.