
Implement frozen object heap #94515

Merged — 8 commits merged into dotnet:main on Nov 13, 2023
Conversation

MichalStrehovsky
Member

When allocating a RuntimeType instance, we were creating an object on the pinned object heap, creating a handle to it, and purposefully leaking the handle. RuntimeTypes live forever, and this fragments the pinned object heap. So instead of doing that, port the frozen object heap from CoreCLR. This is a line-by-line port. The frozen object heap is a segmented bump allocator that interacts with the GC to tell it the boundaries of its segments.
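For readers unfamiliar with the shape: a segmented bump allocator hands out memory by advancing a pointer within the current segment, and starts a new (typically larger) segment when the current one runs out. A minimal sketch of that core idea, with invented names (this is not the PR's actual code):

```csharp
// Minimal sketch of one segment of a bump allocator; all names are
// illustrative, not the actual FrozenObjectHeapManager code.
unsafe class FrozenSegmentSketch
{
    private byte* _current; // next free byte in this segment
    private byte* _limit;   // end of the segment's committed memory

    public FrozenSegmentSketch(byte* start, nuint size)
    {
        _current = start;
        _limit = start + size;
    }

    // Bump-allocate `size` bytes; returns null when the segment is full,
    // at which point the manager would create a new segment and register
    // it with the GC so the GC knows the segment's boundaries.
    public byte* TryAllocate(nuint size)
    {
        byte* result = _current;
        if (size > (nuint)(_limit - result))
            return null;
        _current = result + size;
        return result;
    }
}
```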

cc @dotnet/ilc-contrib

@ghost

ghost commented Nov 8, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.


cc @dotnet/ilc-contrib

Author: MichalStrehovsky
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

@ghost ghost assigned MichalStrehovsky Nov 8, 2023
@EgorBo
Member

EgorBo commented Nov 8, 2023

Cool! Will it make sense to have a single managed impl and call that from CoreCLR runtime? 🙂

@MichalStrehovsky
Member Author

Cool! Will it make sense to have a single managed impl and call that from CoreCLR runtime? 🙂

I'm not even sure this can be implemented in C# for native AOT. It was crashing until I turned UpdateFrozenSegment into a DllImport instead of an InternalCall. But I just really hate C++.

@EgorBo
Member

EgorBo commented Nov 8, 2023

Normally, allocators basically require NoGC regions, but I presume that should not be a problem here? RegisterFrozenSegment and UpdateFrozenSegment both take/wait on the GC lock internally, and the C++ impl always switches to PREEMPTIVE mode, if that's important; InternalCall means you call it in COOP mode.

@EgorBo
Member

EgorBo commented Nov 8, 2023

Also, I presume we won't get codegen benefits from objects allocated dynamically on the FOH in NativeAOT, so I guess you're only reducing pinned-heap fragmentation for RuntimeType objects, right?

@VSadov
Member

VSadov commented Nov 8, 2023

I wonder if this can be better solved if GC had an API to allocate immortal objects.

Like in: GC_ALLOC_IMMORTAL flag

Immortal objects are just objects that GC allocates in a region that will never be swept or compacted. Otherwise it is a real managed object. It is in GC heap range and can be covered by cardtable/barriers and participate in marking/updating, thus even allowing writing managed references (if supporting such is interesting).

This should not be confused with External objects, where someone formats random memory to have objects in it. Those cannot be truly "managed"; they are what NativeAOT has had for a long time. But Immortal objects are managed objects, allocated on the managed heap, except that they are "forever leaked".

Also unlike External objects, immortal allocations can easily have a public API, implemented on Mono, etc...

I do not think we should keep reimplementing something that GC could do better.
I can write up a more detailed proposal if there is interest.

@EgorBo
Member

EgorBo commented Nov 8, 2023

I wonder if this can be better solved if GC had an API to allocate immortal objects.

Like in: GC_ALLOC_IMMORTAL flag

Immortal objects are just objects that GC allocates in a region that will never be swept or compacted. Otherwise it is a real managed object. It is in GC heap range and can be covered by cardtable/barriers and participate in marking/updating, thus even allowing writing managed references.

This should not be confused with External objects - where someone formats random memory to have objects in it. Those cannot be truly "managed" - these are what NativeAOT had for a long time. But Immortal objects are managed objects in any possible way, except that they are "forever leaked"

Also unlike External objects, immortal allocations can easily have a public API attached to it, implemented on Mono, etc...

I do not think we should keep reimplementing something that GC could do better. I can write up a more detailed proposal if there is interest.

What exactly would that solve? We'll have to mark-and-sweep that new kind of immortal heap during gen1 collections and won't be able to omit write barriers for better codegen, right? Also, I presume Mono would have to implement it separately anyway.
I've noticed some interest from the community in immortal objects, mainly to avoid spending GC time on them, e.g. #94411 (and there are more examples on social networks), and/or to mmap huge object graphs from a file on startup — but that is not trivial to expose as a public API, right?

PS: as it turns out, new heaps are expensive to add in terms of diagnostics work 😕

@VSadov
Member

VSadov commented Nov 8, 2023

What exactly would that solve?

A uniform way of doing this via a well-defined API.
In cases where the GC implementation is shared, the implementation sharing is an additional benefit.

We'll have to mark-and-sweep that new kind of immortal heap during gen1 collections and won't be able to omit write barriers for better codegen, right?

There is no sweeping in this heap, since nothing dies.

Marking/card-table/barrier machinery only needs to be involved if reference fields are allowed. That is not a necessary feature, just a possibility.
It would be pay-for-play, though. Immortal objects would be Gen2 or higher, so in ephemeral collections you'd only visit locations dirtied in the card table. If there were no writes that rooted Gen1/Gen0 objects, you'd have no reason to look at the immortal region.

PS: as it turns out, new heaps are expensive to add in terms of diagnostics work 😕

This assumes that ad-hoc immortal allocators do not need diagnostic support, or that adding a new heap is the only way to implement this.

In the absence of references, the "implementation" could just as well be "put the CLR implementation behind the GC interface".
You would still benefit from sharing everything with NativeAOT, and with other users if this becomes a public API.

@jkotas
Member

jkotas commented Nov 8, 2023

We had a long discussion about whether management and reporting of frozen heaps should be owned by the GC or kept external to the GC. The preference of the GC team was to make it external - see the ASCII art at the top of dotnet/diagnostics#4156 .

I agree with you that letting GC manage frozen heaps would be a viable alternative that can be extended further to allow frozen objects with references to non-frozen ones, but it is not where we landed.

@VSadov
Member

VSadov commented Nov 8, 2023

Oh, well.

Just to be sure — the suggestion is not about External objects, or what is labeled "Non-GC heap" in the ASCII art.
Things that were never allocated via GC interfaces would still live there; supporting pre-allocated objects like NativeAOT strings and statics needs that.

This is mostly about an API that allows allocating a forever object at run time without fear of fragmenting anything. There is clearly a need for this, as this is the second time we've implemented this functionality.

@EgorBo
Member

EgorBo commented Nov 8, 2023

This is mostly about an API that allows allocating a forever object at run time without fear of fragmenting anything. There is clearly a need for this, as this is the second time we've implemented this functionality.

But doesn't that mean we'll need both anyway, and the only advantage of immortal objects is that we'd be less constrained in what we can allocate there? (Although we still couldn't put stuff from collectible assemblies there, etc.) I wonder what exactly we'd be able to allocate there that we currently cannot.

A public API for it has questionable UX, I guess — only for arrays?

@jkotas
Member

jkotas commented Nov 8, 2023

This is mostly about an API that allows allocating a forever object at run time without fear of fragmenting anything.

I understand. FWIW, the heap design discussion was centered around the needs of regular CoreCLR that only allocates the frozen objects at runtime.

I wonder what exactly we'd be able to allocate there that we currently can not?

I think the primary point of the discussion is what makes the most sense as the overall system architecture. We can do all the things discussed here today. It may not have 100% ideal perf characteristics, but perf can always be fine-tuned based on data if the system is architected correctly.

A public API for it has a questionable UX I guess - only for arrays?

Yes, it is the same set of considerations as why we have GC.AllocateArray(...) and have not added GC.AllocateObject (yet).
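For reference, the existing array-only surface mentioned here looks like this (a pinned allocation goes to the pinned object heap, which is exactly where the fragmentation concern in this PR comes from):

```csharp
// Existing public API: arrays can be allocated pre-pinned on the
// pinned object heap. There is currently no GC.AllocateObject counterpart.
byte[] buffer = GC.AllocateArray<byte>(4096, pinned: true);

// Also available: skip zero-initialization for large scratch arrays.
byte[] scratch = GC.AllocateUninitializedArray<byte>(4096);
```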

public static FrozenObjectHeapManager Instance = new FrozenObjectHeapManager();

private readonly LowLevelLock m_Crst = new LowLevelLock();
private readonly LowLevelLock m_SegmentRegistrationCrst = new LowLevelLock();
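For context on how these two locks fit into the CoreCLR design being ported: m_Crst guards the bump allocation and segment creation, while m_SegmentRegistrationCrst guards the list of segments reported for registration. The allocation path is roughly the following — a hedged sketch with simplified, invented signatures, not the PR's actual code:

```csharp
// Rough shape of the frozen-heap allocation path (illustrative only).
object? TryAllocateObjectSketch(nuint size)
{
    m_Crst.Acquire();
    try
    {
        // Fast path: bump-allocate from the current segment.
        object? obj = _currentSegment?.TryAllocate(size);
        if (obj == null)
        {
            // Slow path: the segment is exhausted. Allocate a new,
            // larger segment, register it with the GC, and retry there.
            _currentSegment = CreateNewSegment(size);
            obj = _currentSegment.TryAllocate(size);
        }
        return obj;
    }
    finally
    {
        m_Crst.Release();
    }
}
```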
Member

This complicated locking scheme is not needed for the managed implementation. It is needed for the object-enumeration diagnostic APIs, and it is unlikely we would be able to implement those APIs on top of this managed implementation.

I understand why you started by trying to match the C++ impl in CoreCLR, but I am not convinced that it helps in the end. It makes things more complicated than they need to be.

…ime/FrozenObjectHeapManager.cs

Co-authored-by: Jan Kotas <[email protected]>
@MichalStrehovsky
Member Author

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Member

jkotas commented Nov 10, 2023

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Member Author

The x64 Alpine leg is failing with the same failure mode we saw in #94405 (comment). Filed an issue on that. It's unrelated to either of those PRs and must exist in the baseline.

@MichalStrehovsky MichalStrehovsky merged commit 82a8579 into dotnet:main Nov 13, 2023
183 of 190 checks passed
@MichalStrehovsky MichalStrehovsky deleted the frozenheap branch November 13, 2023 07:00
@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2023