Skip to content

Conversation

@earthling-amzn
Copy link
Contributor

@earthling-amzn earthling-amzn commented Oct 24, 2025

When GenShen is only marking the old generation, we do not need the SATB mechanism to preserve young pointers. We currently filter these out of the SATB buffers during the final-update-refs and init-mark safepoints. This increases latency and introduces no small amount of complexity. It should be possible to instead filter out these pointers when the SATB buffers are 'compacted' before being 'completed'.

Background

When GenShen is marking the old generation it leaves the SATB barrier enabled. When a young collection interrupts old marking, it creates a situation where a mutator thread could overwrite a field holding a pointer into a collection set region. The SATB barrier will dutifully place this object in the SATB queue. If this pointer makes it into a mark queue, the marking thread will crash. Prior to this change, GenShen filtered out such pointers after the thread local SATB buffers were completed. After this change, such pointers are filtered out before the buffers are completed. This is more inline with the natural way of things.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8370041: GenShen: Filter young pointers from thread local SATB buffers when only marking old (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27983/head:pull/27983
$ git checkout pull/27983

Update a local copy of the PR:
$ git checkout pull/27983
$ git pull https://git.openjdk.org/jdk.git pull/27983/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27983

View PR using the GUI difftool:
$ git pr show -t 27983

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27983.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 24, 2025

👋 Welcome back wkemper! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 24, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented Oct 24, 2025

@earthling-amzn The following labels will be automatically applied to this pull request:

  • hotspot-gc
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 24, 2025
@mlbridge
Copy link

mlbridge bot commented Oct 24, 2025

Webrevs

Copy link
Contributor

@kdnilsen kdnilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great simplification. Do we have any performance numbers, especially for reduction in p99.999 and p100 latencies with certain Extremem workloads, which I believe to be related to safepoint flushing of satb buffers?

// be in the collection set. If this happens, the pointer will be preserved, essentially
// becoming part of the old snapshot.
// 2. The region is allocated during evacuation of old. This is also not a concern because
// we haven't yet finished marking old so no mixed evacuations will happen.
Copy link
Contributor

@kdnilsen kdnilsen Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth mentioning that there may be some additional analysis and code required when we use the forwarding table to recycle cset regions during evacuation and/or updating. If one of these regions becomes old before the SATB buffers have been flushed, then a young cset pointer that lingers in a SATB buffer will "all of a sudden" look like a valid old pointer and will not be purged from the SATB buffer. When we subsequently scan the "object" referenced by this obsolete pointer, we are likely to find "garbage memory", possibly resulting in a crash.

I am thinking that an initial fix might be to do this flushing at init-update-refs instead of at final update refs, and to not recycle cset regions until evacuation is all done. Is there a different handshake there that we might piggyback on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's a good point about recycling regions concurrently. I don't think we can flush before init-update-refs because forwarded pointers still exist and the SATB barrier doesn't try to resolve them. That is to say, "bad" pointers can be created throughout the update-refs phase.

Even in the (hypothetical) scenario with a forwarding table, a region would only become old through an in-place-promotion (in which case, it will never have been recycled) or we will be doing a mixed evacuation. If we are running a mixed evacuation, we will already have finished marking old.

@earthling-amzn
Copy link
Contributor Author

I have results from an earlier version of this PR that flushed the buffers during init-mark. That version showed a consistent 3% improvement on critical and max jops. They also showed a reduction in p99.999 latency for some of the extremem measurements. I will retest with this version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants