Skip to content

SLING-12519 - Remove Double-Checked Locking in ScriptableBase.java#10

Open
scottyuancoc wants to merge 11 commits intoapache:masterfrom
scottyuancoc:bugfix/SLING-12519
Open

SLING-12519 - Remove Double-Checked Locking in ScriptableBase.java#10
scottyuancoc wants to merge 11 commits intoapache:masterfrom
scottyuancoc:bugfix/SLING-12519

Conversation

@scottyuancoc
Copy link
Copy Markdown
Contributor

* Eliminated the use of double-checked locking in wrapper/ScriptableBase.java to improve code safety and readability.

    * Eliminated the use of double-checked locking in wrapper/ScriptableBase.java to improve code safety and readability.
@rombert
Copy link
Copy Markdown
Contributor

rombert commented Dec 10, 2024

Thank you for the PR @scottyuancoc . Sorry for taking so long to review, I had to reaquaint myself with the topic. I agree this is an issue, although I have not seen any ill side effects in practice.

Reading through https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/double-checked-locking/ your solution would fall into the Always-synchronised solution is slow category so I would not merge it as-is without understanding the performance implications.

I know @joerghoh has been looking into performance-related topics recently, maybe he has some ideas about the impact?

@joerghoh
Copy link
Copy Markdown

I am aware of a few instances of problems with javascript-based Sling Models, which we investigated but never come to a good conclusion. These could indeed be caused by the problems of double locking (see Double-checked locking is broken, thanks @rombert for that link). That means that marking the entire method as synchronized could prevent this.

Regarding performance impact: I cannot say much about that, as I have not experience with that code in production usage. But to avoid marking the entire method as synchronized we have to update njo in an atomic way, and therefor turn the definition into something like

AtomicReference<NativeJavaObject> njo;

Copy link
Copy Markdown
Contributor

@rombert rombert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments @joerghoh . I think extending the synchronized block to the method level is too coarse-grained.

I think the minimal change would be to make the njo field volatile.

@scottyuancoc - do you have a way of running some performance tests for this change? A micro-benchmark with https://github.com/openjdk/jmh might help figure out the right approach.

@sonarqubecloud
Copy link
Copy Markdown

@scottyuancoc scottyuancoc requested a review from rombert July 29, 2025 22:34
@scottyuancoc
Copy link
Copy Markdown
Contributor Author

Hello @rombert and @joerghoh, I was behind because regular works. Just saw your comments and will adjust the code.

@rombert
Copy link
Copy Markdown
Contributor

rombert commented Jul 30, 2025

Thanks for coming back to this @scottyuancoc . You re-requested a review but your change is the same as the original one, after merging with master. Please re-request once you have addressed the PR comments/reviews.

Copy link
Copy Markdown
Contributor

@rombert rombert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving to 'request changes' until comments are addressed.

@scottyuancoc scottyuancoc force-pushed the bugfix/SLING-12519 branch 2 times, most recently from 6f39dde to adc4462 Compare August 1, 2025 06:05
@scottyuancoc
Copy link
Copy Markdown
Contributor Author

Hello @rombert and @joerghoh,

Thank you for sharing the references and your insights.

After reviewing the materials you provided, along with additional sources including:

Based on these, I tested the following Java implementation using the volatile keyword to ensure thread-safe lazy initialization:

// Use a temporary variable to reduce the number of reads of the volatile field.
NativeJavaObject tempNjo = njo;
if (tempNjo == null) {
synchronized (this) {
tempNjo = njo;
if (tempNjo == null) {
njo = tempNjo = new NativeJavaObject(start, wrapped, getStaticType());
}
}
}
return tempNjo.get(name, start);

This implementation uses the double-checked locking idiom to ensure thread safety. However, it triggers SonarQube rule java:S3077. If we choose to adopt this pattern, we may need to suppress the warning or document a rationale for its use.

As an alternative—as suggested by @joerghoh—we can use an AtomicReference with a special NULL_OBJECT placeholder. This intermediate state helps avoid repeated instantiation of NativeJavaObject in each compareAndSet(...) operation.

To compare performance, I ran mvn test with JDK 21 across ten iterations for each of the four approaches. The average execution times were:

  1. AtomicReference with NULL_OBJECT placeholder: 9.2823 seconds
  2. Synchronized method approach: 9.3049 seconds ~0.243% slower
  3. AtomicReference without NULL_OBJECT: 9.3143 seconds ~0.345% slower
  4. Volatile double-checked locking: 9.4039 seconds ~1.310% slower

These results were somewhat surprising, as the volatile double-check idiom is generally expected to offer better performance. However, repeated tests produced consistent trends. While this is not a definitive benchmark, it serves as a useful approximation.

Based on these findings, the current PR adopts the AtomicReference with NULL_OBJECT placeholder approach for its balance of clarity, safety, and performance—while avoiding issues flagged by static analysis tools.

From a practical standpoint, in my use case, this module is part of a Sling-based solution that powers high-profile websites developed by JavaScript teams. Reliability is a critical success factor in such environments, and improving the robustness of this component can have a meaningful impact.

I hope this analysis is helpful to the change review process and supports our decision-making.

@scottyuancoc scottyuancoc requested a review from rombert August 1, 2025 06:28
@scottyuancoc scottyuancoc force-pushed the bugfix/SLING-12519 branch 3 times, most recently from 4acaaba to 9f51abb Compare August 4, 2025 01:13
@scottyuancoc
Copy link
Copy Markdown
Contributor Author

Hello @rombert,

I’ve spent some additional time evaluating the options and have ultimately decided to use the double-checked locking pattern with AtomicReference, rather than the approach using a NULL_OBJECT as an intermediate state.

The reason for this change is that the NULL_OBJECT approach isn't fully thread-safe in all scenarios. The revised implementation ensures proper thread safety while still offering reasonable performance.

Thanks again for the valuable discussion and input. Let me know if this approach looks good to you.

@rombert
Copy link
Copy Markdown
Contributor

rombert commented Aug 4, 2025

Thanks for the update @scottyuancoc . The changes overall LGTM, but see @joerghoh's comments on SLING-12873 where he mentions that the change might not be needed at all.

It would be good to converge the discussion in a single place, maybe here?

* Replaced with a simpler, non-thread-safe version to improve readability, as current acccess is already confined to a single thread.
@scottyuancoc
Copy link
Copy Markdown
Contributor Author

Hello @rombert,

I missed the earlier comment from @joerghoh regarding the potential issue in the ScriptableBase code (see SLING-12873). It’s definitely something that should be addressed.

That said, I agree it makes sense to keep getNative() as a single-threaded access implementation for now. I’ve pushed an update to ensure getNative() remains single-threaded, and I’ve added a comment in the code to document this behavior.

@joerghoh, thank you for reporting the finding. I am looking at providing another PR for SLING-12873 with test cases to verify the situation. Do feel free to forward the task over if it make sense.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Aug 7, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants