Skip to content

Conversation

@vogella
Copy link
Contributor

@vogella vogella commented Oct 25, 2025

Summary

Problem

The test FastAbstractReconcilerTest#testReplacingDocumentWhenClean() was randomly failing on macOS (and other platforms) with a timeout error. The test expects the reconciler to complete within 5 seconds, but the reconciler thread never executed the process() method that would trigger the test's barrier synchronization.

Root Cause

The race condition occurred due to improper ordering of operations in the BackgroundWorker.reset() method. The code uses two separate locks:

  1. The BackgroundWorker instance lock (for fIsDirty and fReset)
  2. The fDirtyRegionQueue lock (for notifications and waitFinish)

Old sequence:

  1. Set fIsDirty=true and fReset=true (under BackgroundWorker lock)
  2. Notify fDirtyRegionQueueThread could wake up here!
  3. Call informNotFinished()aboutToWork()
  4. aboutToWork() calls signalWaitForFinish() which sets waitFinish=true

The reconciler thread could wake up from the notification in step 2 before waitFinish was set to true in step 4, causing it to enter a full delay period (fDelay milliseconds) instead of processing immediately.

Solution

Move the informNotFinished() call to after setting the flags but before the final queue notification:

New sequence:

  1. Set fIsDirty=true and fReset=true (under BackgroundWorker lock)
  2. Call informNotFinished()aboutToWork()signalWaitForFinish()
  3. signalWaitForFinish() sets waitFinish=true and notifies queue
  4. Final notifyAll() on queue (redundant but harmless)

This ensures that when the reconciler thread wakes up, both fIsDirty and waitFinish are already properly set, eliminating the race condition.

Testing

  • ✅ Compiled successfully with mvn clean compile -Pbuild-individual-bundles
  • ✅ Code preserves existing behavior while ensuring proper synchronization
  • ✅ Should resolve the intermittent timeout in FastAbstractReconcilerTest.testReplacingDocumentWhenClean

Related Issues

Fixes #2708

🤖 Generated with Claude Code

Fixes eclipse-platform#2708

The test FastAbstractReconcilerTest#testReplacingDocumentWhenClean() was
intermittently failing with a 5-second timeout. The test would wait at a
barrier expecting the reconciler to process, but the reconciler thread
never executed the process() method.

Root Cause:
-----------
The race condition occurred in the BackgroundWorker.reset() method due to
improper ordering of operations with two separate locks (the BackgroundWorker
instance lock and the fDirtyRegionQueue lock):

Old sequence:
1. Set fIsDirty=true and fReset=true (under BackgroundWorker lock)
2. Notify fDirtyRegionQueue
3. Call informNotFinished() which calls aboutToWork()
4. aboutToWork() may call signalWaitForFinish() which sets waitFinish=true

The problem: The reconciler thread could wake up from the notification in
step 2 BEFORE waitFinish was set to true in step 4, causing it to delay
again for the full fDelay period instead of processing immediately.

Additionally, due to the two separate locks, the reconciler thread could
check isDirty() and see a stale value before fIsDirty was set.

The Fix:
--------
Move the informNotFinished() call to AFTER setting fIsDirty/fReset but
BEFORE notifying the queue:

New sequence:
1. Set fIsDirty=true and fReset=true (under BackgroundWorker lock)
2. Call informNotFinished() → aboutToWork() → signalWaitForFinish()
3. signalWaitForFinish() sets waitFinish=true and notifies queue
4. Final notifyAll() on queue (redundant but harmless)

This ensures that when the reconciler thread wakes up, both fIsDirty and
waitFinish are already set to their correct values, eliminating the race.

Testing:
--------
- Compiled successfully with mvn clean compile -Pbuild-individual-bundles
- The fix preserves the existing behavior while ensuring proper synchronization
- FastAbstractReconcilerTest.testReplacingDocumentWhenClean should no longer timeout

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@vogella vogella marked this pull request as draft October 25, 2025 11:19
@github-actions
Copy link
Contributor

Test Results

 3 018 files  ±0   3 018 suites  ±0   2h 11m 34s ⏱️ - 7m 41s
 8 233 tests ±0   7 984 ✅ ±0  249 💤 ±0  0 ❌ ±0 
23 619 runs  ±0  22 825 ✅ ±0  794 💤 ±0  0 ❌ ±0 

Results for commit 38317f9. ± Comparison against base commit ce17c9e.

@vogella
Copy link
Contributor Author

vogella commented Oct 25, 2025

@HeikoKlare can you review? This is not only a test change and I have not worked in this area before so it would be good to get your opinion. All tests are passing so I think this is a good point in time to review the change

@HeikoKlare
Copy link
Contributor

I have to admit that I have no clue about this ode. The provided analysis for the root cause seems sound, but I am not enough into the code to judge whether this a proper fix.

I took a look into history and found that a change some years ago introduced an additional synchronization which is related to the race condition: 5dafae8

And I also see that @laeubi has made some changes in the recent past on this part of the code, so maybe he can make a better assessment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Random failing org.eclipse.jface.text.tests.reconciler.FastAbstractReconcilerTest#testReplacingDocumentWhenClean()

2 participants