DEBUG #294: stress ResourceInitialSelectionTest with diagnostics #10

Open
vogella wants to merge 3 commits into master from investigate/issue-294-resource-initial-selection

Conversation

@vogella vogella commented Apr 16, 2026

Fork-internal PR to trigger CI on macOS/Windows runners and capture diagnostic output for issue eclipse-platform#294 (ResourceInitialSelectionTest flake).

What this does

Why a fork PR

The flake cannot be reproduced locally on Linux (200/200 runs pass under DISPLAY=:1). The historic failures are predominantly on macOS and occasionally on Windows. Running CI on the fork lets us collect macOS/Windows evidence without burning Eclipse runner minutes or polluting the upstream PR list.

Not for merge

This is debug-only. A real fix will go to upstream once the failure pattern is understood.

Adds @RepeatedTest(50) to the tests reported as flaky in issue eclipse-platform#294,
plus per-run logging of waitForDialogRefresh stability and table state
at assertion time.

The flake cannot be reproduced locally on Linux (200/200 pass under DISPLAY=:1);
this branch exists only to trigger CI on macOS/Windows runners via a
fork-internal PR so we can capture itemCount, selection, and the time
spent in waitForDialogRefresh on a failing run.

Will be reverted before any fix is merged.
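
For readers unfamiliar with this kind of diagnostic, here is a minimal sketch of an item-count stability probe that also reports how long it waited. It is not the code from this branch: the class `StabilityProbe`, its `Result` type, and all parameter names are hypothetical, and it uses plain Java with a simulated table instead of SWT. A real SWT test would have to keep dispatching display events while waiting instead of sleeping.

```java
import java.util.function.IntSupplier;

final class StabilityProbe {
    /** Outcome of one probe, suitable for a per-run diagnostic log line. */
    static final class Result {
        final int lastCount;
        final long elapsedMillis;
        final boolean stable;

        Result(int lastCount, long elapsedMillis, boolean stable) {
            this.lastCount = lastCount;
            this.elapsedMillis = elapsedMillis;
            this.stable = stable;
        }

        @Override
        public String toString() {
            return "itemCount=" + lastCount + " waited=" + elapsedMillis + "ms stable=" + stable;
        }
    }

    /**
     * Polls itemCount until it is unchanged for requiredStablePolls consecutive
     * polls, or until timeoutMillis has elapsed (stable == false in that case).
     */
    static Result waitUntilStable(IntSupplier itemCount, int requiredStablePolls,
            long pollMillis, long timeoutMillis) {
        long start = System.nanoTime();
        int last = itemCount.getAsInt();
        int stablePolls = 0;
        while (true) {
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            if (stablePolls >= requiredStablePolls) {
                return new Result(last, elapsed, true);
            }
            if (elapsed >= timeoutMillis) {
                return new Result(last, elapsed, false);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return new Result(last, elapsed, false);
            }
            int current = itemCount.getAsInt();
            stablePolls = (current == last) ? stablePolls + 1 : 0;
            last = current;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated table that fills up over the first three polls, then settles.
        IntSupplier fakeTable = () -> Math.min(++calls[0], 3);
        for (int run = 1; run <= 3; run++) {
            calls[0] = 0;
            System.out.println("run " + run + ": " + waitUntilStable(fakeTable, 2, 10, 5_000));
        }
    }
}
```

A probe like this can only say that the count stopped changing for a while, not that the background work has finished, which is why a timed-out or prematurely "stable" result is worth logging on every repetition.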

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces diagnostic tools to investigate flaky behavior in ResourceInitialSelectionTest.java, including converting several tests to repeated executions and adding detailed logging of the dialog state. A new helper method, dumpDialogState, is added to log table items and selections. Feedback was provided regarding the fragile UI traversal logic used to locate the table widget, which relies on hardcoded indices and should be refactored for better maintainability.

Comment on lines +497 to +498
Table table = (Table) ((Composite) ((Composite) ((Composite) dialog.getShell().getChildren()[0])
.getChildren()[0]).getChildren()[0]).getChildren()[3];

medium

The UI traversal logic used to locate the Table widget is highly fragile as it relies on hardcoded indices (e.g., getChildren()[0], getChildren()[3]). This makes the test sensitive to any internal layout changes in the dialog. Since this logic is also duplicated from waitForDialogRefresh, it would be better to refactor it into a more robust helper method that searches for the Table by type or property to improve maintainability.
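
One way to read the suggestion above is a depth-first search that returns the first node of a requested type. The sketch below is an assumption about what such a helper could look like, not code from the PR: `WidgetSearch`, `findFirst`, and the stand-in `Node`, `Label`, and `Table` classes are hypothetical and deliberately independent of SWT so the example runs on its own.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class WidgetSearch {
    /**
     * Depth-first search for the first node that is an instance of type.
     * The children function tells the search how to descend, so the helper
     * does not depend on any particular widget toolkit.
     */
    static <N, T extends N> Optional<T> findFirst(N root, Class<T> type,
            Function<? super N, ? extends List<? extends N>> children) {
        if (type.isInstance(root)) {
            return Optional.of(type.cast(root));
        }
        for (N child : children.apply(root)) {
            Optional<T> hit = findFirst(child, type, children);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    // Hypothetical stand-ins for a widget hierarchy (not SWT classes).
    static class Node {
        final List<Node> kids;

        Node(Node... kids) {
            this.kids = List.of(kids);
        }
    }

    static class Label extends Node { }

    static class Table extends Node { }

    public static void main(String[] args) {
        Node shell = new Node(new Node(new Label(), new Node(new Label(), new Table())));
        Optional<Table> table = findFirst(shell, Table.class, n -> n.kids);
        System.out.println("table found: " + table.isPresent());
    }
}
```

With SWT, the `children` function would presumably return the result of `Composite.getChildren()` as a list for composites and an empty list for every other control. Both the diagnostic helper and waitForDialogRefresh could then share one lookup that survives layout changes.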

@vogella vogella force-pushed the investigate/issue-294-resource-initial-selection branch 8 times, most recently from dc845ea to 2ba3342 Compare April 17, 2026 09:02
FilteredItemsSelectionDialog's background pipeline has three jobs that
read and write the same ContentProvider state: FilterHistoryJob mutates
it via contentProvider.reset() and contentProvider.addHistoryItems(),
FilterJob populates it via contentProvider.add() (inside
fillContentProvider), and RefreshCacheJob reads it in
contentProvider.reloadCache(). Without a shared scheduling rule they can
run concurrently on different worker threads.

Under the right timing, FilterHistoryJob's contentProvider.reset() call
clears the items set on one worker while FilterJob is iterating
members() and adding items on another, wiping some or all of the
populated items before the table is rendered. This manifested as the
long-standing ResourceInitialSelectionTest flake (issue
eclipse-platform#294): sometimes the table was empty, sometimes a random
subset of items survived in the wrong order.

The race is reliably triggered by the ModifyListener on the pattern Text
firing applyFilter() twice in quick succession during createDialogArea
on slow runners, which schedules two FilterHistoryJobs that overlap with
the FilterJob.

Fix: share a per-dialog ISchedulingRule across filterHistoryJob,
filterJob and refreshCacheJob. They now serialize; the race cannot
occur. Separate dialogs still run in parallel.
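
The effect of the shared rule can be illustrated without the Eclipse Jobs API. The sketch below is a plain-Java analogy, not the patch itself: a per-dialog `ReentrantLock` stands in for the ISchedulingRule, and `SharedRuleSketch`, `underRule`, and `runPipeline` are hypothetical names. Each simulated applyFilter() call runs a history job (reset) followed by a filter job (populate), and two such calls run on different worker threads.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

final class SharedRuleSketch {
    /** Wraps a job body so it only runs while holding the per-dialog lock. */
    static Runnable underRule(ReentrantLock rule, Runnable body) {
        return () -> {
            rule.lock();
            try {
                body.run();
            } finally {
                rule.unlock();
            }
        };
    }

    /** Runs two overlapping reset-then-populate sequences; returns the final item count. */
    static int runPipeline(int members) {
        Set<String> items = new LinkedHashSet<>();   // stand-in for the content provider state
        ReentrantLock rule = new ReentrantLock();    // one lock per dialog, shared by all its jobs

        Runnable historyJob = underRule(rule, items::clear);
        Runnable filterJob = underRule(rule, () -> {
            for (int i = 0; i < members; i++) {
                items.add("member-" + i);
            }
        });

        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int call = 0; call < 2; call++) {
                // Each simulated applyFilter(): history job first, then the filter job.
                pending.add(workers.submit(() -> {
                    historyJob.run();
                    filterJob.run();
                }));
            }
            for (Future<?> f : pending) {
                f.get();
            }
            return items.size();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("items after pipeline: " + runPipeline(500));
    }
}
```

Because a reset can no longer land in the middle of a populate pass, and the last job to run is always a populate pass, the final set is complete in this model. Removing the lock would let a reset interleave with a populate pass on an unsynchronized set, the same shape of failure as the empty or partial table described above.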

Also expose JOB_FAMILY (@noreference) and tag all four pipeline jobs
with it, so tests can deterministically wait for the pipeline via
Job.getJobManager().find(JOB_FAMILY). The item-count stability probe in
ResourceInitialSelectionTest.waitForDialogRefresh() is replaced with
that family-based wait - this also gets rid of the unreliable 5 s
timeout that was the visible symptom.
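
To show why a family-based wait is more deterministic than watching the item count, here is a plain-Java sketch under stated assumptions. It does not use the Eclipse job manager: `JobFamilySketch`, `schedule`, `find`, and `waitForFamily` are hypothetical stand-ins, and `find` here returns a count where the real `find(JOB_FAMILY)` returns the matching jobs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class JobFamilySketch {
    private final Map<Object, AtomicInteger> active = new ConcurrentHashMap<>();
    private final ExecutorService workers = Executors.newCachedThreadPool();

    /** Schedules a job tagged with a family; it counts as active from this call until it ends. */
    void schedule(Object family, Runnable body) {
        active.computeIfAbsent(family, f -> new AtomicInteger()).incrementAndGet();
        workers.execute(() -> {
            try {
                body.run();
            } finally {
                active.get(family).decrementAndGet();
            }
        });
    }

    /** Number of scheduled or running jobs in the family. */
    int find(Object family) {
        AtomicInteger count = active.get(family);
        return count == null ? 0 : count.get();
    }

    /** Waits until no job of the family is left; returns false if the timeout expires first. */
    boolean waitForFamily(Object family, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (find(family) > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            pause(5);
        }
        return true;
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void shutdown() {
        workers.shutdown();
    }

    public static void main(String[] args) {
        Object family = new Object();
        JobFamilySketch jobs = new JobFamilySketch();
        // The first stage schedules its follow-up before it finishes,
        // so the family never looks idle between stages.
        jobs.schedule(family, () -> {
            pause(50);
            jobs.schedule(family, () -> pause(50));
        });
        System.out.println("pipeline finished: " + jobs.waitForFamily(family, 5_000));
        jobs.shutdown();
    }
}
```

The wait ends when the work ends, not when a counter happens to stop moving, so the timeout becomes a safety net and no longer decides the outcome of the test.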

Fixes eclipse-platform#294
@vogella vogella force-pushed the investigate/issue-294-resource-initial-selection branch from 2ba3342 to c600a68 Compare April 17, 2026 11:50