Implement proper surrogate pair handling in JavaWordFinder by vogella · Pull Request #2922 · eclipse-jdt/eclipse.jdt.ui

vogella · 2026-04-02T08:44:59Z

PR #2977 removed the stale ICU comments from JavaWordFinder and replaced them with a simple surrogate skip (!Character.isSurrogate(c)). That avoids the old placeholder comments but still does not correctly handle supplementary Unicode characters in Java identifiers.

This PR replaces that shortcut with proper code-point logic: when a surrogate char is encountered during a word scan, the adjacent char is read to form the full code point, which is then tested with Character.isJavaIdentifierPart(int). Identifiers containing supplementary Unicode characters are now included in the word region correctly, and unpaired or non-identifier surrogates still terminate the scan.
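For reviewers, here is a minimal standalone sketch of the code-point scan described above. It is not the code from this PR: the real JavaWordFinder works on an IDocument, whereas this sketch assumes a plain CharSequence, and the class name CodePointWordScan and method findWord are hypothetical. Only the java.lang.Character calls are real API.

```java
// Hypothetical illustration only; not the JavaWordFinder implementation.
final class CodePointWordScan {

    /**
     * Returns {start, end} (end exclusive) of the identifier-like word
     * around the given offset. Assumes 0 <= offset <= text.length().
     */
    static int[] findWord(CharSequence text, int offset) {
        // If the offset falls between the two halves of a surrogate pair,
        // move it to the start of the pair so the pair is read as a whole.
        if (offset > 0 && offset < text.length()
                && Character.isLowSurrogate(text.charAt(offset))
                && Character.isHighSurrogate(text.charAt(offset - 1))) {
            offset--;
        }

        // Backward scan: codePointBefore joins a low surrogate with the
        // high surrogate in front of it; an unpaired surrogate is returned
        // as-is and fails the identifier test, which ends the scan.
        int start = offset;
        while (start > 0) {
            int cp = Character.codePointBefore(text, start);
            if (!Character.isJavaIdentifierPart(cp)) {
                break;
            }
            start -= Character.charCount(cp);
        }

        // Forward scan: codePointAt joins a high surrogate with the low
        // surrogate that follows it.
        int end = offset;
        while (end < text.length()) {
            int cp = Character.codePointAt(text, end);
            if (!Character.isJavaIdentifierPart(cp)) {
                break;
            }
            end += Character.charCount(cp);
        }
        return new int[] { start, end };
    }
}
```

Stepping by Character.charCount(cp) keeps both scans aligned on code-point boundaries, so a supplementary character is either included whole or excluded whole.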

Replace the simplistic surrogate skip (from eclipse-jdt#2977) with full code-point checking: when a surrogate char is encountered, form the code point from the pair and test it with Character.isJavaIdentifierPart(int), so identifiers containing supplementary Unicode characters are correctly included in the word region.

vogella force-pushed the fix-java-word-finder branch from 4eab49f to b6d5ad7 on May 14, 2026, 04:14

vogella changed the title from ~~Replace ICU-related comments with standard Java code in JavaWordFinder~~ to Implement proper surrogate pair handling in JavaWordFinder on May 14, 2026

vogella force-pushed the fix-java-word-finder branch from b6d5ad7 to 143821b on May 14, 2026, 04:18


vogella wants to merge 1 commit into eclipse-jdt:master from vogella:fix-java-word-finder
