Overhaul of how type inference handles capture bounds#4735

Merged
stephan-herrmann merged 6 commits into eclipse-jdt:master from stephan-herrmann:issue4635
Jan 17, 2026

@stephan-herrmann
Contributor

The PR brings some parts of our implementation closer to JLS, while adding extra-constitutional tweaks to compensate for regressions caused by the former set of changes.

Closer to JLS:

  1. Stop clearing all capture bounds at the end of incorporate - we have long known that clearing was wrong, but we kept it to prevent a flood of regressions that we had no clue how to fix
  2. Rewrite dependency-computation - I recently noticed that our implementation in the vicinity of InferenceContext18.getSmallestVariableSet() is quite beside the point. This is a full re-implementation - currently ignoring performance aspects until we have measurements of how bad things are.

NON-JLS

Actually I added a new task tag "NON-JLS" to help find all relevant code locations that are based on our own invention rather than JLS.

  1. Capture bounds are now maintained in 2 sets: captures and allCaptures, so that different code locations can leverage or ignore capture bounds created during nested inference.
  2. 2nd attempt of resolution will now consider same-bounds if substitution makes them proper.
  3. Our constant InferenceContext18.ARGUMENT_CONSTRAINTS_ARE_SOFT encoded the advice that only specific expression constraints should be "soft" (which is a NON-JLS concept in its own right). I identified a test case which requires that all expression constraints are "soft".
  4. Several regressions could be fixed by a more sophisticated computation of dependencies: given α :> β, do not consider α to depend on β (only the reverse dependency from β to α is considered in that mode). If, however, this tweak would make resolve() fail, then fall back to the original version where α :> β is seen as a bidirectional dependency.

Finally I had to include the fix from #4565 to fix the remaining regressions from the initial changes.

Fixes #4635

Phase 1: Stop to clear all capture bounds at end of incorporate
+ remove only those bounds that have produced a new TypeBound
+ incl. new failing test

Result of TestAll at compliance 25: 2 failures

Fixes eclipse-jdt#4635
Phase 2: Rewrite dependency-computation
+ new method IC18.collectDependencies() creates a map iv -> iv*
  - use that map also in pickFromCycle
  - re-implement dependsOn based on that map,
    this replaces BoundSet.dependsOnResolutionOf() et al
  - obsoletes also ThreeSets.inverseBounds
+ shrink set of vars to resolve during IC18.resolve()
+ maintain separate sets captures / allCaptures
  - IC18.resumeSuspendedInference resets captures but not allCaptures
+ avoid CaptureBinding18.upperBounds == null
+ fully enable the new test

Result of TestAll at compliance 25: 39 failures

Fixes eclipse-jdt#4635
Compensate regressions with NON-JLS improvement #1:
+ consider existing same bounds if substitution makes them proper

Result of TestAll at compliance 25: 11 failures

Fixes eclipse-jdt#4635
Compensate regressions with NON-JLS improvement #2:
+ consider *all* expression constraints as soft

Result of TestAll at compliance 25: 10 failures

Fixes eclipse-jdt#4635
Compensate regressions with NON-JLS improvement #3:
+ try to resolve ignoring super-dependencies
  - fall back to considering those, too, if resolve would then fail
+ adjust 2 expected results in NullTypeAnnotationTest

Result of TestAll at compliance 25: 1 failure

Fixes eclipse-jdt#4635
Adopt test & fix from PR 4565

Also-by: coehlrich <coehlrich@users.noreply.github.com>
@stephan-herrmann
Contributor Author

stephan-herrmann commented Jan 11, 2026

@srikanth-sankaran @jarthana @coehlrich this PR marks a significant change to type inference, so I'd like to invite you to comment on this multi stage saga:

two steps forward ...

I identified two aspects where ECJ can and should be brought closer to JLS:

computing dependencies between inference variables

Initial investigation of #4635 indicated that dependency computation for 18.4. needed a fix. I realized that the helper method BoundSet.dependsOnResolutionOf() had some significant logic errors - the method wasn't doing nearly what JLS requires. So I made a fresh start, resulting in IC18.collectDependencies(), which is as close to JLS as I could get, with some clean-up in that area.

As the improved implementation still didn't help for #4635 I realized that it was working on wrong input: at this stage of type inference BoundSet.captures was always empty, because ...

capture bounds

For twelve years I was hesitant about the following line at the end of BoundSet.incorporate():

this.captures.clear();

(see 18.1.3. for the definition of capture bounds).
I believe it was a simple accident that I ever wrote this line, but soon after I discovered that it protected us against a flood of impending regressions. Now I finally dared to remove this wrong clear().

Up to this point my changes should have brought us closer to JLS, but they took us a long way from what we should be doing according to javac :(

Then I braced myself to address all those "regressions", whatever the cost.

... and many steps backwards

All subsequent changes apply extra-constitutional tweaks in the hope that my tweaks might be morally similar to the many such tweaks which javac is reportedly applying.

  • commit 2 duplicates BoundSet.captures to BoundSet.allCaptures, so that individual code locations can opportunistically decide whether capture bounds from inner inference should be considered or not (captures is restored to previous state in IC18.resumeSuspendedInference() while allCaptures is not).
  • commit 3 adds to the algo of the "second attempt" in 18.4: JLS tells us to leverage upper and lower bounds of a given ivar, but doesn't mention same-bounds. With help of the given substitution these same-bounds may essentially amount to instantiations, giving the most precise result we could wish for.
  • commit 4 reacts to one test case that requires ConstraintTypeFormula.addConstraintsFromTypeParameters() to (illegally) answer true when presented with a raw type and a parameterized type. Back in the days I was informed that javac considers raw types as compatible in some situations, but as the given test was outside the blessed condition, I had to enable this tweak for all constraints.
  • commit 5 is my own answer to unanswered post https://mail.openjdk.org/pipermail/compiler-dev/2026-January/032467.html : given ivars α :> β, does this constitute a bi-directional dependency or is only β dependent on α? Some cases require the uni-directional interpretation, but others need both directions. So I decided to try both interpretations. If unidirectional dependency causes a resolution failure, try bidirectional instead.
  • commit 6 finally adopts the fix from #4565 ("Fix 3 layer nested method calls with the return type includes a wildcard")

In the end even captures.clear() wasn't removed entirely but replaced opportunistically with selective removal of bounds that had been operated on already.

risks?

In a way I was surprised myself that this series of tweaks gradually fixed all the intermediate regressions.

In light of the size and significance of the change I was going to ask someone for a close code review, but I'm afraid this would keep the reviewer busy for a long time. More time than we have available. Actually I'm not sure if a clear yes or no can possibly be given to this PR. The status quo, OTOH, has known defects. And a third solution is not in sight.

My feeling is: as long as ECJ has those obvious bugs wrt JLS, we cannot expect help from Oracle with any difficulties we are facing in bug reports in this area. Perhaps we should do both: first do exactly as JLS mandates, and then on top apply any opportunistic tweaks that help accept more programs that javac also accepts (based just on common sense reasoning plus experiments).

To mitigate the inherent risk in this, I'd like to ask if people could help STRESS TESTING this change. I made a start by compiling all of the Eclipse SDK with no new errors reported. ✔️

I will add more details about individual tweaks and what I learned along the road, if only as a way of self-review 😄

@stephan-herrmann
Contributor Author

relevance of ivar dependencies

Why is it crucial that resolution (18.4) works with exact dependencies between ivars?

Each ivar is put into a group with all other ivars on which it (transitively) depends. ivars in the same group are then resolved in one batch. If the first strategy in 18.4. fails for any ivar in that set, all ivars will be resolved using the "second attempt" where fresh type variables "Yi" are assigned (our code uses the symbol Z from previous versions of JLS).

If, e.g., an ivar α :> String is resolved in isolation we can easily instantiate it to String. Other ivars being resolved in subsequent batches can then already leverage this instantiation for better results.

If, however, any ivar in the same group forces resolution into the "second attempt", then the upper bound will be preserved as an upper bound, resulting in a type variable Y which is not compatible with String. In this case other ivars depending on α will not be able to leverage the String bound in a useful way as we saw in the single ivar case above.

So, resolving ivars in smaller groups may give better results, provided that the sharp result for one ivar will not cause subsequent resolution to fail.
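The grouping described above can be sketched roughly as follows. This is only an illustration, not the code of IC18.collectDependencies(): the class DependencyGroups, its methods, and the use of plain strings for ivars are all hypothetical, and the sketch assumes that every ivar appears as a key of the input map. It computes the transitive dependencies of each ivar and then picks the smallest group of unresolved ivars whose dependencies are all either in the group or already resolved.

```java
import java.util.*;

class DependencyGroups {
    /** For each variable: itself plus everything it transitively depends on. */
    static Map<String, Set<String>> closure(Map<String, Set<String>> direct) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String v : direct.keySet()) {
            Set<String> seen = new LinkedHashSet<>();
            Deque<String> todo = new ArrayDeque<>(List.of(v));
            while (!todo.isEmpty()) {
                String cur = todo.pop();
                if (seen.add(cur)) // first visit: also follow its direct dependencies
                    todo.addAll(direct.getOrDefault(cur, Set.of()));
            }
            result.put(v, seen);
        }
        return result;
    }

    /** Smallest closed group of unresolved variables, or an empty set if none is left. */
    static Set<String> smallestGroup(Map<String, Set<String>> direct, Set<String> resolved) {
        Set<String> best = null;
        for (Map.Entry<String, Set<String>> e : closure(direct).entrySet()) {
            if (resolved.contains(e.getKey()))
                continue;
            Set<String> group = new LinkedHashSet<>(e.getValue());
            group.removeAll(resolved); // already instantiated variables need no batch
            if (best == null || group.size() < best.size())
                best = group;
        }
        return best == null ? Set.of() : best;
    }
}
```

With direct dependencies a -> {}, b -> {a}, c -> {b, d}, d -> {c}, successive calls yield {a}, then {b}, then the cycle {c, d} as one batch, which is the situation where one "bad" member can drag the whole group into the "second attempt".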

While dependencies are well-defined in 18.4. we have these two weak spots:

  • this computation depends on capture bounds, where still funny things are happening
  • we have that question of bi-directional / uni-directional dependencies from α :> β

@stephan-herrmann
Contributor Author

bi-directional dependencies

In one test I observed

	R#2 :> T#3
	R#2 :> String
	T#4 :> String
	R#2 :> T#4

If we interpret R#2 as depending on T#3 and T#4, then these three variables need to be resolved in one batch, where the only thing we know about T#3 is that it is a subtype of some other ivar. Due to lack of proper bounds we instantiate T#3 to Object. At the same time we set R#2 to String, which will be incorporated, yielding the bound String :> Object, which is FALSE.

If OTOH, we consider R#2 as not significantly determined by any other ivar, we can resolve it in isolation, with String as its instantiation. Then the next round of resolution will already see a bound String :> T#3, so also T#3 is instantiated to String. ✔️

So considering R#2 as not depending on its super T#3 can be helpful. Can it hurt? Yes, I saw test cases where ignoring that super-dependency causes failure. But when this failure is detected, we can still fall back to the previous interpretation, i.e., we can simply choose which strategy gives better results.
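A minimal sketch of this "try one interpretation, fall back to the other" strategy might look like the following. All names here (SuperDependencies, SuperBound, dependencies, resolveWithFallback) are made up for illustration and do not exist in ECJ. Only bounds between two ivars are modeled, since bounds against proper types such as String create no dependencies, and the actual resolution step is passed in as a function that returns an empty Optional on failure.

```java
import java.util.*;
import java.util.function.Function;

class SuperDependencies {
    /** A bound "sup :> sub" between two inference variables. */
    record SuperBound(String sup, String sub) {}

    /** Builds the map ivar -> ivars it depends on, for one of the two interpretations. */
    static Map<String, Set<String>> dependencies(List<SuperBound> bounds, boolean bidirectional) {
        Map<String, Set<String>> deps = new TreeMap<>();
        for (SuperBound b : bounds) {
            deps.computeIfAbsent(b.sup(), k -> new TreeSet<>());
            // the subtype side always depends on the supertype side
            deps.computeIfAbsent(b.sub(), k -> new TreeSet<>()).add(b.sup());
            if (bidirectional) // only in this mode does sup also depend on sub
                deps.get(b.sup()).add(b.sub());
        }
        return deps;
    }

    /** First try without super-dependencies; if that fails, retry with both directions. */
    static <T> Optional<T> resolveWithFallback(List<SuperBound> bounds,
            Function<Map<String, Set<String>>, Optional<T>> resolver) {
        Optional<T> result = resolver.apply(dependencies(bounds, false));
        return result.isPresent() ? result : resolver.apply(dependencies(bounds, true));
    }
}
```

For the bounds R#2 :> T#3 and R#2 :> T#4, the uni-directional map leaves R#2 without dependencies, so it can be resolved in isolation, whereas the bi-directional map ties all three variables into one batch.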

I noticed that this tweak allows inference to give better results in particular in situations of inferring a standalone expression, as seen at flatMap() in this test:

import java.util.*;
import java.util.stream.*;
import static java.util.Arrays.asList;

public class C {
    static final List<Integer> DIGITS = Collections.unmodifiableList(asList(0,1,2,3,4,5,6,7,8,9));

    Collection<String> flatMapSolutions(final boolean b) {
        Collection<String> solutions =
           DIGITS.stream().flatMap( s -> {
                return b ? Stream.empty() : Stream.of("");
           }) .collect(Collectors.toList());
        return solutions;
    }
}

Here Stream.of("") hints at String as a possible solution, but in the unhappy case, inference can ignore this hint and succeed with an inferior result, which then makes the enclosing assignment fail type checking (collect() propagates the inferior inference result).

@stephan-herrmann
Contributor Author

stephan-herrmann commented Jan 11, 2026

same-bounds

The "second attempt" in 18.4 seems to ignore type bounds of a shape like α = List<β>. But doesn't such a bound trivially imply α <: List<β> (<: is reflexive)? Hence I imagine that failure to mention same-bounds in this section of JLS might be a simple oversight. Perhaps they even assumed that readers would implicitly regard a same-bound as a combination of both an upper and a lower bound?

Edit: As per a quick experiment: Replacing the previous fix regarding same-bounds with a solution where BoundSet.upperBounds() would include same-bounds in its answer, causes plenty of test failures -> abandoning that train of thought. The existing solution in commit 3 of this PR is superior.
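To illustrate the idea behind commit 3 (a same-bound that becomes proper under the given substitution can serve as an instantiation), here is a small sketch on a toy type model. Nothing in it is taken from ECJ: the class SameBounds, the records IVar and Named, and the method instantiationFromSameBound are assumptions made for this example only.

```java
import java.util.*;

class SameBounds {
    sealed interface Type permits IVar, Named {}
    /** An inference variable such as α or β. */
    record IVar(String name) implements Type {}
    /** A named type with type arguments, e.g. List<β> or String. */
    record Named(String name, List<Type> args) implements Type {}

    /** Replaces inference variables according to theta, leaving unmapped ones as is. */
    static Type substitute(Type t, Map<IVar, Type> theta) {
        if (t instanceof IVar v)
            return theta.getOrDefault(v, v);
        Named n = (Named) t;
        return new Named(n.name(), n.args().stream().map(a -> substitute(a, theta)).toList());
    }

    /** A type is proper if it mentions no inference variable. */
    static boolean isProper(Type t) {
        if (t instanceof IVar)
            return false;
        return ((Named) t).args().stream().allMatch(SameBounds::isProper);
    }

    /** Use the substituted same-bound as instantiation only if it has become proper. */
    static Optional<Type> instantiationFromSameBound(Type sameBound, Map<IVar, Type> theta) {
        Type substituted = substitute(sameBound, theta);
        return isProper(substituted) ? Optional.of(substituted) : Optional.empty();
    }
}
```

For a same-bound α = List<β> and a substitution that maps β to a type without inference variables, the sketch yields the substituted List type as a candidate instantiation for α; without a mapping for β it yields nothing and the bound is ignored as before.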

@stephan-herrmann
Contributor Author

soft constraints

It was one of the first admissions by Oracle that javac is wrong in assuming subtyping between a raw type and a parameterized type. If they haven't cared to fix this in 12 years, then why should we care about the extent of that bug?

@stephan-herrmann
Contributor Author

capture bounds

The interaction between inference and inner inference isn't sufficiently specified in JLS. In this area we are already dependent on advice from javac developers. So I see leeway in interpreting whether or not capture bounds from inner inference should be considered in outer inference. Hence keeping separate sets captures and allCaptures, where only the latter accepts capture bounds from inner, shouldn't cause bad feelings.

I'm a bit uneasy about the follow-up question of which code location should use which of these sets, but here I fully rely on "if it helps, it's good". As we are inside a blind spot of JLS, we cannot expect any help from JLS.
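The bookkeeping of the two sets can be pictured with a sketch like the one below. It is not the BoundSet implementation: the class CaptureSets, its methods, and the use of strings in place of capture bounds are hypothetical, and the only behavior taken from this PR is that resuming a suspended inference restores captures but leaves allCaptures untouched.

```java
import java.util.*;

class CaptureSets {
    /** Capture bounds visible to the current (outer) inference. */
    final Set<String> captures = new LinkedHashSet<>();
    /** Every capture bound ever recorded, including those from nested inference. */
    final Set<String> allCaptures = new LinkedHashSet<>();

    void addCapture(String bound) {
        captures.add(bound);
        allCaptures.add(bound);
    }

    /** Called before nested inference starts: remember the current captures. */
    Set<String> suspend() {
        return new LinkedHashSet<>(captures);
    }

    /** Called when the outer inference resumes: restore captures, keep allCaptures. */
    void resume(Set<String> snapshot) {
        captures.clear();
        captures.addAll(snapshot);
    }
}
```

After suspending, adding a capture bound during inner inference, and resuming, captures holds only the outer bound while allCaptures holds both, so each code location can pick the view it needs.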

@stephan-herrmann
Contributor Author

What are the chances that I can get some comments any time soon?

If brain capacity for this saga is low atm, perhaps I may ask @jarthana for some stress testing?

I'd like to merge this for M2, to give it more exposure sooner rather than later. We had some recent regressions in the wider area of type inference, and I'm hesitant to address any of them without first bringing this PR to conclusion.

@iloveeclipse
Member

> What are the chances that I can get some comments any time soon?

@coehlrich - you might be interested?

> I'd like to merge this for M2, to give it more exposure sooner rather than later.

If nobody answers, please.

stephan-herrmann merged commit a3826aa into eclipse-jdt:master on Jan 17, 2026
13 checks passed
@stephan-herrmann
Contributor Author

> > I'd like to merge this for M2, to give it more exposure sooner rather than later.
>
> If nobody answers, please.

Done.

Of course I'm still interested in comments.

@jarthana
Member

> If brain capacity for this saga is low atm, perhaps I may ask @jarthana for some stress testing?

Looks good.

@stephan-herrmann
Contributor Author

For posterity: this part, too, seems to be made unnecessary by the work in progress in #5004:

> commit 4 reacts to one test case that requires ConstraintTypeFormula.addConstraintsFromTypeParameters() to (illegally) answer true when presented with a raw type and a parameterized type. Back in the days I was informed that javac considers raw types as compatible in some situations, but as the given test was outside the blessed condition, I had to enable this tweak for all constraints.

stephan-herrmann added a commit to stephan-herrmann/eclipse.jdt.core that referenced this pull request May 19, 2026

Development

Successfully merging this pull request may close these issues.

Type mismatch with generic wildcard

3 participants