Skip to content

JavaProject.computeExpandedClasspath() recursively expands without memoization #4928

@laeubi

Description

@laeubi

During analysis of two heapdums shared here the following issue was discovered as a hotspot that can benefit from optimization:

Performance Data

Metric WITH transitive WITHOUT transitive Ratio
Self time (µs) 6,701,978 678,778 9.9×

Description

computeExpandedClasspath() recursively walks the project dependency graph, expanding all CPE_PROJECT entries by calling getResolvedClasspath() and then recursing into each referenced project. With transitive dependencies, this traversal becomes significantly deeper and wider:

private void computeExpandedClasspath(
    ClasspathEntry referringEntry, HashMap<String, Boolean> rootIDs,
    ArrayList<ClasspathEntry> accumulatedEntries, boolean excludeTestCode) throws JavaModelException {

    IClasspathEntry[] resolvedClasspath = getResolvedClasspath(); // expensive
    for (IClasspathEntry cpe : resolvedClasspath) {
        ClasspathEntry entry = (ClasspathEntry) cpe;
        if (excludeTestCode && entry.isTest()) continue;  // calls isTest() on every entry
        if (isInitialProject || entry.isExported()) {
            if (entry.getEntryKind() == IClasspathEntry.CPE_PROJECT) {
                // ... recurse into required project
                javaProject.computeExpandedClasspath(combinedEntry, rootIDs, accumulatedEntries, ...);
            }
        }
    }
}

The recursive expansion calls getResolvedClasspath() for the same project multiple times if it is a transitive dependency of multiple projects. The rootIDs HashMap prevents duplicate entries but not duplicate traversal — a project is only skipped if its root ID was already seen, but the traversal and getResolvedClasspath() call happen before the check for CPE_PROJECT entries.

Additionally, there is an O(n²) linear search when updating existing entries (lines 584–592):

for (int j = 0; j < accumulatedEntries.size(); j++) {
    ClasspathEntry oldEntry = accumulatedEntries.get(j);
    if (oldEntry.rootID().equals(rootID)) {  // linear scan!
        accumulatedEntries.set(j, oldEntry.withExtraAttributeRemoved(...));
        break;
    }
}

With 100+ transitive classpath entries, this linear search is called frequently, resulting in thousands of iterations.

Suggested Fix

  1. Memoize expanded classpath per project: Cache the expanded classpath result per (project, excludeTestCode) pair so that re-expanding the same project is O(1).
  2. Replace ArrayList linear search with a HashMap index: The linear scan at lines 584–592 should use a HashMap<String, Integer> mapping rootID → index in accumulatedEntries, turning O(n) lookups into O(1).
  3. Move the rootIDs.containsKey() check before getResolvedClasspath(): For non-project entries, the containsKey check already short-circuits. But for project entries, the resolved classpath is obtained before recursing, even if the project will be skipped.
  4. Consider a topological sort of project dependencies and expanding bottom-up to avoid redundant work.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions