During analysis of two heap dumps shared here, the following hotspot was identified as a candidate for optimization:
Performance Data
| Metric | WITH transitive | WITHOUT transitive | Ratio |
| --- | --- | --- | --- |
| Self time (µs) | 6,701,978 | 678,778 | 9.9× |
Description
computeExpandedClasspath() recursively walks the project dependency graph, expanding all CPE_PROJECT entries by calling getResolvedClasspath() and then recursing into each referenced project. With transitive dependencies, this traversal becomes significantly deeper and wider:
private void computeExpandedClasspath(
        ClasspathEntry referringEntry, HashMap<String, Boolean> rootIDs,
        ArrayList<ClasspathEntry> accumulatedEntries, boolean excludeTestCode) throws JavaModelException {
    IClasspathEntry[] resolvedClasspath = getResolvedClasspath(); // expensive
    for (IClasspathEntry cpe : resolvedClasspath) {
        ClasspathEntry entry = (ClasspathEntry) cpe;
        if (excludeTestCode && entry.isTest()) continue; // calls isTest() on every entry
        if (isInitialProject || entry.isExported()) {
            if (entry.getEntryKind() == IClasspathEntry.CPE_PROJECT) {
                // ... recurse into required project
                javaProject.computeExpandedClasspath(combinedEntry, rootIDs, accumulatedEntries, ...);
            }
        }
    }
}
The recursive expansion calls getResolvedClasspath() for the same project multiple times if it is a transitive dependency of multiple projects. The rootIDs HashMap prevents duplicate entries but not duplicate traversal — a project is only skipped if its root ID was already seen, but the traversal and getResolvedClasspath() call happen before the check for CPE_PROJECT entries.
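To make the duplicate traversal concrete, here is a minimal toy model, not JDT code. It assumes a hypothetical diamond-shaped project graph (A requires B and C, both require D, D requires E) and counts how often each project is "resolved" with and without a visited-set guard. All class, method, and project names are invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of the traversal; every name here is hypothetical. */
public class DiamondTraversalDemo {
    // Hypothetical project graph: A -> B, A -> C, B -> D, C -> D, D -> E.
    public static final Map<String, List<String>> DEPS = Map.of(
        "A", List.of("B", "C"),
        "B", List.of("D"),
        "C", List.of("D"),
        "D", List.of("E"),
        "E", List.of());

    // Counts how often each project's classpath is "resolved".
    public static final Map<String, Integer> resolveCalls = new HashMap<>();

    /** Stand-in for the expensive getResolvedClasspath() call. */
    static List<String> resolve(String project) {
        resolveCalls.merge(project, 1, Integer::sum);
        return DEPS.get(project);
    }

    /** Resolves first, then recurses into every dependency; no traversal guard. */
    public static void expandNaive(String project) {
        for (String dep : resolve(project)) {
            expandNaive(dep);
        }
    }

    /** Checks a visited set before resolving, so each project is resolved once. */
    public static void expandGuarded(String project, Set<String> visited) {
        if (!visited.add(project)) {
            return; // already expanded via another path
        }
        for (String dep : resolve(project)) {
            expandGuarded(dep, visited);
        }
    }

    public static void main(String[] args) {
        expandNaive("A");
        System.out.println("naive:   " + new TreeMap<>(resolveCalls));
        // naive:   {A=1, B=1, C=1, D=2, E=2}
        resolveCalls.clear();
        expandGuarded("A", new HashSet<>());
        System.out.println("guarded: " + new TreeMap<>(resolveCalls));
        // guarded: {A=1, B=1, C=1, D=1, E=1}
    }
}
```

In the toy graph the shared project D and everything below it are resolved twice without the guard; with more layers of sharing the repetition multiplies. Whether a plain visited set is safe in the real method is an open question, since the entry combined from the referring entry may differ per path.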
Additionally, updating an existing entry uses a linear search over accumulatedEntries, which is O(n) per update and O(n²) in the worst case overall (lines 584–592):
for (int j = 0; j < accumulatedEntries.size(); j++) {
    ClasspathEntry oldEntry = accumulatedEntries.get(j);
    if (oldEntry.rootID().equals(rootID)) { // linear scan!
        accumulatedEntries.set(j, oldEntry.withExtraAttributeRemoved(...));
        break;
    }
}
With 100+ transitive classpath entries, this linear search is called frequently, resulting in thousands of iterations.
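One way to avoid the scan is to keep a rootID → index map next to the list. The following is a minimal sketch under stated assumptions, not the JDT implementation: Entry is a hypothetical stand-in for ClasspathEntry, and the wrapper class and its method names are invented. It also assumes entries are only appended or replaced in place, never removed or reordered, since otherwise the stored indices would go stale.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy accumulator; Entry is a hypothetical stand-in for ClasspathEntry. */
public class IndexedEntries {
    public static final class Entry {
        public final String rootID;
        public final boolean hasExtraAttribute;

        public Entry(String rootID, boolean hasExtraAttribute) {
            this.rootID = rootID;
            this.hasExtraAttribute = hasExtraAttribute;
        }

        public Entry withExtraAttributeRemoved() {
            return new Entry(rootID, false);
        }
    }

    private final List<Entry> accumulatedEntries = new ArrayList<>();
    // rootID -> position of that entry in accumulatedEntries
    private final Map<String, Integer> indexByRootID = new HashMap<>();

    /** Appends the entry unless its rootID is already present. */
    public boolean add(Entry entry) {
        if (indexByRootID.containsKey(entry.rootID)) {
            return false;
        }
        indexByRootID.put(entry.rootID, accumulatedEntries.size());
        accumulatedEntries.add(entry);
        return true;
    }

    /** Replaces the entry for rootID in place via a hash lookup instead of a scan. */
    public boolean removeExtraAttribute(String rootID) {
        Integer index = indexByRootID.get(rootID);
        if (index == null) {
            return false;
        }
        int j = index;
        accumulatedEntries.set(j, accumulatedEntries.get(j).withExtraAttributeRemoved());
        return true;
    }

    public Entry get(int index) {
        return accumulatedEntries.get(index);
    }

    public int size() {
        return accumulatedEntries.size();
    }

    public static void main(String[] args) {
        IndexedEntries acc = new IndexedEntries();
        acc.add(new Entry("/p/a.jar", true));
        acc.add(new Entry("/p/b.jar", true));
        acc.removeExtraAttribute("/p/b.jar");
        System.out.println(acc.get(1).hasExtraAttribute); // false
    }
}
```

Because the replacement keeps the same position in the list, the index map does not need to be touched on update, and classpath order is preserved.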
Suggested Fix
- Memoize expanded classpath per project: Cache the expanded classpath result per (project, excludeTestCode) pair so that re-expanding the same project is O(1).
- Replace ArrayList linear search with a HashMap index: The linear scan at lines 584–592 should use a HashMap<String, Integer> mapping rootID → index in accumulatedEntries, turning O(n) lookups into O(1).
- Move the rootIDs.containsKey() check before getResolvedClasspath(): For non-project entries, the containsKey check already short-circuits. But for project entries, the resolved classpath is obtained before recursing, even if the project will be skipped.
- Consider a topological sort of project dependencies and expanding bottom-up to avoid redundant work.
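The memoization idea from the first bullet could look roughly like the sketch below. It is a toy model, not JDT code: projects are plain strings, Dep is a hypothetical stand-in for a CPE_PROJECT entry with a test flag, and the graph is assumed to be acyclic. A real implementation would also need to invalidate the cache on classpath changes and account for exported flags and combined access rules, which are ignored here.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

/** Toy memoized expansion keyed by (project, excludeTestCode); names are hypothetical. */
public class MemoizedExpansion {
    /** Hypothetical stand-in for a CPE_PROJECT entry. */
    public static final class Dep {
        public final String project;
        public final boolean test;

        public Dep(String project, boolean test) {
            this.project = project;
            this.test = test;
        }
    }

    private final Map<String, List<Dep>> requiredProjects;
    private final Map<String, List<String>> cache = new HashMap<>();
    /** Number of cache misses, i.e. simulated getResolvedClasspath() calls. */
    public int resolveCalls = 0;

    public MemoizedExpansion(Map<String, List<Dep>> requiredProjects) {
        this.requiredProjects = requiredProjects;
    }

    /** Returns the transitive required projects in first-seen order. Assumes no cycles. */
    public List<String> expand(String project, boolean excludeTestCode) {
        String key = project + "|" + excludeTestCode;
        List<String> cached = cache.get(key);
        if (cached != null) {
            return cached; // re-expanding a shared project is a single lookup
        }
        resolveCalls++;
        LinkedHashSet<String> result = new LinkedHashSet<>();
        for (Dep dep : requiredProjects.getOrDefault(project, List.of())) {
            if (excludeTestCode && dep.test) {
                continue;
            }
            result.add(dep.project);
            result.addAll(expand(dep.project, excludeTestCode));
        }
        List<String> expanded = List.copyOf(result);
        cache.put(key, expanded);
        return expanded;
    }

    public static void main(String[] args) {
        Map<String, List<Dep>> graph = Map.of(
            "A", List.of(new Dep("B", false), new Dep("C", false)),
            "B", List.of(new Dep("D", false)),
            "C", List.of(new Dep("D", false)),
            "D", List.of(new Dep("E", true)));
        MemoizedExpansion m = new MemoizedExpansion(graph);
        System.out.println(m.expand("A", false)); // [B, D, E, C]
        System.out.println(m.resolveCalls);       // 5
    }
}
```

The recursion completes and caches the deepest projects first, so shared dependencies are expanded once and reused, which gives much of the benefit of an explicit bottom-up topological pass without building the ordering separately.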