During analysis of two heapdumps shared here the following issue was discovered as a hotspot that can benefit from optimization:
Performance Data
| Metric |
WITH transitive |
WITHOUT transitive |
Ratio |
| ClasspathLocation.getModulesDeclaringPackage() (µs) |
7,443,858 |
1,671,880 |
4.5× |
| ClasspathLocation.findClass() (µs) |
4,129,889 |
1,639,802 |
2.5× |
Description
getModulesDeclaringPackage() iterates over all binary locations and source locations when the lookup strategy is Any or Unnamed:
public char[][] getModulesDeclaringPackage(char[][] packageName, char[] moduleName) {
// ...
case Any:
case Unnamed:
char[][] names = CharOperation.NO_CHAR_CHAR;
for (ClasspathLocation location : this.binaryLocations) { // linear scan
char[][] declaringModules = location.getModulesDeclaringPackage(pkgName, null);
if (declaringModules != null)
names = CharOperation.arrayConcat(names, declaringModules); // array copy each time!
}
for (ClasspathLocation location : this.sourceLocations) { // another linear scan
// same pattern
}
With transitive dependencies, binaryLocations is significantly larger, and this method is called for every package reference during compilation. The CharOperation.arrayConcat() call on every match creates a new array each time, leading to O(n²) allocation behavior when many locations declare the same package.
Similarly, findClass() iterates through all binaryLocations linearly until a match is found:
for (ClasspathLocation classpathLocation : relevantLocations) {
NameEnvironmentAnswer answer = classpathLocation.findClass(...);
// ...
}
Suggested Fix
- Index binary locations by package name: Build a
Map<String, List<ClasspathLocation>> so that getModulesDeclaringPackage() only checks locations known to contain a given package, rather than iterating all locations.
- Use ArrayList +
toArray() instead of repeated arrayConcat(): The current approach of appending to char[][] via arrayConcat() is O(n²). Collect results in a List first.
- Cache package-to-module mappings: Since the classpath doesn't change during a build, cache the result of
getModulesDeclaringPackage() for each (packageName, moduleName) pair.
During analysis of two heapdumps shared here the following issue was discovered as a hotspot that can benefit from optimization:
Performance Data
Description
getModulesDeclaringPackage()iterates over all binary locations and source locations when the lookup strategy isAnyorUnnamed:With transitive dependencies,
binaryLocationsis significantly larger, and this method is called for every package reference during compilation. TheCharOperation.arrayConcat()call on every match creates a new array each time, leading to O(n²) allocation behavior when many locations declare the same package.Similarly,
findClass()iterates through allbinaryLocationslinearly until a match is found:Suggested Fix
Map<String, List<ClasspathLocation>>so thatgetModulesDeclaringPackage()only checks locations known to contain a given package, rather than iterating all locations.toArray()instead of repeatedarrayConcat(): The current approach of appending tochar[][]viaarrayConcat()is O(n²). Collect results in aListfirst.getModulesDeclaringPackage()for each(packageName, moduleName)pair.