cmd/compile: large literal maps excessively contribute to binary size #71937
Comments
Maps have a hash seed that is chosen dynamically at startup (to avoid collision attacks), so the layout of a map changes from run to run; we cannot lay it out statically. If the large literal map has all constant entries, it should be loaded from a []struct{key;value} array. That's about as compact as we could get (without a decompression step). Can you post your example (or a shrunken version, say ~100 representative entries)? It would be interesting to see what case you're hitting above.
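As a rough illustration of the representation described above (a sketch, not actual compiler output; the package and variable names here are invented), the constant entries could live in one read-only array and be bulk-inserted by a small loop, instead of one block of generated code per entry:

```go
// Sketch only: statusText and statusEntries are hypothetical names, and a
// real compiler would emit this shape internally rather than as source.
package maplayout

var statusText = map[int]string{}

// The entries sit in a single static array, roughly the
// []struct{key; value} form mentioned in the comment above.
var statusEntries = [...]struct {
	key   int
	value string
}{
	{200, "OK"},
	{404, "Not Found"},
	{500, "Internal Server Error"},
}

// A short loop inserts them at startup, after the runtime has picked the
// map's hash seed, so no static map layout is needed.
func init() {
	for _, e := range statusEntries {
		statusText[e.key] = e.value
	}
}
```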
Even large arrays can contribute to large binary sizes. For example, this 104-entry table in the edge package adds 120 bytes of ARM code per line: 5KB of source generates 13KB of code. The total information content of each row is a triple (index, type argument, string). The indices are worth zero bits, because they are sequential; the type argument is a choice among 55 values, and the string is a choice among 47 values, so about 12 bits in total. If it weren't important for each line to have a unique breakpoint, the compiler could reduce the entire thing to a loop over a compact table of about 200 bytes with two side tables of 55 reflect.Types and 47 strings respectively, for a total of 1.8KB.
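A hedged sketch of the compact encoding suggested in that comment; the contents, counts, and names (types, names, rows, expand) are illustrative rather than taken from the real edge package:

```go
package edgesketch

import "reflect"

// Hypothetical side tables: one entry per distinct type argument and per
// distinct string (55 and 47 in the real table; truncated here).
var types = []reflect.Type{
	reflect.TypeOf(0),
	reflect.TypeOf(""),
	// ...
}

var names = []string{
	"alpha",
	"beta",
	// ...
}

// Each row stores only two small indices; the row's own index is implicit
// in its position, so it costs nothing.
var rows = [...]struct {
	typeIdx uint8
	nameIdx uint8
}{
	{0, 0},
	{1, 1},
	// ... ~104 rows in the real table
}

type entry struct {
	typ  reflect.Type
	name string
}

// expand rebuilds the full table with one loop instead of one block of
// generated code per row.
func expand() []entry {
	out := make([]entry, len(rows))
	for i, r := range rows {
		out[i] = entry{typ: types[r.typeIdx], name: names[r.nameIdx]}
	}
	return out
}
```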
Yes, if each entry (in a map, or a slice) involves a function call to get the entry's contents, then it is going to be a lot of generated code.
Yeah, I meant a fantasy compiler could implement a loop "un-unrolling" optimization when it sees a sequence of statements each of the form …
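The two cases from this exchange, sketched side by side (the map contents are invented for illustration): entries whose values require a function call force per-entry code, while all-constant entries could in principle be collapsed into a loop over a table.

```go
package unroll

import "strings"

// Each value here requires a call at run time, so the compiler has no
// choice but to emit a call and a store per entry.
var dynamic = map[string]string{
	"a": strings.ToUpper("alpha"),
	"b": strings.ToUpper("beta"),
}

// All keys and values here are constants, so the entries could in
// principle be folded into a static table and inserted by one loop,
// as in the earlier sketch.
var constant = map[string]int{
	"alpha": 1,
	"beta":  2,
}
```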
Go version

go version go1.24.0 linux/amd64

Output of go env in your module/workspace:

What did you do?
Compile a program whose dependencies contain large literal maps (global or not).
What did you see happen?
A literal map with ~2000 entries inflated the final binary size by ~525 kB.
What did you expect to see?
Assuming that a gob-encoded version of the map is an efficient representation of its contents, the binary should inflate by approximately that much. The gob-encoded version of that map was 52 kB, about a tenth of the actual increase.
What is the reason a map cannot be stored literally in the final binary, instead of initializing each value separately and causing a huge (implicit) init() function? I'm sure there's a good reason, but is there room for improvement? Besides, would inserting a large number of entries at once be algorithmically faster than inserting them one by one (thinking about parallels with priority queues / heaps)?

See #20095 for a similar issue. Also partly related to #6853. May fix or improve #19751.
See yuin/goldmark#469 (comment) for a case study.
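As a rough way to reproduce the comparison above (a sketch; the map here is a stand-in for the real ~2000-entry literal, and a complete measurement would also compare `go build` output sizes with and without the literal), the gob baseline can be measured directly:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Stand-in for the ~2000-entry literal map from the report.
var table = map[string]int{
	"alpha": 1,
	"beta":  2,
	// ... a real reproduction would carry roughly 2000 entries
}

func main() {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(table); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("gob-encoded size: %d bytes for %d entries\n", buf.Len(), len(table))
	// Compare this against the binary-size growth observed when the
	// literal is added to the program (measured with separate builds).
}
```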