rvgo: experimental no-alloc noescape merkle funcs #88
+67
−37
Description
A little bit hacky, and it assumes there is only one concurrent hasher (this could be fixed by allocating a `keccakState` per Memory merkleization job and passing it as a typed function argument through the merkle functions).

Uses `go:noescape` and `go:linkname` to force the compiler not to assume the function interface and not to let any slice data escape to the heap.

This essentially removes all the non-essential allocations from the merkleization work, in a somewhat unsafe, non-recommended way, to see how much performance can improve with this kind of heap-analysis-informed memory optimization.
Profiles of the benchmark show that it is only a little bit faster, even after removing a theoretical ~10 million object allocations.
Before:
![before](https://private-user-images.githubusercontent.com/19571989/377079872-bbc9adcc-6e9b-4c13-bcd7-d6819ccffddc.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzODIyNjQsIm5iZiI6MTczOTM4MTk2NCwicGF0aCI6Ii8xOTU3MTk4OS8zNzcwNzk4NzItYmJjOWFkY2MtNmU5Yi00YzEzLWJjZDctZDY4MTljY2ZmZGRjLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE3MzkyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWJiNDMzMjk3ZGViOWY2ODQ5YTdiNGY5ODhkMDQ0MDJkNDllYzU0NTY1MGE4MDA5MzU4NDJhYzU5MmU2MzJlMjImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Zk9m7LQeyVeEXLdhPv9ztuQ3MSham--5V5_amfHN4Kc)
After:
![after](https://private-user-images.githubusercontent.com/19571989/377079930-848458fc-e36e-4beb-b0b9-c2ed5d76bb60.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzODIyNjQsIm5iZiI6MTczOTM4MTk2NCwicGF0aCI6Ii8xOTU3MTk4OS8zNzcwNzk5MzAtODQ4NDU4ZmMtZTM2ZS00YmViLWIwYjktYzJlZDVkNzZiYjYwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE3MzkyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTI0YWRlOWE0MWY3ZDRhNzE5MzVkYmUzNDU2M2JiMTdlMzlkOTA0MDZhOTFmYTYxODAyZjc1YTliMDRhZDYzMDEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.CZ_DMt9LuE4zT1D666eDTO9q1LeOfExARrTkV4uT01o)
CPU:
Before:
After:
Heap-escape:
Thought I'd share this little experiment and the results.