-
Notifications
You must be signed in to change notification settings - Fork 8k
ROR the callable_convert_cache key for better hash distribution #20052
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The bottom 3 bits are zeros due to alignment, to get better key distributions we should ROR the 3 bottom bits. This results in a single extra instruction on x86 or arm64, which is negligible for the benefit we get.
You should rotate by the nearest lower power of two of min(sizeof(zend_internal_function), sizeof(zend_op_array)). I.e. ror by 7 instead of 3. |
Right, good point! On 64 bit zend_internal_function is 160B and zend_op_array is 256B. |
Actually, Bob's argument turns out not to be true: zend_internal_functions in EG(function_table) are allocated on the system heap. The system heap on my glibc system gives me an address with only the lowest 4 bits zeros because it adheres to an alignment of 16 bytes. So I'm keeping this PR as-is. |
If they are always 0, rotating them doesn't really do anything either. Just a shift should be sufficient. weakrefs also use Lines 60 to 64 in bcb5612
Should we have a generic function for pointer -> hash index? |
The alignment guarantee is technically only 4 bytes for 32 bit on some platforms. So we can't throw away the bottom bits unconditionally. The weakmap case is fine as it's coming from ZendMM which has alignment guarantees we control. |
@nielsdos It's not a problem though, to throw these bits away (also we're not actually throwing them away, merely shifting to the end of the key). I mean, you're never going to have overlapping functions. The goal is to have a high entropy within the least significant bits, which you will still have for internal functions if you only consider bits 6/7 and upwards - only that if you don't, the entropy is low for userland functions. |
zval *closure_zv = zend_hash_index_lookup(&EG(callable_convert_cache), (zend_ulong)(uintptr_t)call->func); | ||
/* Rotate the key for better hash distribution. */ | ||
const int shift = sizeof(size_t) == 4 ? 6 : 7; | ||
zend_ulong key = (zend_ulong)(uintptr_t)call->func; | ||
key = (key >> shift) | (key << ((sizeof(key) * 8) - shift)); | ||
zval *closure_zv = zend_hash_index_lookup(&EG(callable_convert_cache), key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rotated address is not significantly better. It's better to choose one of the well known hash function for integers e.g. from https://arxiv.org/pdf/1504.06804
This is probably a problem in a lot of places, no? Why is a specialised solution needed in this particular instance? |
There are more places where we could do something like this. And in some places we already employ this (kind of) solution. |
The implementation will currently effectively only use one bucket, which is obviously bad. This looks like an improvement at least, though we may also just switch back to using the string name as a key for now. I'll leave that up to you. |
I'll merge this already as an improvement as-is. It's not perfect, and not generic, but that can always be iterated on later. It's better than just using a single bucket (for low nr of entries). I'll put on my TODO to look at papers linked by Dmitry. |
This is also part of a wider spread issue that also affects integer-indexed arrays in userland too (e.g. arrays storing data in coordinate systems where the key is In any case, I for one won't complain about faster first-class callables 👍🏻 |
Related: https://externals.io/message/96007 |
The bottom 3 bits are zeros due to alignment, to get better key distributions we should ROR the 3 bottom bits.
This results in a single extra instruction on x86 or arm64, which is negligible for the benefit we get.