-
Notifications
You must be signed in to change notification settings - Fork 82
boost::unrodered_flat_map for join #3824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: stable-23.10
Are you sure you want to change the base?
Conversation
|
performance comparison with stable-23.10 https://palantir.mariadb.net/report_comparisons/74/ |
| typedef typename std::tr1::unordered_multimap<typename element_t::second_type, | ||
| typename element_t::first_type>::value_type hashPair_t; | ||
| // Now using boost::unordered_flat_map with vector for better cache locality | ||
| typedef typename boost::unordered_flat_map<typename element_t::second_type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using instead of typedef?
| utils::SimpleAllocator<std::pair<uint64_t const, uint64_t> > > | ||
| hash_t; | ||
| // Using boost::unordered_flat_map with vector for multimap behavior | ||
| typedef boost::unordered_flat_map<uint64_t, std::vector<uint64_t>> hash_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The allocator type was lost.
|
|
||
| for (; range.first != range.second; ++range.first) | ||
| range.first->second |= MSB; | ||
| // Add all values to newMatches and mark them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to add all values here? We added only one value before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look just below, the values were/are added there.
| return std::unique_ptr<HashTable>(new HashTable(bucketCount, TupleJoiner::hasher(), | ||
| typename HashTable::key_equal(), | ||
| utils::STLPoolAllocator<typename HashTable::value_type>(resourceManager))); | ||
| // boost::unordered_flat_map doesn't need bucket_count or allocator in constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::ordered_map also doesn't need it, but we use custom allocator to track memory usage
|
There is a LOT of code like auto it = destHashTbl->find(e.second);
if (it != destHashTbl->end())
{
it->second.push_back(e.first);
}
else
{
std::vector<typename element_t::first_type> vec;
vec.push_back(e.first);
destHashTbl->emplace(e.second, std::move(vec));
}It would be nice to isolate this boilerplate somewhere |
mariadb-SergeyZefirov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider PR as good modulo Alexei's request, where the most important is notice that allocator type has been lost.
No benchmarks yet. Any ideas like is it good idea to replace std::vector with small_vector or any other is appreciated