-
Notifications
You must be signed in to change notification settings - Fork 27
Added a distributed alias table to YGM. #256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v0.9-dev
Are you sure you want to change the base?
Conversation
Preparing v0.6 Release (llnl#200)
steiltre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend making use of concepts in the way they are used in existing container base classes. We can decide where to move these concepts and new ones to add independent of this PR.
The existing constructors look good, but constructors that work from STL containers should be added as well.
Updating master with ygm 0.8.
… clean up the use of concepts.
…m_engine to ygm::random::random_engine
…M value container
…emTuple concepts that take as input for_all_args type. Also added constructor that builds the alias table from an STL container
…ogic. Wrote test for various constructors and tested sampling accuracy via sampling words from corpus and comparing to ground truth frequency
…n handle a variety of distributions
…the average local weight is no longer 1. This was done to decrease the errors associated with inexact floating point operations
… showed this has fixed the bug, but consistently reproducing the bug has proved to be difficult, so bug might still be present
include/ygm/random/alias_table.hpp
Outdated
| item item_to_send = {local_item.id, weight_to_send}; | ||
| items_to_send.push_back(item_to_send); | ||
|
|
||
| if ((curr_weight > 1e-8) && (dest_rank < m_comm.size())) { // Accounts for rounding errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cause any slight errors in results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the curr_weight > 1e-8 condition.
include/ygm/random/alias_table.hpp
Outdated
| } | ||
|
|
||
| // Need to handle items left in items to send. Must also account for floating point errors. | ||
| if (items_to_send.size() > 0 && curr_weight > 1e-8 && dest_rank < m_comm.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does weight lost here affect results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the curr_weight > 1e-8 condition. This condition has the potential to create bigger errors than that caused by floating point arithmetic inaccuracies.
I still have to verify that the destination being sent to is less than the number of ranks because it is possible a floating point error will cause there to be more weight and therefore attempt to spread it to a rank > than m_comm.size().
include/ygm/random/alias_table.hpp
Outdated
| ygm::container::detail::SingleItemTuple<typename YGMContainer::for_all_args> && | ||
| pair_like_and_convertible_to_weighted_item<Item, | ||
| std::tuple_element_t<0,typename YGMContainer::for_all_args>> | ||
| alias_table(ygm::comm &comm, RNG &rng, YGMContainer &c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making the constructor take an RNG object that is stored as a reference feels a bit clunky.
Does it make sense to try any of the following:
- Take a seed for an RNG object that gets constructed by the alias table (allows seeding but disconnects external RNG object from what is used within
alias_table) - Create a randomly initialized RNG object when one is not given to the constructor (not sure how this would work with the RNG being stored as a reference)
Given m_rng is only needed in the async_sample function, it would be natural (and look more like std::sample) to have the RNG passed in to async_sample, but this is not practical given the communication required before m_rng is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am good with passing a seed or generating a random seed if a seed is not given. We may need to change this in the future to match the interface of some sort of ygm container sampling object (assuming we eventually create something like this).
I agree that passing m_rng to async_sample would be best, but as you mentioned, it is not practical.
… take a seed for the rng or create an rng with a rag with a random seed
alias_tabledata structure to ygm. Alias table data structure enables O(1) sampling time from a discrete distribution.alias_table::async_samplewhich requires the user provide a lambda that takes the item sampled as an argument.ygm::randomwhere thedefault_random_enginenow sits as well asalias_table