-
Notifications
You must be signed in to change notification settings - Fork 35
[Merged by Bors] - Remove ModelGen #134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for looking into this! It looks like this change will simplify several things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How cool is that! 🍰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some really nice improvements in this PR.
I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.)
I agree the string macro is a bit strange and would be nice to consider a regular function replacement.
Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091).
Would be great to experiment with these ideas and try to tidy up the compiler API/design.
I have never used it, but also wondered about that choice. We could use some kind of formula objects, like GLM does, although "formula" is a strange name and we shouldn't reuse the tilde here. And unfortunately, On the other hand, wouldn't something like density(model, s = 1.0, m = 1.0, given = (;x = 1.0), chain = c) suffice to analyze which variables go where and perform the according call? |
Regarding the precedence issue, one could use, e.g., In fact that would be trivial to implement it seems, since it could just be lowered to the DynamicPPL.jl/src/prob_macro.jl Line 24 in 275ccc8
logprob"..." macro just wraps the LHS and RHS of | as tuples and then calls logprob , so it's already like the density suggestion with the model and chain being part of the RHS.
IMO it might be more natural to keep the |
Sure, but that would 1) force the user to use parentheses in a place where "semantically" the form without parentheses should be equally valid, and 2) will hit people with the subtle distinction between Now I get why this was made a string macro :D
Exactly my thought. New suggestion for sugar: using logprob(model, @pr(x == 1; alpha = 2))
logprob(chain, @pr y > 0) (And yes, I realize that this suffers from the same One thing I'm just not sure about is how rigidly we should separate densities and probabilities. That's why I called the function |
I guess the main motivation for a macro over just having a function |
BTW (since I just worked with it) the macro syntax with brackets and |
Ah I just noticed that the latest commits broke the MWE above - one really has to use anonymous functions it seems. Unfortunately that means that currently we either have to add ExprTools as additional dependency or reimplement the logic in MacroTools master (I asked about a new release but I don't know how soon there'll be one). I guess it would be fine to copy the parts of MacroTools that we need right now and remove them if a new release is available? |
Sounds good. |
This is ready for a second round of reviewing @yebai @phipsgabler I removed ExprTools and copied the logic for combining anonymous functions from MacroTools (hopefully there's a release soon, then it can be removed again; I would also prefer actually if they merge ExprTools into MacroTools). I also simplified the code for extracting the arguments and their default values and used |
Very nice! I couldn't find any flaws in reading the code. |
bors r+ |
This PR removes `ModelGen` completely. The main motivation for it were the issues that @itsdfish and I experienced when working with multiple processes. The following MWE ```julia using Distributed addprocs(4) @Everywhere using DynamicPPL @Everywhere @model model() = begin end pmap(x -> model(), 1:4) ``` fails intermittently > if not all of these [`@model`] evaluations generate same evaluator and generator functions of the same name (i.e., these var"###evaluator#253" and var"###generator#254" functions). I assume one could maybe provoke the error by defining another model first on the main process before calling @Everywhere @model .... (copied from the discussion on Slack) With the changes in this PR, `@model model() = ...` generates only a single function `model` on all workers, and hence there are no issues anymore with the names of the generators and evaluators. The evaluator is created as a closure inside of `model`, which can be serialized and deserialized properly by Julia. So far I haven't been able to reproduce the issues above with this PR. The only user-facing change of the removal of `ModelGen` (apart from that one never has to construct it, which simplifies the docs and the example @denainjs asked about) is that the `logprob` macro now requires to specify `model = m` where `m = model()` instead of `model = model` (since that's just a regular function from which the default arguments etc of the resulting model can't be extracted). It feels slightly weird that the evaluation is not based "exactly" on the specified `Model` instance but that the other parts of `logprob` modify it (which was the reason I guess for using the model generator before here), but on the other hand this weird behaviour already exists when specifying `logprob` with a `Chains` object. (BTW I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.) Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091). Co-authored-by: David Widmann <[email protected]>
Build failed: |
bors r+ |
This PR removes `ModelGen` completely. The main motivation for it were the issues that @itsdfish and I experienced when working with multiple processes. The following MWE ```julia using Distributed addprocs(4) @Everywhere using DynamicPPL @Everywhere @model model() = begin end pmap(x -> model(), 1:4) ``` fails intermittently > if not all of these [`@model`] evaluations generate same evaluator and generator functions of the same name (i.e., these var"###evaluator#253" and var"###generator#254" functions). I assume one could maybe provoke the error by defining another model first on the main process before calling @Everywhere @model .... (copied from the discussion on Slack) With the changes in this PR, `@model model() = ...` generates only a single function `model` on all workers, and hence there are no issues anymore with the names of the generators and evaluators. The evaluator is created as a closure inside of `model`, which can be serialized and deserialized properly by Julia. So far I haven't been able to reproduce the issues above with this PR. The only user-facing change of the removal of `ModelGen` (apart from that one never has to construct it, which simplifies the docs and the example @denainjs asked about) is that the `logprob` macro now requires to specify `model = m` where `m = model()` instead of `model = model` (since that's just a regular function from which the default arguments etc of the resulting model can't be extracted). It feels slightly weird that the evaluation is not based "exactly" on the specified `Model` instance but that the other parts of `logprob` modify it (which was the reason I guess for using the model generator before here), but on the other hand this weird behaviour already exists when specifying `logprob` with a `Chains` object. (BTW I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.) Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091). Co-authored-by: David Widmann <[email protected]>
Pull request successfully merged into dev. Build succeeded:
|
This PR removes
ModelGen
completely. The main motivation for it were the issues that @itsdfish and I experienced when working with multiple processes. The following MWEfails intermittently
(copied from the discussion on Slack)
With the changes in this PR,
@model model() = ...
generates only a single functionmodel
on all workers, and hence there are no issues anymore with the names of the generators and evaluators. The evaluator is created as a closure inside ofmodel
, which can be serialized and deserialized properly by Julia. So far I haven't been able to reproduce the issues above with this PR.The only user-facing change of the removal of
ModelGen
(apart from that one never has to construct it, which simplifies the docs and the example @denainjs asked about) is that thelogprob
macro now requires to specifymodel = m
wherem = model()
instead ofmodel = model
(since that's just a regular function from which the default arguments etc of the resulting model can't be extracted). It feels slightly weird that the evaluation is not based "exactly" on the specifiedModel
instance but that the other parts oflogprob
modify it (which was the reason I guess for using the model generator before here), but on the other hand this weird behaviour already exists when specifyinglogprob
with aChains
object. (BTW I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.)Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091).