[Merged by Bors] - Remove ModelGen #134

devmotion · 2020-06-12T14:29:07Z

This PR removes ModelGen completely. The main motivation was the issues that @itsdfish and I experienced when working with multiple processes. The following MWE

using Distributed
addprocs(4)
@everywhere using DynamicPPL

@everywhere @model model() = begin end

pmap(x -> model(), 1:4)

fails intermittently

> if not all of these [@model] evaluations generate evaluator and generator functions of the same name (i.e., these var"###evaluator#253" and var"###generator#254" functions). I assume one could maybe provoke the error by defining another model first on the main process before calling @everywhere @model ....

(copied from the discussion on Slack)

With the changes in this PR, @model model() = ... generates only a single function model on all workers, and hence there are no issues anymore with the names of the generators and evaluators. The evaluator is created as a closure inside of model, which can be serialized and deserialized properly by Julia. So far I haven't been able to reproduce the issues above with this PR.
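A minimal sketch of the closure claim, not the code this PR generates: make_evaluator below is a hypothetical stand-in for a model function that returns its evaluator as a closure. It only illustrates that Julia's Serialization standard library can round-trip an anonymous function together with its captured values within a single session; the multi-process case from the MWE is not exercised here.

```julia
using Serialization

# Hypothetical stand-in for a model constructor (not DynamicPPL's generated code):
# it returns an anonymous function that captures its argument.
make_evaluator(offset) = x -> x + offset

evaluator = make_evaluator(2.0)

# Round-trip the closure through an in-memory buffer.
io = IOBuffer()
serialize(io, evaluator)
seekstart(io)
restored = deserialize(io)

# The restored closure still carries the captured value.
result = restored(1.0)
```

Because the closure has no separately generated global name, there is nothing that has to match across evaluations of the definition.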

The only user-facing change of the removal of ModelGen (apart from that one never has to construct it, which simplifies the docs and the example @denainjs asked about) is that the logprob macro now requires specifying model = m where m = model() instead of model = model (since model is just a regular function from which the default arguments etc. of the resulting model can't be extracted). It feels slightly weird that the evaluation is not based "exactly" on the specified Model instance but that the other parts of logprob modify it (which was the reason I guess for using the model generator before here), but on the other hand this weird behaviour already exists when specifying logprob with a Chains object. (BTW I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.)

Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091).

itsdfish · 2020-06-12T14:42:28Z

Thank you for looking into this! It looks like this change will simplify several things.

phipsgabler

How cool is that! 🍰

yebai

There are some really nice improvements in this PR.

> I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.)

I agree the string macro is a bit strange and would be nice to consider a regular function replacement.

> Additionally, I assume (though haven't tested it) that getting rid of the separate evaluator and generator functions will not only simplify serialization and deserialization when working with multiple processes but also when saving models and chains (see e.g. TuringLang/Turing.jl#1091).

Would be great to experiment with these ideas and try to tidy up the compiler API/design.

phipsgabler · 2020-06-24T09:39:55Z

> > I'm not sure if we should actually use string macros here, maybe regular functions would be nicer.)
>
> I agree the string macro is a bit strange and would be nice to consider a regular function replacement.

I have never used it, but also wondered about that choice. We could use some kind of formula objects, like GLM does, although "formula" is a strange name and we shouldn't reuse the tilde here. And unfortunately, | binds more tightly than assignment and comma, so @prob(x = 1 | y = 2) parses as (= x (= (| 1 y) 2)).
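The precedence problem can be checked directly with Meta.parse. This is only a sketch on the bare expression x = 1 | y = 2, without any macro around it; it inspects the heads and left-hand sides of the parsed expression and deliberately does not evaluate anything.

```julia
# Parse the unparenthesized form without evaluating it.
ex = Meta.parse("x = 1 | y = 2")

# The outermost node is the assignment to x ...
outer_head = ex.head
outer_lhs = ex.args[1]

# ... and its right-hand side is again an assignment,
# whose left-hand side is the call `1 | y`.
inner = ex.args[2]
inner_head = inner.head
inner_lhs = inner.args[1]
```

So the | ends up grouping 1 with y instead of separating the two assignments, which is the opposite of the intended "x = 1 given y = 2" reading.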

On the other hand, wouldn't something like

density(model, s = 1.0, m = 1.0, given = (;x = 1.0), chain = c)

suffice to analyze which variables go where and perform the corresponding call?

devmotion · 2020-06-24T12:20:21Z

Regarding the precedence issue, one could use, e.g., @logprob (x = 1) | (y = 3, z = 24) which seems to parse correctly.
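A quick sketch of what the parenthesized form looks like to the parser, again using Meta.parse on the bare expression (no @logprob macro is defined or called here):

```julia
# Parse the parenthesized form without evaluating it.
ex = Meta.parse("(x = 1) | (y = 3, z = 24)")

# The top-level node is now a call to `|` with two arguments.
is_bar_call = ex.head == :call && ex.args[1] == :|

# Left of the bar: a single assignment expression.
lhs = ex.args[2]

# Right of the bar: a tuple expression holding two assignments.
rhs = ex.args[3]
n_conditions = length(rhs.args)
```

With this shape, a macro would receive the two sides of | as separate subexpressions and could split them without any string parsing.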

In fact that would be trivial to implement it seems, since it could just be lowered to the logprob function in DynamicPPL.jl/src/prob_macro.jl (line 24 in 275ccc8: function logprob(ex1, ex2)). It seems the logprob"..." macro just wraps the LHS and RHS of | as tuples and then calls logprob, so it's already like the density suggestion with the model and chain being part of the RHS.

IMO it might be more natural to keep the model and the chain that is used in these computations separate from the LHS and RHS of | since it can only be part of the RHS anyways. Similar to the density suggestion, one could just require the model and/or the chain to be the first argument of density/logprob. That would also avoid issues if a variable is called chain or model. Maybe it would also resolve the current precedence issues (at least there were some discussions and issues a while ago IIRC) about what to do when both a chain with a saved model and a model are specified since we could define the function signature and provided dispatches at will. In the same way the @logprob macro could be extended with additional arguments that are not part of the tuples.

phipsgabler · 2020-06-24T13:27:40Z

> Regarding the precedence issue, one could use, e.g., @logprob (x = 1) | (y = 3, z = 24) which seems to parse correctly.

Sure, but that would 1) force the user to use parentheses in a place where "semantically" the form without parentheses should be equally valid, and 2) hit people with the subtle distinction between @macro (foo) and @macro(foo). That's why I didn't suggest that form.

Now I get why this was made a string macro :D
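The @macro (foo) versus @macro(foo) distinction can be made visible by parsing both spellings. In this sketch @m is a placeholder macro name; it never has to exist, because the expressions are only parsed and never expanded.

```julia
# With a space, everything after the macro name is a single macro argument.
spaced = Meta.parse("@m (a) | (b)")
spaced_head = spaced.head          # the whole line is one macro call
spaced_arg = spaced.args[end]      # the argument is the expression `a | b`

# Without the space, the parentheses delimit the macro call,
# and `| (b)` is applied to the result of the macro.
tight = Meta.parse("@m(a) | (b)")
tight_head = tight.head            # the whole line is a call to `|`
tight_first = tight.args[2].head   # its first operand is the macro call
```

A single missing space therefore changes which expression the macro gets to see, which is the usability trap described above.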

> In fact that would be trivial to implement it seems, since it could just be lowered to the logprob function [...]. It seems the logprob"..." macro just wraps the LHS and RHS of | as tuples and then calls logprob, so it's already like the density suggestion with the model and chain being part of the RHS.
>
> IMO it might be more natural to keep the model and the chain that is used in these computations separate from the LHS and RHS of | since it can only be part of the RHS anyways. Similar to the density suggestion, one could just require the model and/or the chain to be the first argument of density/logprob.

Exactly my thought.

New suggestion for sugar: using ; for conditioning, make @pr construct a "probability formula", somewhat like the mathematical notation Pr(p1(x), p2(y); z), where we deal with predicates of x and y, and pass this to logprob and logdensity functions, just that we always have to be explicit about names:

logprob(model, @pr(x == 1; alpha = 2))
logprob(chain, @pr y > 0)

(And yes, I realize that this suffers from the same @macro (foo) problem :( )

One thing I'm just not sure about is how rigidly we should separate densities and probabilities. That's why I called the function density. I'd rather not further propagate the confusion many people have distinguishing between them (and likelihoods) by calling them the same.

devmotion · 2020-06-24T17:46:42Z

I guess the main motivation for a macro over just having a function logdensity(model, ::Tuple, ::Tuple) would be that it might be more convenient for specifying the predicates?

devmotion · 2020-06-24T17:48:42Z

BTW (since I just worked with it) the macro syntax with brackets and | is also used in the pattern matching in MacroTools: https://mikeinnes.github.io/MacroTools.jl/stable/pattern-matching/#Unions-1

devmotion · 2020-06-24T18:19:10Z

Ah I just noticed that the latest commits broke the MWE above - one really has to use anonymous functions it seems. Unfortunately that means that currently we either have to add ExprTools as additional dependency or reimplement the logic in MacroTools master (I asked about a new release but I don't know how soon there'll be one). I guess it would be fine to copy the parts of MacroTools that we need right now and remove them if a new release is available?

yebai · 2020-06-24T18:30:22Z

Sounds good.

devmotion · 2020-06-26T10:37:40Z

This is ready for a second round of reviewing @yebai @phipsgabler

I removed ExprTools and copied the logic for combining anonymous functions from MacroTools (hopefully there's a release soon, then it can be removed again; I would also prefer actually if they merge ExprTools into MacroTools). I also simplified the code for extracting the arguments and their default values and used MacroTools.@match for increasing the readability when parsing expressions. If we stick with MacroTools for now, #121 might not be needed anymore (or only in a reduced form).

phipsgabler · 2020-06-26T12:59:31Z

Very nice! I couldn't find any flaws in reading the code. build_model_info got really slim, and I agree that #121 might have become superfluous with this.

yebai · 2020-06-29T17:33:32Z

bors r+

Co-authored-by: David Widmann <[email protected]>

bors · 2020-06-29T17:34:26Z

Build failed:

test (1, ubuntu-latest, x64)

yebai · 2020-06-29T17:35:56Z

bors r+


bors · 2020-06-29T18:23:51Z

Pull request successfully merged into dev.

Build succeeded:

devmotion requested review from phipsgabler, mohamed82008 and yebai June 14, 2020 20:03

yebai reviewed Jun 15, 2020

src/context_implementations.jl (outdated, resolved)

devmotion force-pushed the modelgen branch from 3e961c1 to bf1e9c5 Compare June 15, 2020 21:57

phipsgabler approved these changes Jun 17, 2020

src/model.jl (outdated, resolved)

src/model.jl (resolved)

src/compiler.jl (outdated, resolved)

devmotion added 4 commits June 18, 2020 13:01

Remove ModelGen

103fa58

Remove tabs

aff980a

Document constructor

c6a9f51

Use ExprTools.combinedef for building a correctly typed evaluator

67b5873

devmotion force-pushed the modelgen branch from 28b55df to 67b5873 Compare June 18, 2020 11:50

yebai reviewed Jun 23, 2020

Project.toml (outdated, resolved)

devmotion added 2 commits June 24, 2020 18:59

Merge dev and remove ExprTools

9ad0ba5

Use @match a bit more

29aafef

devmotion and others added 3 commits June 24, 2020 23:29

Make evaluator an anonymous function

da74411

Add serialization tests

470e14a

Fix test on Julia 1.0

c35f8fc

bors bot changed the title ~~Remove ModelGen~~ [Merged by Bors] - Remove ModelGen Jun 29, 2020

bors bot closed this Jun 29, 2020

bors bot deleted the modelgen branch June 29, 2020 18:23

yebai mentioned this pull request Jul 20, 2020

Replace MacroTools with ExprTools #121

Closed

devmotion mentioned this pull request Aug 25, 2024

How to save a model / fit and load it? Issue with JLD2 for "reconstructing" TuringLang/Turing.jl#2309

Closed
