Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adaptive proposals #39

Open
wants to merge 34 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
e9fd602
adaptive proposal
Nov 22, 2020
5f0ddfa
Adaptive univariate proposals tests
Nov 22, 2020
f42b784
swp files from vim...
Nov 22, 2020
a188780
test ESS
Nov 22, 2020
d8989aa
export AdaptiveProposal
Nov 22, 2020
565f12a
Update src/adaptive.jl
arzwa Nov 22, 2020
cc16195
Update src/adaptive.jl
arzwa Nov 22, 2020
802ec67
Update src/adaptive.jl
arzwa Nov 22, 2020
c7623c4
Update src/adaptive.jl
arzwa Nov 22, 2020
a33937e
Update src/adaptive.jl
arzwa Nov 22, 2020
4007bd0
Update src/adaptive.jl
arzwa Nov 22, 2020
71e010b
Update src/adaptive.jl
arzwa Nov 22, 2020
2d59ede
Update src/adaptive.jl
arzwa Nov 22, 2020
279aea7
Update src/adaptive.jl
arzwa Nov 22, 2020
93f17c5
Update src/adaptive.jl
arzwa Nov 23, 2020
16715e1
Update src/adaptive.jl
arzwa Nov 23, 2020
046c21b
Update src/adaptive.jl
arzwa Nov 23, 2020
b91fcc0
test update
Nov 23, 2020
387eff4
Merge branch 'adaptive' of https://github.com/arzwa/AdvancedMH.jl int…
Nov 23, 2020
8fb1048
Adaptor docstring
Nov 23, 2020
4071675
Update test/runtests.jl
arzwa Nov 23, 2020
a63262d
updated tests
Nov 23, 2020
afd3ed1
updated tests
Nov 23, 2020
fe8562c
Merge branch 'adaptive' of https://github.com/arzwa/AdvancedMH.jl int…
Nov 23, 2020
f66f647
adaptive MvNormal
Dec 10, 2020
2988ff2
merge with master
Dec 11, 2020
e5ad041
AdaptiveMvNormal with tests
Dec 11, 2020
7aa8631
Update src/adaptivemvnormal.jl
arzwa Dec 11, 2020
4999349
Update src/adaptivemvnormal.jl
arzwa Dec 11, 2020
d4b3f6b
Update src/adaptivemvnormal.jl
arzwa Dec 11, 2020
cb52c7f
Update test/Project.toml
arzwa Dec 11, 2020
5a2a175
Update src/mh-core.jl
arzwa Dec 11, 2020
b2be967
Update src/adaptivemvnormal.jl
arzwa Dec 11, 2020
518aab1
removed test Project.toml
Dec 11, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion src/AdvancedMH.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ import Random

# Exports
export MetropolisHastings, DensityModel, RWMH, StaticMH, StaticProposal,
RandomWalkProposal, Ensemble, StretchProposal
RandomWalkProposal, Ensemble, StretchProposal, AdaptiveProposal

# Reexports
export sample, MCMCThreads, MCMCDistributed
Expand Down Expand Up @@ -104,5 +104,6 @@ end
include("proposal.jl")
include("mh-core.jl")
include("emcee.jl")
include("adaptive.jl")

end # module AdvancedMH
93 changes: 93 additions & 0 deletions src/adaptive.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
mutable struct Adaptor
accepted::Integer
total ::Integer
tune ::Integer # tuning interval
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible one should avoid non-concrete fields:

Suggested change
mutable struct Adaptor
accepted::Integer
total ::Integer
tune ::Integer # tuning interval
mutable struct Adaptor
accepted::Int
total::Int
tune::Int # tuning interval

On a more general level, I'm not completely sure if it is useful to have a separate Adaptor struct, it seems it could just be integrated into AdaptiveProposal.

On an even more general level, I think it would be better to make this part of the state of the sampler using the AbstractMCMC interface instead of a field of the proposal. With the current design, the proposal will be mutated in every step. However, this (IMO preferred) design requires to implement AbstractMCMC.step instead of just adding the accept! call.

Copy link
Author

@arzwa arzwa Nov 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes of course, Int <-> Integer confusion...

The Adaptor struct may indeed be superfluous, although I found it a bit clearer separated that way. Also, I considered implementing adaptation for multivariate Normal proposals, which uses a different machinery under the hood, and my initial thought was to implement that as an AdaptiveProposal but with different Adaptor type. Of course, that could be implemented as another proposal struct altogether.

I think I understand conceptually your preferred design at the step level, although ATM my insight in how AbstractMCMC works is insufficient to see how that should be done, and currently, to me the mutation of the proposals is the most intuitive approach to implement adaptation. The accept! call seemed like a very simple, but admittedly somewhat hacky, way to enable adaptive proposals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should punt this problem to a later date. I would like to include accept/reject as a field in the Transition struct, which would make it very easy to count the number of previous acceptances by just adding adding one to the total_acceptances field in a Transition. Currently AdvancedMH doesn't track that internally, but I can just modify this code to remove the mutation later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only about the number of accepted/rejected steps here though, the state would have to include the updated proposal etc as well, so it won't be solved by including the stats in Transition.

However, I'm fine with postponing this refactoring. Probably best to open an issue so we don't forget it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my thinking, you'd add an extra field to Transitions that just accumulates the total number of acceptances, which is easier to get when you have individual acceptances for each draw. I'll open an issue up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand (and I think that's a good addition). My point was just that it is not sufficient here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened an issue (#40).

target ::Float64 # target acceptance rate
bound ::Float64 # bound on logσ of Gaussian kernel
δmax ::Float64 # maximum adaptation step
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

Adaptor(; tune=25, target=0.44, bound=10., δmax=0.2) =
Adaptor(0, 0, tune, target, bound, δmax)
arzwa marked this conversation as resolved.
Show resolved Hide resolved

"""
AdaptiveProposal{P}

An adaptive Metropolis-Hastings proposal. In order for this to work, the
proposal kernel should implement the `adapted(proposal, δ)` method, where `δ`
is the increment/decrement applied to the scale of the proposal distribution
during adaptation (e.g. for a Normal distribution the scale is `log(σ)`, so
that after adaptation the proposal is `Normal(0, exp(log(σ) + δ))`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly describe the default Adaptor in the docstring for AdaptiveProposal, and notify everyone what the default keywords are for the adaptor?


# Example
```julia
julia> p = AdaptiveProposal(Uniform(-0.2, 0.2));

julia> rand(p)
0.07975590594518434
```

# References

Roberts, Gareth O., and Jeffrey S. Rosenthal. "Examples of adaptive MCMC."
Journal of Computational and Graphical Statistics 18.2 (2009): 349-367.
"""
mutable struct AdaptiveProposal{P} <: Proposal{P}
proposal::P
adaptor ::Adaptor
arzwa marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written above, one might want to consider moving the fields from Adaptor here and maybe even better making it part of a state that is passed to the proposal in every sampling step. Also the updated proposal would be part of a separate state.

end

function AdaptiveProposal(p; kwargs...)
AdaptiveProposal(p, Adaptor(; kwargs...))
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

accepted!(p::AdaptiveProposal) = p.adaptor.accepted += 1
accepted!(p::Vector{<:AdaptiveProposal}) = map(accepted!, p)
accepted!(p::NamedTuple{names}) where names = map(x->accepted!(getfield(p, x)), names)

# this is defined because the first draw has no transition yet (I think)
propose(rng::Random.AbstractRNG, p::AdaptiveProposal, m::DensityModel) =
rand(rng, p.proposal)
arzwa marked this conversation as resolved.
Show resolved Hide resolved

# the actual proposal happens here
function propose(
rng::Random.AbstractRNG,
proposal::AdaptiveProposal{<:Union{Distribution,Proposal}},
model::DensityModel,
t
)
consider_adaptation!(proposal)
t + rand(rng, proposal.proposal)
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

function q(proposal::AdaptiveProposal, t, t_cond)
logpdf(proposal, t - t_cond)
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

function consider_adaptation!(p)
(p.adaptor.total % p.adaptor.tune == 0) && adapt!(p)
p.adaptor.total += 1
end

function adapt!(p::AdaptiveProposal)
a = p.adaptor
a.total == 0 && return
δ = min(a.δmax, 1. /√(a.total/a.tune)) # diminishing adaptation
arzwa marked this conversation as resolved.
Show resolved Hide resolved
α = a.accepted / a.tune # acceptance ratio
p_ = adapted(p.proposal, α > a.target ? δ : -δ, a.bound)
a.accepted = 0
p.proposal = p_
end

function adapted(d::Normal, δ, bound=Inf)
lσ = log(d.σ) + δ
lσ = abs(lσ) > bound ? sign(lσ) * bound : lσ
arzwa marked this conversation as resolved.
Show resolved Hide resolved
Normal(d.μ, exp(lσ))
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

function adapted(d::Uniform, δ, bound=Inf)
lσ = log(d.b) + δ
σ = abs(lσ) > bound ? exp(sign(lσ) * bound) : exp(lσ)
arzwa marked this conversation as resolved.
Show resolved Hide resolved
Uniform(-σ, σ)
arzwa marked this conversation as resolved.
Show resolved Hide resolved
end

4 changes: 4 additions & 0 deletions src/mh-core.jl
Original file line number Diff line number Diff line change
Expand Up @@ -208,8 +208,12 @@ function AbstractMCMC.step(

# Decide whether to return the previous params or the new one.
if -Random.randexp(rng) < logα
accepted!(spl.proposal)
return params, params
else
return params_prev, params_prev
end
end

function accepted!(p::P) where P<:Proposal end

arzwa marked this conversation as resolved.
Show resolved Hide resolved
2 changes: 1 addition & 1 deletion src/proposal.jl
Original file line number Diff line number Diff line change
Expand Up @@ -103,4 +103,4 @@ function q(
t_cond
)
return q(proposal(t_cond), t, t_cond)
end
end
28 changes: 27 additions & 1 deletion test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ using Test
Random.seed!(1234)

# Generate a set of data from the posterior we want to estimate.
data = rand(Normal(0, 1), 300)
data = rand(Normal(0., 1), 300)
arzwa marked this conversation as resolved.
Show resolved Hide resolved

# Define the components of a basic model.
insupport(θ) = θ[2] >= 0
Expand Down Expand Up @@ -52,6 +52,32 @@ using Test
@test mean(chain2.μ) ≈ 0.0 atol=0.1
@test mean(chain2.σ) ≈ 1.0 atol=0.1
end

@testset "Adaptive random walk" begin
# Set up our sampler with initial parameters.
spl1 = MetropolisHastings([AdaptiveProposal(Normal(0,.4)), AdaptiveProposal(Normal(0,1.2))])
spl2 = MetropolisHastings((μ=AdaptiveProposal(Normal(0,1.4)), σ=AdaptiveProposal(Normal(0,0.2))))

# Sample from the posterior.
chain1 = sample(model, spl1, 100000; chain_type=StructArray, param_names=["μ", "σ"])
chain2 = sample(model, spl2, 100000; chain_type=StructArray, param_names=["μ", "σ"])

# chn_mean ≈ dist_mean atol=atol_v
@test mean(chain1.μ) ≈ 0.0 atol=0.1
@test mean(chain1.σ) ≈ 1.0 atol=0.1
@test mean(chain2.μ) ≈ 0.0 atol=0.1
@test mean(chain2.σ) ≈ 1.0 atol=0.1
end

@testset "Compare adaptive to simple random walk" begin
data = rand(Normal(2., 1.), 500)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arzwa This might be the problem - you redefine data and hence I assume you implicitly change the density model which is defined as a closure over data.

You could just rename the variable here but actually I think the better approach might be to "fix" the data in the model to avoid any such surprises in the future.

I guess this can be achieved by defining

density = let data = data
  θ -> insupport(θ) ? sum(logpdf.(dist(θ), data)) : -Inf
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I haven't tested it, so make sure it actually fixes the problem 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just saw this too, thanks. I'll check and push an updated test suite. (Actually, we could just as well test against the same data defined above in the test suite, but I find testing against a mean different from 0 a bit more reassuring since the sampler actually has to 'move' to somewhere from where it starts).

m1 = DensityModel(x -> loglikelihood(Normal(x,1), data))
p1 = RandomWalkProposal(Normal())
p2 = AdaptiveProposal(Normal())
c1 = sample(m1, MetropolisHastings(p1), 10000; chain_type=Chains)
c2 = sample(m1, MetropolisHastings(p2), 10000; chain_type=Chains)
@test ess(c2).nt.ess > ess(c1).nt.ess
end

@testset "parallel sampling" begin
spl1 = StaticMH([Normal(0,1), Normal(0, 1)])
Expand Down