Author: | Jonas El Gammal, Jesus Torrado, Nils Schoeneberg and Christian Fidler |
---|---|
Source: | Source code on GitHub |
Documentation: | Documentation on Read the Docs |
License: | LGPL + mandatory bug reporting asap + mandatory arXiv'ing of publications using it (see LICENSE for exceptions). The documentation is licensed under the GFDL. |
Support: | For questions, drop us an email. For issues/bugs please use GitHub's Issues.
Installation: | pip install gpry (for MPI and nested samplers, see here) |
GPry is a drop-in alternative to traditional Monte Carlo samplers (such as MCMC or Nested Sampling) for likelihood-based inference. It aims at speeding up posterior exploration and the inference of marginal quantities from computationally expensive likelihoods, reducing the cost of inference by a factor of 100 or more.
GPry can be installed with pip (`python -m pip install gpry`), and needs only a callable likelihood and some parameter bounds:
```python
def log_likelihood(x, y):
    return [...]

bounds = [[..., ...], [..., ...]]

from gpry import Runner

runner = Runner(log_likelihood, bounds, checkpoint="output/")
runner.run()
```
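For instance, a complete, runnable toy version of the above (the Gaussian likelihood and the bounds are illustrative choices, not part of GPry) could look like this:

```python
from gpry import Runner

# Toy target: an uncorrelated 2d Gaussian centred at the origin.
def log_likelihood(x, y):
    return -0.5 * (x**2 + y**2)

# Prior bounds per parameter, chosen wide enough to contain the mode.
bounds = [[-10, 10], [-10, 10]]

runner = Runner(log_likelihood, bounds, checkpoint="output/")
runner.run()
```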
An interface to the Cobaya sampler is available, for richer model specification and direct access to some physical likelihood pipelines.
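As a rough sketch of what that can look like (this assumes that `Runner` also accepts a Cobaya model object — check the documentation for the exact interface; the likelihood and priors below are illustrative):

```python
# Sketch of the Cobaya interface; Runner(model, ...) is an assumption here.
from cobaya.model import get_model
from gpry import Runner

info = {
    # External likelihood: Cobaya infers the parameters from the signature.
    "likelihood": {"toy_gauss": lambda x, y: -0.5 * (x**2 + y**2)},
    "params": {
        "x": {"prior": {"min": -10, "max": 10}},
        "y": {"prior": {"min": -10, "max": 10}},
    },
}

model = get_model(info)
runner = Runner(model, checkpoint="output/")
runner.run()
```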
GPry was developed as part of J. El Gammal's M.Sc. and Ph.D. thesis projects.
GPry uses a Gaussian Process (GP) to create an interpolating model of the log-posterior density function, using as few evaluations as possible. It achieves that using active learning: starting from a minimal set of training samples, the next ones are chosen so that they maximise the information gained on the posterior shape. For more details, see section How GPry works of the documentation, and check out the GPry papers (see below).
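The following is a schematic sketch of such an active-learning loop — not GPry's actual implementation — using plain scikit-learn and a naive random-candidate acquisition rule for illustration:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def log_posterior(x):
    # Expensive target: toy 2d Gaussian as an illustrative stand-in.
    return -0.5 * np.sum(x**2)

rng = np.random.default_rng(0)
bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])

# Start from a minimal set of random training samples.
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))
y = np.array([log_posterior(x) for x in X])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
for _ in range(30):
    gp.fit(X, y)
    # Crude acquisition: among random candidates, pick the point where
    # predicted posterior mass times GP uncertainty is largest
    # (a simple exploration-vs-exploitation trade-off).
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(256, 2))
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(np.exp(mu - mu.max()) * sigma)]
    X = np.vstack([X, x_next])
    y = np.append(y, log_posterior(x_next))
```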
GPry introduces some innovations with respect to previous similar approaches:
- It imposes weakly informative priors on the target function, based on a comparison with an n-dimensional Gaussian, and uses that information for, e.g., convergence metrics and balancing exploration vs. exploitation.
- It introduces a parallelizable batch acquisition algorithm (NORA) which increases robustness, reduces overhead and enables the evaluation of the likelihood/posterior in parallel using multiple cores.
- Complementing the GP model, it implements an SVM classifier that learns the shape of uninteresting regions, where the value of the likelihood is very low (for increased efficiency) or undefined (for increased robustness), and discards proposals that fall inside them (see the sketch after this list).
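A minimal sketch of that last idea, using a plain scikit-learn SVM (not GPry's actual classifier) to learn a "finite vs. too-low/undefined" boundary and filter proposals:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Toy training samples and their log-posterior values; NaN marks failed
# (undefined) evaluations, very negative values mark uninteresting regions.
X = rng.uniform(-5, 5, size=(200, 2))
logp = np.where(np.abs(X[:, 0]) < 3, -0.5 * np.sum(X**2, axis=1), np.nan)

# Label each point: True = interesting (finite and not too far below the
# current maximum), False = discard (undefined or very low).
finite = np.isfinite(logp)
labels = finite & (np.where(finite, logp, -np.inf) > np.nanmax(logp) - 20)

clf = SVC(kernel="rbf").fit(X, labels)

# Discard proposals that the classifier flags as uninteresting.
proposals = rng.uniform(-5, 5, size=(1000, 2))
kept = proposals[clf.predict(proposals).astype(bool)]
```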
At the moment, GPry utilizes a modification of the CPU-based scikit-learn GP implementation.
What works well:
- Non-stochastic log-probability density functions that are smooth, up to a small amount of (deterministic) numerical noise (less than ~0.1 in the log-posterior).
- Computationally expensive likelihoods, so that the GPry overhead is subdominant with respect to the posterior evaluation. How slow is slow enough depends on the dimensionality and the expected shape of the posterior, but as a rule of thumb: if an MCMC would take longer to converge than you are willing to wait, give GPry a shot.
- Low-dimensional parameter spaces (fewer than ~20 dimensions as a rule of thumb). In higher dimensions you may still gain considerably in speed if your likelihood is sufficiently slow, but the computational overhead of the algorithm grows considerably.
What may not work so well:
- Highly multimodal posteriors, especially if the separation between modes is large.
- Highly non-Gaussian posteriors, which are not well modelled by constant correlation lengths along each dimension.
GPry is under active development aimed at mitigating some of these issues, so look out for new versions!
Please check out the Strategy and Troubleshooting page, or get in touch for issues or more general discussions.
If you use GPry, please cite the following papers:
- arXiv:2211.02045 for the core algorithm.
- arXiv:2305.19267 for the NORA Nested-Sampling acquisition engine.