---
---
References
==========

@misc{baydin2018automaticdifferentiationmachinelearning,
  title         = {Automatic differentiation in machine learning: a survey},
  author        = {Atilim Gunes Baydin and Barak A. Pearlmutter and Alexey Andreyevich Radul and Jeffrey Mark Siskind},
  year          = {2018},
  eprint        = {1502.05767},
  archivePrefix = {arXiv},
  primaryClass  = {cs.SC},
  url           = {https://arxiv.org/abs/1502.05767},
}

@article{JMLR:v21:19-346,
  author  = {Shakir Mohamed and Mihaela Rosca and Michael Figurnov and Andriy Mnih},
  title   = {Monte Carlo Gradient Estimation in Machine Learning},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {132},
  pages   = {1--62},
  url     = {http://jmlr.org/papers/v21/19-346.html},
}
---
title: "Automatic integration"
subtitle: "Where is it?"
layout: post
categories:
  - "proposal"
scholar:
  bibliography: "auto-int.bib"
---

Automatic differentiation has proven its utility, especially within machine learning {% cite baydin2018automaticdifferentiationmachinelearning %}.
By facilitating the training of deep neural networks, it has become a staple of the field.

However, as far as I know, there is no analog for automatic integration.
Why not?

***

<!-- uses / motivation -->

Firstly, what kind of problems could automatic integration solve?

<!-- bayesian -->
Consider Bayesian inference, which relates two distributions and requires an integral for the evidence $p(x)$:

$$
\begin{align}
p(y \mid x) &= \frac{p(x \mid y)\, p(y)}{p(x)} \\
p(x) &= \int p(x \mid y)\, p(y)\, dy
\end{align}
$$

For example, we might have a prior distribution over likely diseases $p(y)$ and a likelihood function $p(x \mid y)$ which tells us how likely the observed MRI $x$ is given the disease $y$.
Here $p(y)$ is derived from population statistics, and $p(x \mid y)$ is a generative model learned from data.
<!-- y = discrete variable: 1000 different potential diseases -->
<!-- y = continuous variable: how long the patient is likely to survive / how long until surgery is needed, measured in weeks, e.g. 10.24 weeks -->
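
To make the discrete case concrete, here is a minimal NumPy sketch of computing the evidence $p(x)$ and the posterior $p(y \mid x)$. The three hypothetical diseases, their prior probabilities, and the likelihood values are invented purely for illustration; in practice $p(x \mid y)$ would be a learned generative model. For a continuous $y$, the sum would be replaced by a quadrature or Monte Carlo approximation of the integral.

```python
# Minimal sketch: marginal likelihood and posterior for a discrete y.
# All numbers are illustrative assumptions, not real clinical statistics.
import numpy as np

prior = np.array([0.70, 0.25, 0.05])       # p(y): population statistics over 3 hypothetical diseases
likelihood = np.array([0.02, 0.40, 0.90])  # p(x | y): how likely the observed MRI x is under each disease

evidence = np.sum(likelihood * prior)       # p(x) = sum_y p(x | y) p(y)
posterior = likelihood * prior / evidence   # p(y | x) by Bayes' rule

print(evidence)                             # marginal likelihood of the MRI
print(posterior, posterior.sum())           # posterior over diseases, sums to 1
```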

More generally, many quantities of interest in machine learning take the form of an expectation,

$$
\int p(x; \theta) f(x; \phi) \, dx
$$

where $p(x; \theta)$ is a probability distribution and $f(x; \phi)$ is a function {% cite JMLR:v21:19-346 %}.
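
In practice, expectations of this form are usually not integrated automatically but estimated by Monte Carlo, as surveyed in the cited paper. A minimal sketch follows; taking $p(x; \theta)$ to be the Gaussian $\mathcal{N}(\theta, 1)$ and $f(x; \phi) = (x - \phi)^2$ is purely an illustrative assumption.

```python
# Minimal sketch: Monte Carlo estimate of  E_{x ~ p(x; theta)}[f(x; phi)].
# The Gaussian p and quadratic f are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
theta, phi = 1.5, 0.3

def f(x, phi):
    return (x - phi) ** 2                   # f(x; phi)

samples = rng.normal(loc=theta, scale=1.0, size=100_000)  # x ~ N(theta, 1)
estimate = f(samples, phi).mean()           # (1/N) sum_i f(x_i; phi)

print(estimate)                             # close to (theta - phi)**2 + 1 for this choice
```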

Another example is the wave equation,

$$
\frac{\partial^2 u}{\partial t^2} = c^2 \Delta u
$$

where $u$ is a function and $\Delta$ is the Laplacian operator.
And, in general, any PDE.
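
Today such equations are integrated numerically rather than automatically. For comparison, here is a minimal finite-difference sketch of the one-dimensional wave equation $\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$; the wave speed, grid, time step, and Gaussian initial pulse are all illustrative assumptions.

```python
# Minimal sketch: explicit finite-difference (leapfrog) integration of the
# 1-D wave equation u_tt = c^2 u_xx with fixed ends. All parameters are illustrative.
import numpy as np

c, L, nx = 1.0, 1.0, 201
dx = L / (nx - 1)
dt = 0.5 * dx / c                            # Courant number 0.5, stable
x = np.linspace(0.0, L, nx)

u_prev = np.exp(-200.0 * (x - 0.5) ** 2)     # initial displacement: Gaussian pulse
u = u_prev.copy()                            # zero initial velocity

for _ in range(400):
    u_next = np.empty_like(u)
    u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                    + (c * dt / dx) ** 2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
    u_next[0] = u_next[-1] = 0.0             # fixed (Dirichlet) boundaries
    u_prev, u = u, u_next

print(float(u.max()), float(u.min()))        # the pulse has split, travelled and reflected
```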

<!-- what is AD? -->

The key to automatic differentiation is the chain rule.
It allows us to compute the derivative of a function by breaking it down into a sequence of elementary operations.
Calculating their derivatives independently, we can then combine them to get the derivative of the original function.
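
As a toy illustration of that idea (not a full AD system), forward-mode automatic differentiation can be written in a few lines with dual numbers: every elementary operation propagates a value together with its derivative, and the chain and product rules combine them.

```python
# Toy forward-mode AD with dual numbers: each Dual carries a value and the
# derivative of that value with respect to the chosen input.
import math
from dataclasses import dataclass

@dataclass
class Dual:
    val: float  # f(x)
    dot: float  # f'(x)

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule: (f g)' = f' g + f g'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def sin(d):
    # chain rule: (sin f)' = cos(f) * f'
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

# differentiate g(x) = x * sin(x) + x at x = 2 by seeding dx/dx = 1
x = Dual(2.0, 1.0)
g = x * sin(x) + x
print(g.val, g.dot)  # g(2) and g'(2) = sin(2) + 2*cos(2) + 1
```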

<!-- current state of integration tools -->

## Bibliography

{% bibliography --cited %}