Showing 9 changed files with 494 additions and 102 deletions.

---
title: "PITS plus flows"
subtitle: "A method to apply PITS to arbitrary distributions using neural flows"
layout: post
permalink: /pits/flow
scholar:
  bibliography: "pits.bib"
---

Constrained search in the typical set is intractable for arbitrary distributions.
We propose a method to apply PITS to arbitrary distributions using neural flows.

<!-- how easily can we ask: is x in the typical set? -->

% TODO in which cases does $x - f^{-1}(\alpha f(x))$ approximate $\nabla_x p_f(x)$??

In general, the typical set, $\mathcal T_{p(x)}^{\epsilon}$, is intractable to compute for arbitrary continuous distributions.
However, we assume we have access to a flow that maps from clean data to a Gaussian source distribution, $f_{push}: P(X) \to \mathcal N(Y)$.

% (needs proof)
We conjecture that it is possible to use a flow to sample from the typical set of arbitrary distributions (see future work \ref{futurePOTS}).
This can be achieved by exploiting the structure of the flow-based model's Gaussian source distribution.

For Gaussian distributions, the typical set has a simple closed form: an annulus whose radius and thickness depend on the dimension and standard deviation of the Gaussian.
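
For concreteness, for an isotropic Gaussian $\mathcal N(0, \sigma^2 I_d)$ the typical set concentrates on a thin annulus around the sphere of radius $\sigma\sqrt{d}$. As a sketch (the width below is a reparameterisation of the usual $\epsilon$ in the typical-set definition):

\begin{align*}
\mathcal T_{\mathcal N(0, \sigma^2 I_d)}^{\epsilon} \approx \left\{ y \in \mathbb{R}^d : \sigma\sqrt{d}\,(1 - \epsilon) \le \lVert y \rVert_2 \le \sigma\sqrt{d}\,(1 + \epsilon) \right\}
\end{align*}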

% (needs proof)
Projection into the typical set for a Gaussian can be approximated via a radial rescaling of the latent onto the sphere of radius $\sigma\sqrt{d}$, the centre of the annulus.

Thus, we implement POTS as:

\begin{align*}
h = f(y) \tag{forward flow}\\
\hat h = \text{proj}(h) \tag{project onto typical set}\\
\hat x = f^{-1}(\hat h) \tag{backward flow}
\end{align*}
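
A minimal sketch of how this could look in code, assuming a hypothetical `flow` object with `forward` and `inverse` methods that map data to and from a $\mathcal N(0, \sigma^2 I)$ latent (placeholder names, not any particular library's API):

```python
import numpy as np

def project_to_gaussian_typical_set(h, sigma=1.0):
    """Radially rescale h onto the sphere of radius sigma * sqrt(d),
    the centre of the Gaussian annulus (typical set)."""
    d = h.size
    target_radius = sigma * np.sqrt(d)
    return h * (target_radius / np.linalg.norm(h))

def pots(y, flow, sigma=1.0):
    """POTS sketch: map to the latent, project onto the Gaussian typical set, map back.

    y    : observed (corrupted) signal, shape (d,)
    flow : hypothetical object with .forward(x) -> h and .inverse(h) -> x.
    """
    h = flow.forward(y)                                  # forward flow
    h_hat = project_to_gaussian_typical_set(h, sigma)    # project onto typical set
    x_hat = flow.inverse(h_hat)                          # backward flow
    return x_hat
```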

\begin{figure}[H]
\centering
\includegraphics[width=0.75\textwidth]{assets/pots-diagram.png}
\vspace{-1em}
\caption{A diagram of the POTS method. We start with the clean signal $x$, shown as a blue dot. The clean signal is then corrupted to produce the observed signal $y$, shown as a red dot. Next, we project the corrupted signal into the typical set to produce our denoised signal $\hat x$, shown as a green dot. The typical set is shown as a teal annulus. \label{f.A}}
\end{figure}

---
title: "Diffusion posterior sampling"
subtitle: "A review of recent work"
layout: post
permalink: /pits/review-dps
categories:
  - "tutorial"
scholar:
  bibliography: "pits.bib"
---

(_lit review as of 07/2024_)

Given a pretrained diffusion model, we seek to generate conditional samples based on an observed signal $y$.
For example, we may be given a noisy image and tasked with denoising it, or a black-and-white image and tasked with recoloring it.

One approach seeks to augment the dynamics of the pretrained diffusion model. We call these guided diffusion models, after 'guided diffusion' {% cite ho2022classifierfreediffusionguidance %}.

Early approaches were rather heuristic, for example: a mask-based approach {% cite lugmayr_repaint_2022 %}, an SVD-inspired approach {% cite kawar_denoising_2022 %}, and null-space projection {% cite wang_zero-shot_2022 %}.

Next came diffusion posterior sampling (DPS), a more principled approach. It starts by rewriting the diffusion SDE to use the unknown posterior score, $\nabla_x \log p(x \mid y)$, rather than the prior score, $\nabla_x \log p(x)$.

$$
\begin{align*}
dx &= \left[ f(x, t) - g(t)^2 \nabla_x \log p_t(x) \right] dt + g(t) dw \tag{unconditional SDE} \\
dx &= \left[ f(x, t) - g(t)^2 \nabla_x \log p_t(x | y) \right] dt + g(t) dw \tag{conditional SDE} \\
\end{align*}
$$

This allows us to generate samples from the posterior by solving the conditional SDE.
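
As a rough sketch of what "solving the SDE" means numerically, here is an Euler-Maruyama discretisation; `f`, `g` and `score` are hypothetical callables for the drift, diffusion coefficient and (prior or posterior) score, not any specific library's API:

```python
import numpy as np

def reverse_sde_sample(x_T, f, g, score, n_steps=1000, T=1.0):
    """Euler-Maruyama integration of
        dx = [f(x, t) - g(t)^2 * score(x, t)] dt + g(t) dw
    run backwards in time from t = T to t = 0."""
    dt = T / n_steps
    x = x_T.copy()
    for i in range(n_steps):
        t = T - i * dt
        drift = f(x, t) - g(t) ** 2 * score(x, t)        # plug in the prior or posterior score
        noise = g(t) * np.sqrt(dt) * np.random.randn(*x.shape)
        x = x - drift * dt + noise                        # step backwards in time (dt > 0 here)
    return x
```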

But we don't know the score of the posterior, so we use Bayes' rule to rewrite the posterior score in terms of the likelihood and prior scores.

$$
\begin{align*}
p(x | y) &= \frac{p(y | x) p(x)}{p(y)} \\
\log p(x | y) &= \log p(y | x) + \log p(x) - \log p(y) \\
\nabla_x \log p(x | y) &= \nabla_x \log p(y | x) + \nabla_x \log p(x) \\
\end{align*}
$$

DPS {% cite chung_diffusion_2023 %}, $\Pi$GDM {% cite song_pseudoinverse-guided_2023 %} and others have shown that it is possible to construct or approximate $\nabla_x \log p_t(x \mid y)$.
Note that $\Pi$GDM has also been applied to flows {% cite pokle_training-free_2024 %}.

$$
\begin{align*}
\nabla_{x_t} \log p(y | x_t) &\approx -\frac{1}{2\sigma^2} \nabla_{x_t} \parallel y - C(\hat x_t) \parallel^2_2 \tag{DPS} \\
\nabla_{x_t} \log p(y | x_t) &\approx (y - H\hat x_t)^T (r_t^2 H H^T + \sigma^2 I)^{-1} H \frac{\partial \hat x_t}{\partial x_t} \tag{$\Pi$GDM}
\end{align*}
$$

where $\hat x_t$ is the model's estimate of the clean signal given $x_t$, $C$ (or $H$) is the measurement operator, $\sigma$ is the measurement-noise standard deviation, and $r_t^2$ is $\Pi$GDM's estimate of the variance of $x_0$ given $x_t$.
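
In practice the DPS term is computed with automatic differentiation through the denoiser. A minimal sketch, where `denoise` and `forward_op` are hypothetical stand-ins for the pretrained model's $\hat x_t$ estimate and the measurement operator $C$:

```python
import torch

def dps_likelihood_grad(x_t, y, denoise, forward_op, sigma=0.05):
    """Approximate grad_{x_t} log p(y | x_t) in the style of DPS by
    differentiating ||y - C(x_hat)||^2 through the denoiser.
    sigma is the assumed measurement-noise standard deviation."""
    x_t = x_t.detach().requires_grad_(True)
    x_hat = denoise(x_t)                                  # estimate of the clean signal from x_t
    residual = torch.sum((y - forward_op(x_hat)) ** 2)    # ||y - C(x_hat)||^2
    grad = torch.autograd.grad(residual, x_t)[0]
    return -grad / (2 * sigma ** 2)                       # gradient of the Gaussian log-likelihood
```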

In parallel, a variational approach frames the conditional generation problem as an optimization problem {% cite benhamu2024dflowdifferentiatingflowscontrolled mardani_variational_2023 mardani_variational_2023-1 %}.

$$
\begin{align*}
z^* &= \arg \min_z \parallel y - f(D(z)) \parallel^2_2, \qquad x^* = D(z^*) \tag{variational} \\
\end{align*}
$$

where $D$ is the diffusion model, $f$ is the forward model, $z$ is the latent variable and $y$ is the observed signal.
These variational approaches provide high-quality samples, but are computationally expensive (approx. 5-15 minutes for ImageNet-128 on an NVIDIA V100 GPU).
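
A bare-bones sketch of that optimization loop, assuming differentiable callables `D` (the generator) and `f` (the forward model); real implementations add regularisation of $z$ and more careful optimisation:

```python
import torch

def variational_inverse(y, D, f, dim_z, n_steps=500, lr=1e-2):
    """Optimise the latent z so that f(D(z)) matches the observation y,
    then return x* = D(z*). D and f are assumed differentiable callables."""
    z = torch.randn(dim_z, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        loss = torch.sum((y - f(D(z))) ** 2)   # data-fit term ||y - f(D(z))||^2
        loss.backward()                         # differentiate through the generator
        opt.step()
    return D(z).detach()
```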

Finally, {% cite dou_diffusion_2024 %} present a Bayesian filtering perspective which leads to an algorithm that converges to the true posterior.

## Bibliography
{% bibliography --cited %}

---
title: "Projection into the typical set: PITS"
subtitle: "A new approach to solving inverse problems"
layout: post
permalink: /pits
categories:
  - "research"
scholar:
  bibliography: "pits.bib"
---

Advances in generative modelling have shown that we can generate high-quality samples from complex distributions.
A next step is to use these generative models as priors to help solve inverse problems.

Diffusion models {% cite song2021scorebasedgenerativemodelingstochastic %} don't support likelihood estimates, only sample generation. Thus inverse problem solvers resort to sampling from the posterior to generate solutions {% cite chung_diffusion_2023 %}. It's best to think of these solutions as proposals, though, as there is no guarantee on quality or accuracy.

Neural flows {% cite albergo_stochastic_2023 liu_flow_2022 lipman_flow_2023 %} have recently achieved state-of-the-art results {% cite esser2024scalingrectifiedflowtransformers %} and do support likelihood estimates. They can be used to find a local maximum of the posterior {% cite benhamu2024dflowdifferentiatingflowscontrolled %}. However, differentiating through a flow is extremely expensive.

So, solving inverse problems via a principled approach like MAP is not quite possible with state-of-the-art generative models.
Maybe we can provide a viable alternative.

***

Inverse problems are a class of problems where we want to find the input to a function given its output. For example (within generative machine learning) we care about:

- image recoloring, where we want to find the original image given a black-and-white image;
- image inpainting, where we want to find the original image given an image with a hole in it;
- speech enhancement, where we want to find the clean speech given the noisy speech.

We consider the setting where we have access to a prior $p(x)$ (e.g. normal, clear speech) and a likelihood function $p(y \mid x)$ (the environment adding background noise and interference). We observe $y$ and want to recover $x$.

Using Bayes' rule, we can write the posterior and our goal as:

$$
\begin{align*}
p(x | y) &= \frac{p(y | x) p(x)}{p(y)} \tag{posterior} \\
x^* &= \arg \max_x p(x | y) \tag{the MAP solution}
\end{align*}
$$

> MAP will return the most likely value of $x$, given $y$.
However, is the most likely value of $x$ the 'best' guess of $x$?

We offer an alternative approach, suggesting that our guess of $x$ should be typical of the prior.
We write this as:

$$
\begin{align*}
x^* &= \arg \max_{x \in \mathcal T(p(x))_\epsilon} p(y | x) \tag{PITS}
\end{align*}
$$

where $\mathcal T(p(x))_\epsilon$ is the $\epsilon$-typical set of $p(x)$. Thus we have Projection Into the Typical Set (PITS).
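
To make the objective concrete, here is a toy sketch of PITS for an isotropic Gaussian prior $\mathcal N(0, \sigma^2 I_d)$ and a linear-Gaussian likelihood, solved by projected gradient ascent. The projection onto the typical set is approximated by rescaling onto the radius-$\sigma\sqrt{d}$ sphere; this is an illustration only, not the general method:

```python
import numpy as np

def pits_gaussian_toy(y, H, sigma_prior=1.0, sigma_noise=0.1, n_steps=200, lr=0.1):
    """Toy PITS: maximise the likelihood p(y | x) = N(y; Hx, sigma_noise^2 I)
    subject to x lying (approximately) in the typical set of the prior
    N(0, sigma_prior^2 I_d), taken here as the sphere of radius sigma_prior * sqrt(d)."""
    d = H.shape[1]
    radius = sigma_prior * np.sqrt(d)
    x = np.random.randn(d) * sigma_prior
    for _ in range(n_steps):
        grad = H.T @ (y - H @ x) / sigma_noise ** 2   # gradient of log p(y | x)
        x = x + lr * grad                              # ascent step on the log-likelihood
        x = x * (radius / np.linalg.norm(x))           # project back onto the typical set
    return x

# usage sketch: y = H @ x_true + noise; x_hat = pits_gaussian_toy(y, H)
```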

<!-- Note: This assumes we are working in high enough dimensions that the typical set has concentrated and any sample from the prior is very likely to be typical. -->

I wrote a few posts to help you understand PITS:

[1.]({{ site.baseurl }}/pits/typical) Background on typicality \
[2.]({{ site.baseurl }}/pits/map) A simple worked example showing that MAP produces solutions that are not typical. \
[3.]({{ site.baseurl }}/pits/non-typical) (WIP) Does it matter if solutions are not typical? \
[4.]({{ site.baseurl }}/pits/flow) (WIP) A method to apply PITS to arbitrary distributions (using neural flows). \
[5.]({{ site.baseurl }}/pits/flow-theory) (WIP) Theory showing that in the Gaussian case, PITS combined with flows is principled. \
[6.]({{ site.baseurl }}/pits/mnist-demo) (WIP) A demonstration of the PITS approach to inverse problems applied to neural flows. \
[7.]({{ site.baseurl }}/pits/review-dps) A brief review of methods attempting to solve inverse problems using state-of-the-art generative models.

<!-- A main advantage of the PITS approach is that it provides a way to control the quality (/typicality) of the solutions. -->

<!-- what if the true x is not typical? -->
<!-- why not find the MAP solution and then project it into the typical set? -->

<!-- why is it a problem if my diffusion model produces samples that are not typical? -->

## Bibliography

{% bibliography --cited %}

***

These ideas were generated while studying at [VUW](https://www.wgtn.ac.nz/) with [Bastiaan Kleijn](https://people.wgtn.ac.nz/bastiaan.kleijn) and [Marcus Frean](https://people.wgtn.ac.nz/marcus.frean). I was funded by [GN](https://www.gn.com/).