- What is score-based generative modeling?
- What is a Schrödinger Bridge?
- Two dimensional examples
- MNIST dataset
- CelebA exploration
- Dataset interpolation

Let us recall some of the basic concepts of score-based generative modeling; see also Yang Song's blog post for an introduction, or the original papers [2,3]. Assume that we have access to an (Ornstein-Uhlenbeck) diffusion $$ \mathrm{d} \mathbf{X}_t = -\alpha \mathbf{X}_t \mathrm{d} t + \sqrt{2} \mathrm{d} \mathbf{B}_t \ , \qquad \mathbf{X}_0 \sim p_{\mathrm{data}} \ ,$$ with $\alpha > 0$. The process $(\mathbf{X}_t)_{t \in [0,T]}$ is interpreted as a noising process: it progressively transforms samples from $p_{\mathrm{data}}$ into approximate samples from a Gaussian distribution $p_{\mathrm{prior}}$. The generative model is then obtained by reversing these dynamics in time.
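As a sanity check, this noising diffusion can be simulated with a simple Euler-Maruyama scheme. Below is a minimal NumPy sketch (not the paper's implementation; the toy dataset, horizon and step count are illustrative): for $\alpha = 1$ the stationary distribution is $\mathcal{N}(0, \mathrm{Id})$, so after a long enough horizon the samples are approximately standard Gaussian regardless of $p_{\mathrm{data}}$.

```python
import numpy as np

def ou_noising(x0, alpha=1.0, T=5.0, n_steps=100, rng=None):
    """Euler-Maruyama simulation of dX_t = -alpha X_t dt + sqrt(2) dB_t."""
    rng = np.random.default_rng(0) if rng is None else rng
    dt = T / n_steps
    x = x0.copy()
    for _ in range(n_steps):
        x += -alpha * x * dt + np.sqrt(2 * dt) * rng.standard_normal(x.shape)
    return x

# toy "data": points on a circle of radius 5 (a stand-in for p_data)
theta = np.random.default_rng(1).uniform(0, 2 * np.pi, size=(10_000,))
x0 = 5.0 * np.stack([np.cos(theta), np.sin(theta)], axis=1)
xT = ou_noising(x0, alpha=1.0, T=5.0)
# for alpha = 1, xT is approximately N(0, Id): mean close to 0, variance close to 1
```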

Image extracted from Yang Song's blog: generative model for CelebA.

In the next paragraph we introduce our main contribution:

The Schrödinger Bridge (SB) problem is a classical problem appearing in applied mathematics, optimal control and probability; see [5, 6]. It takes the following (dynamic) form. Consider as reference the diffusion $(\mathbf{X}_t)_{t \in [0,T]}$ with distribution $\mathbb{P}$, describing the process adding noise to the data. We aim to find the path measure $\pi^\star$ satisfying $\pi^\star_0 = p_{\mathrm{data}}$ and $\pi^\star_T = p_{\mathrm{prior}}$ which minimizes the Kullback-Leibler divergence to $\mathbb{P}$. More precisely, $$ \pi^\star = \mathrm{argmin} \ \{\mathrm{KL}(\pi|\mathbb{P})\ , \ \pi_0 = p_{\mathrm{data}} \ , \ \pi_T = p_{\mathrm{prior}}\} \ . $$ In this work we introduce Diffusion Schrödinger Bridge (DSB), an approximation of the Iterative Proportional Fitting (IPF) procedure which solves this problem.
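The Iterative Proportional Fitting iterations alternate projections onto the two marginal constraints. Initializing with $\pi^0 = \mathbb{P}$, $$ \pi^{2n+1} = \mathrm{argmin} \ \{\mathrm{KL}(\pi|\pi^{2n})\ , \ \pi_T = p_{\mathrm{prior}}\} \ , \qquad \pi^{2n+2} = \mathrm{argmin} \ \{\mathrm{KL}(\pi|\pi^{2n+1})\ , \ \pi_0 = p_{\mathrm{data}}\} \ , $$ so that each iterate exactly matches one of the two endpoint distributions while staying as close as possible to the previous iterate.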

We now show how this problem is related to score-based generative modeling. Assume that $\pi^{2n}$ is the measure associated with the diffusion $$ \mathrm{d} \mathbf{X}_t^n = f_t^n(\mathbf{X}_t^n) \mathrm{d} t + \sqrt{2} \mathrm{d} \mathbf{B}_t \ , \quad \mathbf{X}_0^n \sim p_{\mathrm{data}} \ . $$ Then we show that $(\pi^{2n+1})^R$ (where $R$ denotes the time-reversal operation) is associated with the diffusion $$ \mathrm{d} \mathbf{Y}_t^n = b_{T-t}^n(\mathbf{Y}_t^n) \mathrm{d} t + \sqrt{2} \mathrm{d} \mathbf{B}_t \ , \quad \mathbf{Y}_0^n \sim p_{\mathrm{prior}} \ , $$ where $$ b_{t}^n(x) = -f_t^n(x) + 2 \nabla \log p_t^n(x) \ , $$ with $p_t^n$ the density of $\pi_t^{2n}$. Repeating this procedure, we obtain that $\pi^{2n+2}$ is associated with the diffusion $$ \mathrm{d} \mathbf{X}_t^{n+1} = f_t^{n+1}(\mathbf{X}_t^{n+1}) \mathrm{d} t + \sqrt{2} \mathrm{d} \mathbf{B}_t \ , \quad \mathbf{X}_0^{n+1} \sim p_{\mathrm{data}} \ , $$ where $$ f_{t}^{n+1}(x) = -b_t^n(x) + 2 \nabla \log q_t^n(x) \ , $$ with $q_t^n$ the density of $\pi_t^{2n+1}$. We then iterate this procedure. Of course, we cannot sample from these dynamics directly, so we discretize them using an Euler-Maruyama approximation. The logarithmic gradients are then approximated using score-matching techniques [7] (although, for memory reasons, we do not approximate the scores themselves but the drifts of the discretized dynamics).
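To illustrate the time-reversal formula $b_t(x) = -f_t(x) + 2 \nabla \log p_t(x)$, here is a toy NumPy check in one dimension where the data is Gaussian, so the score is available in closed form (in DSB the drift is of course learned by a network; all constants below are illustrative). Starting the backward dynamics from the noised marginal $p_T$ should recover the data distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, T, n_steps = 1.0, 2.0, 200
dt = T / n_steps
s2 = 4.0  # variance of the 1-d Gaussian "dataset"

def var_t(t):
    """Marginal variance of the OU diffusion started from N(0, s2)."""
    return s2 * np.exp(-2 * alpha * t) + (1 - np.exp(-2 * alpha * t)) / alpha

# start the backward dynamics from the noised marginal p_T
y = np.sqrt(var_t(T)) * rng.standard_normal(50_000)
for k in range(n_steps):
    t = T - k * dt                 # current "forward" time
    score = -y / var_t(t)          # exact score of the Gaussian marginal p_t
    drift = alpha * y + 2 * score  # b_t(x) = -f_t(x) + 2 grad log p_t(x)
    y += drift * dt + np.sqrt(2 * dt) * rng.standard_normal(y.shape)
# the backward samples should have variance close to the data variance s2 = 4
```

Here the forward drift is $f_t(x) = -\alpha x$, so $-f_t(y) = \alpha y$ in the backward drift; with an exact score, the only error left is the Euler-Maruyama discretization.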

- Contrary to existing score-based generative modeling methods, which require the reference process to converge to $p_{\mathrm{prior}}$, the convergence of our algorithm is determined by the convergence of the IPF.
- The DSB algorithm can be used on top of existing algorithms. Hence, our method can be seen as a refinement of original score-based generative models, and all techniques used to improve the quality/speed of these methods can be implemented for DSB.
- DSB does not require $p_{\mathrm{prior}}$ to be Gaussian. In fact we only require having access to samples from $p_{\mathrm{prior}}$ which can be another dataset. In particular we are able to perform dataset interpolation. This paves the way for further applications for high dimensional optimal transport.

- DSB does not achieve state-of-the-art generative modeling results (yet) due to compute limitations, as we use existing architectures in order to parameterize our score approximations. These architectures are deep and notably unstable (for instance, they require the use of an exponential moving average). This requires careful selection of the parameters of DSB.

In the next sections we show some of our results on two dimensional examples, MNIST, CelebA and dataset interpolation.

For each row, we show the target density on the left and an animated plot of the DSB iterations on the right. Here the prior density $p_{\mathrm{prior}}$ is a Gaussian density with zero mean and covariance matrix $\sigma \mathrm{Id}$, where $\sigma$ is the variance computed from the target dataset. We fix the number of discretization steps to $20$.
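A minimal sketch of how such a prior can be built from a dataset (the toy circle target and variable names are illustrative, not the paper's code): the scalar variance of the target samples sets the scale of the isotropic Gaussian prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy 2-d target: noisy points on a circle (stand-in for the 2-d datasets)
theta = rng.uniform(0, 2 * np.pi, size=5_000)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)
data += 0.05 * rng.standard_normal((5_000, 2))

# Gaussian prior N(0, sigma * Id) with sigma the variance of the target data
sigma = data.var()
prior_samples = np.sqrt(sigma) * rng.standard_normal(data.shape)
```

Matching the prior scale to the data keeps the two endpoint distributions at comparable magnitudes, which makes the bridge easier to learn.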

First, we show some of the samples obtained with our DSB algorithm (using 30 steps in the backward dynamics).

Original dataset (left) and generated samples (right).

In our work we apply DSB for the generation of CelebA.

Here we show some latent space exploration. The Gaussian random variables in the generative model are fixed, and therefore the transformation is deterministic. In this animation we follow an Ornstein-Uhlenbeck process in the latent space and observe its transformation under this deterministic mapping in image space.
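A sketch of such a latent-space walk (the latent dimension, step size, and helper name are illustrative assumptions): the exact OU transition keeps every frame distributed as $\mathcal{N}(0, \mathrm{Id})$, so each latent code along the path is "typical" and the resulting image sequence varies smoothly.

```python
import numpy as np

def latent_walk(dim, n_frames, theta=0.1, rng=None):
    """Ornstein-Uhlenbeck walk in latent space: a smooth path whose
    marginals stay exactly N(0, Id)."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.standard_normal(dim)
    frames = [z.copy()]
    for _ in range(n_frames - 1):
        # exact OU transition, leaving the standard Gaussian invariant
        decay = np.exp(-theta)
        z = decay * z + np.sqrt(1 - decay**2) * rng.standard_normal(dim)
        frames.append(z.copy())
    return np.stack(frames)

path = latent_walk(dim=128, n_frames=50)
# each frame would then be pushed through the fixed (deterministic)
# backward dynamics to obtain the corresponding image
```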

Finally, we present some dataset interpolation experiments in two dimensions.

[1] Prafulla Dhariwal and Alex Nichol

[2] Yang Song and Stefano Ermon

[3] Jonathan Ho, Ajay Jain and Pieter Abbeel

[4] Hans Föllmer

[5] Christian Léonard

[6] Yongxin Chen, Tryphon Georgiou and Michele Pavon

[7] Aapo Hyvärinen and Peter Dayan

[8] Gabriel Peyré and Marco Cuturi