# Schedule for: 21w5009 - Statistical Aspects of Non-Linear Inverse Problems (Online)

Beginning on Sunday, October 31 and ending on Friday, November 5, 2021

All times in Banff, Alberta time, MDT (UTC-6).

Sunday, October 31 | |
---|---|

09:00 - 10:00 | place holder (Online) |

Monday, November 1 | |
---|---|

09:15 - 09:30 |
Introduction and Welcome by BIRS Staff ↓ A brief introduction to BIRS with important logistical information, technology instruction, and opportunity for participants to ask questions. (Zoom) |

09:30 - 10:10 |
Richard Nickl: Bayesian non-linear inversion: progress and challenges ↓ Solving non-linear inverse problems in a modern ‘data-science’ framework requires a statistical formulation of the measurement and error process. Since the seminal work of Andrew Stuart (2010), the Bayesian approach has become a popular computational and inferential tool in this context, and more recently also a theoretical understanding of the performance of these methods has been developed. We review recent mathematical progress in this field and formulate analytical properties that may render inverse problems provably ‘solvable’ by Bayesian algorithms. This leads on to many open problems both in the area of PDEs and inverse problems and in Bayesian nonparametric statistics, and we will describe some of those if time permits. (Zoom) |

10:20 - 11:00 |
Mikko Salo: Instability mechanisms in inverse problems ↓ Many inverse and imaging problems, such as image deblurring or electrical/optical tomography, are known to be highly sensitive to noise. In these problems small errors in measurements may lead to large errors in reconstructions. Such problems are called ill-posed or unstable, as opposed to being well-posed (a notion introduced by J. Hadamard in 1902). Instability also affects the performance of statistical algorithms for solving inverse problems.
The inherent reason for instability is easy to understand in linear inverse problems like image deblurring. For more complicated nonlinear imaging problems the instability issue is more delicate. We will discuss a general framework for understanding instability in inverse problems based on smoothing/compression properties of the forward map together with estimates for entropy and capacity numbers in relevant function spaces. The methods apply to various inverse problems involving general geometries and low regularity coefficients.
This talk is based on joint work with Herbert Koch (Bonn) and Angkana Rüland (Heidelberg). (Zoom) |

11:05 - 11:10 |
Group Photo ↓ Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo! (Zoom) |

11:30 - 12:10 |
Bamdad Hosseini: Solving and Learning Nonlinear PDEs with Gaussian Processes ↓ In this talk I present a simple, rigorous, and interpretable framework for solution of nonlinear PDEs based on the framework of Gaussian Processes. The proposed approach provides a natural generalization of kernel methods to nonlinear PDEs; has guaranteed convergence; and inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. I will outline our approach by focusing on an example nonlinear elliptic PDE followed by further numerical examples.
I will also briefly comment on extending our approach to solving inverse problems. (Zoom) |

Tuesday, November 2 | |
---|---|

09:30 - 10:10 |
Christoph Schwab: Multilevel approximation of Gaussian random fields: Covariance compression, estimation and spatial prediction ↓ Joint with
H. Harbrecht (Basle, Switzerland)
L. Herrmann (RICAM, Linz, Austria)
K. Kirchner (TU Delft, The Netherlands)
Centered Gaussian random fields (GRFs) indexed by compacta such as smooth, bounded domains in Euclidean space or smooth, compact and orientable manifolds are determined by their covariance operators. We analyze the numerical approximation of centered GRFs given sample-wise as variational solutions to coloring operator equations driven by spatial white noise. Admissible coloring operators are elliptic, self-adjoint from the Hörmander class comprising the Matérn covariances. Precision and covariance operators can be represented via multiresolutions as bi-infinite matrices. Finite sections may be diagonally preconditioned, rendering the condition number independent of the dimension p of this section.
Tapering by thresholding as in [Bickel, P.J. and Levina, E., Covariance regularization by thresholding, Ann. Statist., 36 (2008), 2577–2604] applied on finite sections of the bi-infinite precision and covariance matrices results in optimally numerically sparse approximations.
“Numerical sparsity” signifies that, asymptotically, the number of nonzero matrix entries grows linearly with the number p of GRF parameters.
The tapering strategy is non-adaptive and locations of these nonzero matrix entries are known a priori.
Analysis of the relative size of the entries of the tapered covariance matrices motivates novel, multilevel Monte Carlo (MLMC) oracles for covariance estimation, with sample complexity that scales log-linearly with respect to the number p of parameters, extending [Bickel, P.J. and Levina, E., Regularized Estimation of Large Covariance Matrices, Ann. Statist., 36 (2008), pp. 199–227] to estimation of (finite sections of) pseudodifferential covariances for GRFs by this fast MLMC method.
Preprint: https://math.ethz.ch/sam/research/reports.html?id=951 (Zoom) |

10:20 - 11:00 |
Barbara Kaltenbacher: Reduced, all-at-once, and variational formulations of inverse problems and their solution ↓ The conventional way of formulating inverse problems, such as identification of a (possibly infinite-dimensional) parameter, is via some forward operator, which is the concatenation of the observation operator with the parameter-to-state map for the underlying model.
Recently, all-at-once formulations have been considered as an alternative to this reduced formulation, avoiding the use of a parameter-to-state map, which would sometimes lead to too restrictive conditions. Here the model and the observation are considered simultaneously as one large system with the state and the parameter as unknowns.
A still more general formulation of inverse problems, containing both the reduced and the all-at-once formulation, but also the well-known and highly versatile so-called variational approach (not to be confused with variational regularization) as special cases, is to formulate the inverse problem as a minimization problem (instead of an equation) for the state and parameter. Regularization can be incorporated by imposing constraints and/or adding regularization terms to the objective.
In this talk we will provide some new application examples of minimization based formulations, such as impedance tomography with the complete electrode model. Moreover, we will consider iterative regularization methods resulting from the application of gradient or Newton type iterations to such minimization based formulations and provide convergence results. (Zoom) |

11:30 - 12:10 |
Youssef Marzouk: Dimension reduction in nonlinear Bayesian inverse problems ↓ In many inverse problems, data may inform only a relatively low-dimensional subspace of the parameter space. In the Bayesian setting, this intuition corresponds to a “low-dimensional update” from prior to posterior. Characterizing this update can help to accelerate MCMC sampling of the exact posterior, or to construct more tractable posterior approximations that are useful in their own right.
To this end, I will discuss a dimension reduction technique for Bayesian inverse problems with nonlinear forward operators, certain non-Gaussian priors, and non-Gaussian observation noise. The likelihood function is approximated by a ridge function, i.e., a map which depends non-trivially only on a few linear combinations of the parameters. We build this ridge approximation by minimizing an upper bound on the Kullback–Leibler divergence between the posterior distribution and its approximation. This bound, obtained via logarithmic Sobolev inequalities, allows one to control the error of the posterior approximation. We will demonstrate iterative algorithms for constructing these approximations and compare the results with other dimension reduction approaches.
Time permitting, I will also describe how this approach to dimension reduction enables efficient variational approximations of the posterior distribution as the pushforward of a simple/tractable reference distribution by a deterministic map, by enforcing structure on the associated transport map or flows.
This is joint work with O. Zahm, T. Cui, K. Law, and A. Spantini. (Zoom) |

Wednesday, November 3 | |
---|---|

09:30 - 10:10 |
Robert Scheichl: Efficient Sample-Based Inference Algorithms in High Dimensions ↓ General multivariate distributions are notoriously difficult to sample from, particularly the high-dimensional posterior distributions in PDE-constrained inverse problems. In this talk, I present a sampler for arbitrary continuous multivariate distributions based on low-rank surrogates in the tensor-train (TT) format, a methodology for scalable, high-dimensional function approximation from computational physics and chemistry. Building upon cross approximation algorithms in linear algebra, we construct a TT approximation to the target probability density function using only a small number of function evaluations. For sufficiently smooth distributions, the storage required for accurate TT approximations is moderate, scaling linearly with dimension. In turn, the structure of the tensor-train surrogate allows sampling by an efficient conditional distribution method, since marginal distributions are computable with linear complexity in dimension. I will also highlight the link to normalizing flows in machine learning and to transport-based variational inference algorithms for high-dimensional distributions. Finally, I will mention extensions suitable for more strongly concentrating posterior distributions using a multi-layered approach: the Deep Inverse Rosenblatt Transport (DIRT) algorithm proposed by Cui and Dolgov in a recent preprint. This talk is based on joint work with Karim Anaya-Izquierdo (Bath), Tiangang Cui (Monash), Sergey Dolgov (Bath) and Colin Fox (Otago). (Zoom) |

10:20 - 11:00 |
Judith Rousseau: On some Bayesian inverse problems in mixture models ↓ Joint work with C. Scricciolo (Univ. Verona) and Dan Moss (Univ. Oxford).
In this work I will discuss two families of statistical inverse problems in mixture models, using Bayesian approaches. The first concerns the renowned deconvolution model, where the distribution of the noise is known, and the second is the nonparametric hidden Markov model with finite state space. The deconvolution problem consists of observations $Y_i = X_i + \epsilon_i$, $i\le n$, where $\epsilon_i$ is the noise and has known distribution $F_\epsilon$, and $X_i$ is the signal whose distribution $F_X$ is unknown. We study Bayesian nonparametric methods in such models and provide simple conditions to derive concentration of the posterior distribution in terms of the Wasserstein distance for the unknown distribution of the signal $F_X$. To do so we derive an inversion inequality which allows us to compare some distance in the direct problem with the Wasserstein distance in the indirect problem.
The second problem concerns nonparametric hidden Markov models where one observes $Y_i$, $i\le n$, whose distribution satisfies $Y_i | X_i = j \sim F_j$ for $j \le J$, and the latent variables $(X_i)_i$ form a Markov chain on $\{1, \ldots, J\}$ with transition matrix $Q$. We will discuss estimation aspects in this model, in particular how to recover precisely the transition matrix $Q$ while jointly estimating correctly the unknown emission distributions $F_1, \ldots, F_J$. The latter is obtained by deriving an inversion inequality from the $L_1$ distance between densities in the direct problem to the $L_1$ distance between the emission distributions. (Zoom) |

11:30 - 12:10 |
Plamen Stefanov: Noise in linear inverse problems ↓ We study the effect of additive noise on the inversion of Fourier integral operators (FIOs) associated to a diffeomorphic canonical relation. We use microlocal defect measures to measure the power spectrum of the noise in the phase space and analyze how that power spectrum is transformed under the inversion. In general, white noise, for example, is mapped to noise depending on the position and on the direction.
In particular, we compute the standard deviation, locally, of the noise added to the inversion as a function of the standard deviation of the noise added to the data. As an example, we study the Radon transform in the plane in parallel and fan-beam coordinates, and present numerical examples. (Zoom) |

Thursday, November 4 | |
---|---|

09:30 - 10:10 |
Samuli Siltanen: Bayesian inversion for Glottal Inverse Filtering ↓ Human speech is a sophisticated means of communication and plays an unparalleled role in today’s society. Whether developing the latest voice recognition software for a smartphone or designing computers to aid people who have lost their voice through disease and illness, researchers are finding ways to map the precise mechanisms in the human vocal tract. It is shown how the human vocal folds and the mouth and lips combine to create the vowel sound. Furthermore, it is explained how a technique called glottal inverse filtering (GIF) can be used to determine the exact mechanisms behind vowel sounds from microphone recordings. The main use of improved GIF is to provide disabled women and children with better computer-based speech prostheses. The higher fundamental frequency of women’s and children’s voices makes them more difficult for speech synthesis software. Using Bayesian inversion and a Markov chain Monte Carlo (MCMC) method, we can significantly improve GIF results over traditional engineering approaches for challenging speech signals. GIF is a moderately nonlinear inverse problem with a variety of computational models available, both continuous and discrete. Therefore, it is a good model problem for developing the mathematics of Bayesian inversion methodology. (Zoom) |

10:20 - 11:00 |
Giovanni S. Alberti: Infinite-dimensional inverse problems with finite measurements ↓ In this talk I will discuss uniqueness, stability and reconstruction for infinite-dimensional nonlinear inverse problems with finite measurements, under the a priori assumption that the unknown lies in, or is well-approximated by, a finite-dimensional subspace or submanifold. The methods are based on the interplay of applied harmonic analysis, in particular sampling theory and compressed sensing, and the theory of inverse problems for partial differential equations. Several examples, including the Calderón problem and scattering, will be discussed. (Zoom) |

11:30 - 12:10 |
Nathan Glatt-Holtz: Some Recent Developments in the Bayesian Approach to PDE Inverse Problems: Statistical Sampling and Consistency ↓ This talk concerns some of our recent work on a statistical methodology to quantify an unknown, infinite dimensional, parameter $\mathbf{u}$ specifying a class of partial differential equations whose solutions we observe in a limited fashion and which is subject to measurement error. As a paradigmatic model problem we consider the estimation of a divergence free flow field $\mathbf{u}$ from the partial and noisy observation of a scalar $\theta$ which is advected by $\mathbf{u}$ and which diffuses passively in the fluid medium. Thus we suppose that $\theta$ solves
\[ \partial_t \theta + \mathbf{u} \cdot \nabla \theta = \kappa \Delta \theta, \quad \theta(0) = \theta_0, \]
up to the unknown $\mathbf{u}$ and where $\kappa > 0$ is the diffusivity parameter. For this problem, as in a variety of other PDE inverse problems of interest, our task is thus to recover $\mathbf{u}$ from a data set $Y \in \mathbb{R}^n$ which obeys a statistical measurement model of the form
\[ Y = \mathcal{G}(\mathbf{u}) + \eta. \]
Here $\mathcal{G}$ is nonlinear and defined as a composition of a parameter to PDE solution map $\mathcal{S}$ and an observation operator $\mathcal{O}$. The term $\eta$ represents an additive observational error.
A Bayesian approach pioneered recently by Andrew Stuart and others allows for the effective treatment of infinite-dimensional unknowns $\mathbf{u}$ via a suitable regularization at small scales through the consideration of certain classes of Gaussian priors. In this framing, by positing such prior distributions $\mu_0$ on $\mathbf{u}$, and assuming $\eta$ has the pdf $p_\eta$, we obtain a posterior distribution of the form
\[ \mu(d \mathbf{u}) \propto p_\eta(Y - \mathcal{G}(\mathbf{u})) \mu_0(d\mathbf{u}). \]
This measure $\mu$ therefore provides a comprehensive model of the uncertainty in our unknown $\mathbf{u}$.
In this talk we will survey some recent results which analyze such PDE-constrained posterior measures $\mu$ both analytically and numerically. We will discuss the issue of posterior contraction (consistency) in the large observation limit for the inverse problem defined above. We also describe some infinite-dimensional Markov chain Monte Carlo (MCMC) algorithms which we have developed, refined and rigorously analyzed to effectively sample from $\mu$.
This is joint work with Jeff Borggaard (Virginia Tech), Justin Krometis (Virginia Tech) and Cecilia Mondaini (Drexel). (Zoom) |

Friday, November 5 | |
---|---|

09:30 - 10:10 |
Jan Bohr: Stability & Range of some nonlinear X-ray transforms ↓ We consider a class of nonlinear X-ray transforms, encompassing, e.g., Polarimetric Neutron Tomography (PNT). We describe a number of recent analytical results, with a view to their statistical applications. The main focus of the talk will lie on a novel characterisation of the range of the forward map in PNT, which is in the spirit of Pestov-Uhlmann's characterisation for the linear X-ray transform. (Based on joint work with Richard Nickl and Gabriel Paternain) (Zoom) |

10:20 - 11:00 |
Hanne Kekkonen: Consistency of Bayesian inference for a parabolic inverse problem ↓ Bayesian methods for inverse problems have become increasingly popular in applied mathematics in the last decades but the theoretical understanding of the statistical performance of these methods for non-linear inverse problems is still developing.
In this talk I will establish posterior contraction rates for a non-linear parabolic inverse problem with rescaled Gaussian process priors. More precisely, the inverse problem of discovering the absorption term $f > 0$ in a heat equation, with given boundary and initial value functions, from $N$ discrete noisy point evaluations of the forward solution is considered. I will also show that the optimal minimax rate can be achieved with truncated Gaussian priors. (Zoom) |

11:30 - 12:10 |
Sven Wang: On polynomial-time computation of high-dimensional posterior measures by Langevin-type algorithms ↓ In this talk, we consider the problem of sampling from high-dimensional posterior distributions. The main results consist of non-asymptotic computational guarantees for Langevin-type MCMC algorithms which scale polynomially in key quantities such as the dimension of the model and the number of available statistical measurements. As a direct consequence, it is shown that posterior mean vectors as well as optimisation based maximum a posteriori (MAP) estimates are computable in polynomial time, with high probability under the distribution of the data. These results are complemented by statistical guarantees for recovery of the data-generating ground truth parameter. Our results are derived in a general high-dimensional non-linear regression setting where posterior measures are not necessarily log-concave, employing a set of local ‘geometric’ assumptions on the parameter space. The theory is applied to a representative non-linear example from PDEs involving a steady-state Schrödinger equation. (Zoom) |