# Mathematical Image Analysis and Processing (04w5512)

Arriving in Banff, Alberta on Saturday, October 23, and departing Thursday, October 28, 2004

## Organizers

Selim Esedoglu (University of Michigan)

Sung Ha Kang (University of Kentucky)

Mary Pugh (University of Toronto)

Jackie Shen (University of Minnesota)

## Objectives

Image processing is a rapidly growing field. As such, it is replete with many different approaches and views for addressing similar questions. Our goal is to concentrate on a few of its mathematically intriguing problems, structure our workshop around them, and allow researchers pursuing different mathematical avenues toward the solution of these problems to present the state of the art in their approach. In this way, we intend to serve two purposes. First, we would like to provide a forum where results from different approaches can be compared, and where ways of combining the best techniques may be discovered. Second, by making sure that a wide variety of mathematics is represented, we believe the workshop will be in a better position to attract more mathematicians into thinking about imaging and vision problems.

We propose to structure the scientific program around an intrinsic progression in the mathematics: {\em from image analysis to image processing}. Image analysis encompasses the modeling and analysis of the image itself, from the characterization of the image space to the different ways of representing an image. We intend to incorporate the full range of tools used in image analysis, including spectral analysis, wavelets, statistics, level sets, and PDEs. Image processing, on the other hand, modifies the original image to improve its quality or to extract information from it; examples include image restoration, compression, segmentation, and shape and texture analysis.

Three crucial ingredients of image processing are the modeling, analysis, and efficient implementation of processing tools. Moreover, designing a successful processing technique invariably relies on having a successful model for images themselves. Mathematicians can contribute to the solution of each one of these issues, and therefore the workshop will concern itself with all of them.

### Image analysis

A fundamental issue faced in the design of image analysis techniques is the identification and characterization of the image space. In plain words, one first has to answer the question "What do we mean mathematically by images?" A pertinent example of this point can be found in the variational approach to image denoising, the goal of which is to estimate the original version of an image from a given degraded one. The variational approach to this problem seeks to exhibit the "restored" image as the minimizer of a functional defined over the space of all images. The first task is clearly to decide which space of functions images should be taken from. For instance, it is easy to see why Sobolev spaces are ill suited for this purpose: their elements cannot have discontinuities across codimension-one surfaces, and yet any successful model of images should allow for them. Such discontinuities need to be allowed because one of the most important features of images, namely "edges" (places where one object in the scene ends and another begins), corresponds precisely to this type of behavior. The space of functions of bounded variation (BV) therefore provides a more appropriate setting. (Another example comes from the variational approach to the image segmentation problem, where the correct space of functions for minimizing one of the most successful models in this context, the Mumford-Shah segmentation model, turns out to be a subset of the functions of bounded variation known as special functions of bounded variation.)
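For concreteness, the space just mentioned can be recalled via the standard distributional definition of total variation:

```latex
\[
  \mathrm{TV}(u) \;=\; \sup\left\{ \int_\Omega u \,\operatorname{div}\varphi \, dx
  \;:\; \varphi \in C_c^1(\Omega;\mathbb{R}^n),\ \|\varphi\|_{L^\infty} \le 1 \right\},
\qquad
  BV(\Omega) \;=\; \left\{\, u \in L^1(\Omega) : \mathrm{TV}(u) < \infty \,\right\}.
\]
```

An image with a sharp edge, such as the indicator function of a set with smooth boundary, has finite total variation (equal to the perimeter of the set) yet belongs to no Sobolev space $W^{1,p}(\Omega)$, which is exactly the distinction drawn above.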

We roughly divide the various approaches to image analysis into three categories: (A) statistical representations, (B) spectral and wavelet representations, and (C) scale-space representations.

Statistical approaches treat images as samples from random fields, which are often modeled by Markov/Gibbs fields or via statistical learning from an image database. This approach was pioneered in the 1980s by Grenander (Brown) and the Gemans (Brown and Johns Hopkins). It is currently being developed by S. Geman and Mumford (Brown), Simoncelli (NYU), Yuille (Smith-Kettlewell), Zhu (UCLA), and others.

Spectral and wavelet representations are the mathematical foundation of the JPEG image coding standards used on the Internet. They have uses beyond image compression. Psychologists and physiologists studying mammalian vision have found that images are represented via neurons that have small receptive fields --- each neuron is blind to most of a presented scene, responding to only a small patch of it. Further, they find that the responses of these neurons are well approximated by Gabor functions and wavelets. This approach is a fertile area for applied harmonic analysis; researchers in this area include Coifman (Yale), Daubechies (Princeton), Donoho (Stanford), Mallat (NYU), and others.

Scale-space decomposition is a PDE approach to the multi-scale analysis of images. This theory is rooted in linear filtering (heat diffusion), and was highlighted by the famous Perona-Malik nonlinear model for removing small-scale noise from images without losing large-scale structures like edges. It was later axiomatized and developed by P.-L. Lions, Morel, and others.
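The Perona-Malik model just mentioned evolves the image by $u_t = \operatorname{div}\!\big(g(|\nabla u|)\nabla u\big)$ with an edge-stopping function such as $g(s) = 1/(1+s^2/K^2)$. The following is a minimal explicit finite-difference sketch; the function name, parameter values, and the periodic boundary handling via `np.roll` are our illustrative choices, not taken from the original paper:

```python
import numpy as np

def perona_malik(u, iterations=20, K=0.3, dt=0.2):
    """Explicit finite-difference sketch of Perona-Malik diffusion.

    u  -- 2-D float array (grayscale image)
    K  -- edge threshold in the conductance g(s) = 1 / (1 + (s/K)^2)
    dt -- time step; dt <= 0.25 keeps this explicit four-neighbor
          scheme stable, since 0 < g <= 1
    Boundary handling is periodic (np.roll) purely for brevity.
    """
    u = u.astype(float).copy()
    g = lambda d: 1.0 / (1.0 + (d / K) ** 2)  # edge-stopping conductance
    for _ in range(iterations):
        # differences toward the four nearest neighbors
        dn = np.roll(u, 1, axis=0) - u
        ds = np.roll(u, -1, axis=0) - u
        dw = np.roll(u, 1, axis=1) - u
        de = np.roll(u, -1, axis=1) - u
        # discrete divergence of g(|grad u|) grad u, one term per neighbor
        u += dt * (g(dn) * dn + g(ds) * ds + g(dw) * dw + g(de) * de)
    return u
```

On a noisy step image, this smooths the nearly flat regions (where $g \approx 1$) while largely preserving the jump (where $g \ll 1$), in contrast to linear heat diffusion, which blurs both.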

*Figure: An example of a nonlinear PDE model for image denoising. (a) the original noisy image; (b) the denoised image, using chromaticity (a non-flat feature).*

There are other novel theories and trends in image analysis, including representations based on the space BV of functions of bounded variation, free boundary models, and shape models.

Through this workshop, we intend to survey the different mathematical tools for image analysis, to encourage their interaction and possible unification, and to understand the limitations of the existing approaches.

### Image processors

Given a task, how would you design an image processor that performs efficiently and well? Typical tasks include denoising, deblurring, edge detection, intensity enhancement, inpainting and interpolation, and compression and decompression. In addition to these relatively low-level tasks, there are mid- and high-level tasks such as disocclusion, shape from shading, motion analysis, image segmentation, and pattern identification and recognition.

In fields like computer vision and medical imaging, there is great demand for efficient and accurate image processors. In biological vision, the plausibility of the processor is also important -- can you build the processor in a manner consistent with known biology? And what is a good biological measure of the quality of your processor?

The first step of processing is to construct a suitable model for the given task. Currently, processing models are being developed using tools such as Bayesian decision theory, variational optimization, and inverse problems. These approaches draw on fields such as statistical mechanics, the calculus of variations, nonlinear PDEs, differential geometry and topology, harmonic and functional analysis, and numerical analysis. For example, numerous methods have been proposed for the image denoising problem, ranging from transform-domain and statistical methods to PDEs. Among transforms, the Fourier and wavelet transforms are typical, and more recently the curvelet and ridgelet transforms were proposed by Candes and Donoho. On the variational and PDE side, one successful example is the {\em Total Variation} (TV) minimization suggested by Rudin, Osher, and Fatemi. These denoising methods have also been extended to non-flat features, as in the work of Tang, Sapiro, and Caselles, of Chan and Shen, and others. Another topic in image reconstruction is the inpainting problem. The digital inpainting problem, posed by Bertalmio, Sapiro, Caselles, and Ballester, is related to the disocclusion problem from vision analysis, addressed by Masnou and Morel's level-line based disocclusion; a closely related PDE approach is the TV-minimizing inpainting suggested by Chan and Shen. As another example, there are various approaches to the image segmentation problem: snake methods by Kass, Witkin, and Terzopoulos; region growing and merging techniques; and global optimization approaches based on energy functionals or Bayesian formulations.
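As a worked example of the variational approach, the Rudin-Osher-Fatemi TV model mentioned above is commonly stated in the unconstrained form (the original paper used a constrained formulation), with $f$ the noisy image and $\lambda > 0$ a fidelity weight:

```latex
\[
  \min_{u \in BV(\Omega)} \;\; \int_\Omega |\nabla u| \, dx
  \;+\; \frac{\lambda}{2} \int_\Omega (u - f)^2 \, dx .
\]
```

A large $\lambda$ keeps the minimizer close to the data, while a small $\lambda$ removes more oscillation at the cost of flattening fine detail; the TV term, unlike a Sobolev seminorm, permits the jump discontinuities that represent edges.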

*Figure: An example of an image reconstruction task, in particular image inpainting. (a) the original image with yellow letters to be removed; (b) the inpainted, restored image without the letters.*

Once one has a processing model, the next step is its analysis, in order to answer questions of existence and uniqueness, stability, properties of solutions, and so on. Many image processing models are nonlinear or non-convex --- analyzing them requires new mathematical insights. The mathematical reader will not be surprised to hear that relatively few image processing models have been analyzed in this way. There is a vast engineering literature on image processors, mostly developed from empirical insights and leaps of good intuition. They await rigorous mathematical analysis, which can be crucial for answering questions of great practical significance: which among the many techniques proposed for the same task is superior, what effect the various parameters appearing in a method have on its behavior, and under what conditions a given technique can be expected to perform well. A good example in this context is the work of Lions and his collaborators on the analysis of the Perona-Malik scheme mentioned above. This scheme appears to be a finite-difference discretization of a nonlinear PDE for which there is no satisfactory well-posedness theory. It was well known, prior to the work of Lions et al., that despite its success at its intended purpose the scheme is very sensitive to the presence of noise and to the choice of parameters such as the resolution of the digital image -- a fact intimately connected with the lack of a continuum (PDE) theory. The work of Lions et al. replaced the Perona-Malik scheme with one that has all the desirable characteristics of the original, as well as a rigorously established continuum limit. The practical implication is much more stable behavior with respect to noise and different resolutions.

Finally, what is an efficient algorithm for implementing the processing model? One can certainly develop and implement schemes without performing the type of mathematical analyses described above, but one then runs the risk of producing an algorithm that performs poorly in the presence of small perturbations like noise. Moreover, the history of scientific computing shows that the breakthroughs that led to massive speed-ups would have been impossible without a deep understanding of the mathematics. One wants to turn mathematical insight into efficient and robust software. The methods currently used in image processing come from almost all branches of scientific computing, including fast Fourier and wavelet transforms, multigrid methods, dynamic programming, combinatorial optimization, computational PDEs, numerical linear algebra, and Monte Carlo simulations.

We can use the important problem of image segmentation to illustrate how mathematics has contributed to image processing in the three stages we identified above: modeling, analysis, and implementation. The goal of segmentation is to divide up the image domain $\Omega$ into as few and simple pieces as possible, so that image features (such as gray-scale intensity, or color) are approximately constant or slowly varying in each piece. This procedure helps differentiate between parts of the image domain occupied by distinct objects in the scene. It is a challenging problem that is intimately connected with edge detection.

The modeling aspect of this problem saw many fundamental contributions from mathematicians. The outstanding example is the work of D. Mumford and J. Shah, who proposed a variational approach for its solution -- in their technique the segmentation is obtained by finding the minimizer of an energy. More specifically, given an original image $u$ (modeled here as an $L^2$ function), the Mumford-Shah segmentation model calls for the minimization of the following energy:

\begin{equation*} MS(v;K) := \int_{\Omega\setminus K} |\nabla v|^2 \, dx + H^{n-1}(K) + \int_\Omega (u-v)^2 \, dx \end{equation*} over functions $v$, and closed subsets $K$ of $\Omega$ across which $v$ is allowed to be discontinuous. Here, $\Omega$ is an $n$-dimensional domain, and $H^{n-1}$ denotes the $(n-1)$-dimensional surface measure. Although there were already variational models of segmentation before the work of Mumford and Shah (namely, Blake and Zisserman's energies), they were all in the discrete setting -- their continuum analogues were not clear.

*Figure: An example of image segmentation, in particular texture segmentation. (a) the original image with zebras; (b) the segmented image.*

This is a non-typical variational problem, whose analysis has led to a wealth of new mathematics. The existence of minimizers for problems of this type, now known as free discontinuity problems, has been obtained by E. De Giorgi, M. Carriero, A. Leaci, L. Ambrosio, and others. The work of A. Chambolle rigorously established the connection between Mumford and Shah's model and the discrete energies of Blake and Zisserman. Research in this new area of the calculus of variations is still very active; among recent results, the work of G. Alberti, G. Bouchitte, and G. Dal Maso identifies sufficient conditions for minimizers of $MS$. It is worth mentioning that the analytical tools developed in this context find use in applications beyond image processing, since $MS$-like energies have also been used to model cracks in solids, among other phenomena.

Numerical implementation of the Mumford-Shah model has also been a subject of intense mathematical research. As it is written down, the energy $MS$ is very difficult to handle: it requires minimization over subsets $K$ of the domain $\Omega$. The work of L. Ambrosio and V. M. Tortorelli has rigorously shown how to approximate it in the sense of Gamma convergence by elliptic functionals (whose numerical treatment is still not trivial, yet a lot more approachable). In a different vein, the work of T. Chan and L. Vese has shown how the level set method of S. Osher and J. Sethian can be utilized, very effectively, in the minimization of these types of energies.
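For reference, the Ambrosio-Tortorelli approximation replaces the unknown edge set $K$ by a phase-field function $z$ that drops from $1$ toward $0$ near edges. One common form of the approximating functionals (notation and constants vary between papers; this is a sketch of the structure rather than a verbatim quotation) is:

```latex
\[
  AT_\varepsilon(v, z) \;=\; \int_\Omega z^2\, |\nabla v|^2 \, dx
  \;+\; \int_\Omega \left( \varepsilon\, |\nabla z|^2
        + \frac{(1-z)^2}{4\varepsilon} \right) dx
  \;+\; \int_\Omega (u - v)^2 \, dx .
\]
```

As $\varepsilon \to 0$, the middle term concentrates on the discontinuity set and recovers the $H^{n-1}(K)$ penalty in the sense of Gamma convergence, so the geometric unknown $K$ is traded for two functions that standard elliptic solvers can handle.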

### Some workshop topics

1. Shape analysis and reconstruction: shape from shading, motion, and other visual cues; shape analysis and edge detection.

2. Statistical methods, including texture analysis and segmentation.

3. Image reconstruction: denoising, deblurring, and inpainting, using tools from spectral analysis, wavelets, and PDEs.

4. Level set and PDE methods, and the numerical analysis of image processing.
