An Efficient Total Variation Minimization Method for Image Restoration

In this paper, we present an effective algorithm for solving the Poisson–Gaussian total variation model. The existence and uniqueness of the solution for the mixed Poisson–Gaussian model are proved. Due to the strict convexity of the model, the split-Bregman method is employed to solve the minimization problem. Experimental results show the effectiveness of the proposed method for mixed Poisson–Gaussian noise removal. A comparison with other existing and well-known methods is provided as well.


Introduction
Image acquisition is a ubiquitous technology, found for example in photography, medical imaging, astronomy, etc. Nevertheless, in almost all situations, image-capturing devices are imperfect: some unwanted noise is added to the signal. Therefore, the obtained images are post-processed by numerical algorithms before being delivered to the users; those algorithms have to solve the image restoration problem.
In the image restoration problem, an original image u is corrupted by some random noise η, resulting in a noisy image f . Our task is to reconstruct u, knowing both f and the distribution of η. Of course, there is in general no way to find the exact image u; image restoration algorithms rather yield a good approximation of u, usually noted u * . To do so, they exploit a priori knowledge on the image u.
A growing interest in Poisson-Gaussian probabilistic models has recently arisen (Chouzenoux et al., 2015). The mixture of Poisson and Gaussian noise occurs in several practical setups (e.g. microscopy, astronomy), where the sensors used to capture images have two sources of noise: a signal-dependent source which comes from the way light intensity is measured; and a signal-independent source which is simply thermal and electronic noise. Gaussian noise is just additive, so it cannot properly approximate the Poisson-Gaussian distributions observed in practice, which are strongly signal-dependent.
In general, the mixed Poisson-Gaussian noise model can be expressed as follows:

f = P(u) + W,     (1)

where f is the observed image, u is the unknown image, P(u) means that the image u is corrupted by Poisson noise, and W ∼ N(0, σ²) is Gaussian noise with zero mean and variance σ².
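For intuition, an observation f = P(u) + W is straightforward to simulate. The sketch below is an illustrative Python/NumPy version (the paper's experiments are implemented in MATLAB); the function name and interface are our own.

```python
import numpy as np

def add_mixed_noise(u, sigma, seed=0):
    """Simulate f = P(u) + W for a clean image u (nonnegative intensities).

    The Poisson part is signal-dependent; the Gaussian part W ~ N(0, sigma^2)
    is signal-independent (thermal/electronic noise)."""
    rng = np.random.default_rng(seed)
    poissoned = rng.poisson(u).astype(float)          # P(u): shot noise
    gaussian = rng.normal(0.0, sigma, size=u.shape)   # W: additive noise
    return poissoned + gaussian
```

Note that the Poisson component has variance equal to the local intensity u, so the total noise variance u + σ² is strongly signal-dependent, as discussed above.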
Recently, several approaches have been devoted to the mixed Poisson-Gaussian noise model (Foi et al., 2008; Jezierska et al., 2011; Lanza et al., 2014; Le Montagner et al., 2014). Many algorithms for denoising images corrupted by mixed Poisson-Gaussian noise have been investigated using approximations based on variance-stabilization transforms (Zhang et al., 2007; Makitalo and Foi, 2013) or PURE-LET based approaches (Luisier et al., 2011; Li et al., 2018). Variational models based on the Bayesian framework have also been proposed for denoising and deconvolution under mixed Poisson-Gaussian noise (Calatroni et al., 2017); this framework is perhaps the most popular approach to the mixed Poisson-Gaussian noise model. The authors in De Los Reyes and Schönlieb (2013) proposed a nonsmooth PDE-constrained optimization approach for the determination of the correct noise model in total variation image denoising. The authors in Lanza et al. (2014) used the maximum a posteriori approach to derive a variational formulation composed of the total variation (TV) regularization term and two fidelity terms. A weighted squared L² norm noise approximation was proposed for mixed Poisson-Gaussian noise in Li et al. (2015), and an efficient primal-dual algorithm was proposed in Chouzenoux et al. (2015) by investigating the properties of the Poisson-Gaussian negative log-likelihood as a convex, Lipschitz-differentiable function. Recently, the authors in Marnissi et al. (2016) proposed a variational Bayesian method for Poisson-Gaussian noise, using an exact Poisson-Gaussian likelihood. Similarly, the authors in Calatroni et al. (2017) proposed a variational approach which combines, by infimal convolution, the standard data fidelities classically associated with single-noise distributions, together with a TV regularization term.
Generally, image restoration by variational models based on TV can be a good solution to mixed Poisson-Gaussian noise removal, with the following formula (Calatroni et al., 2017; Pham et al., 2019):

min_{u ∈ S(Ω)} E(u) = ∫_Ω |∇u| dx + λ1 ∫_Ω (u − f log u) dx + λ2 ∫_Ω (u − f)² dx,     (2)

where f is the observed image, Ω ⊂ R² is a bounded domain, and S(Ω) is the set of positive functions from Ω to R; finally, λ1, λ2 are positive regularization parameters (see Chan and Shen, 2005, for details on this method).
However, in some cases, intermediate solutions of (2) obtained during the execution of algorithms may contain pixels with negative values. To avoid this problem, the authors in Pham et al. (2018) proposed a modified scheme of gradient descent (MSGD) that guarantees positive values for each pixel in the image domain.
In this work, we focus on the model (2) and consider the following model:

min_{u ∈ S(Ω)} E(u) = ∫_Ω α(x)|∇u| dx + λ1 ∫_Ω (u − f log u) dx + λ2 ∫_Ω (u − f)² dx,     (3)

where f is the observed image, λ1 and λ2 are positive regularization parameters, S(Ω) = {u ∈ BV(Ω) : u > 0} is closed and convex, with BV(Ω) being the space of functions Ω → R with bounded variation; finally, α(x) is a continuous function on Ω. The function α(x) is used to control the intensity of the diffusion; it is an edge indicator for spatially adaptive image restoration (Barcelos and Chen, 2000). Typically, the function α(x) is chosen as follows:

α(x) = 1 / (1 + (v(x)/l)²),

where l is a threshold value and v(x) = |∇G_σ(x) * f|, in which * denotes convolution and G_σ(x) = (1/(2πσ²)) exp(−|x|²/(2σ²)) is the Gaussian filter with standard deviation σ.

The main contributions of this paper are the following. We give an elementary proof of the existence and uniqueness of the solution of model (3). Moreover, we check that the functional E(·) is convex, which enables us to use larger time-step parameters during gradient descent when solving (3). We introduce the influence function α(x), which acts as an edge-detection function, into the model (3) in order to improve edge preservation and to control the speed of smoothing. In addition, we propose a new method to solve (3) that perceptibly improves the quality of the denoised images. By changing the time-step parameter, users can either get faster denoising with results comparable to previous methods, or better-quality denoising with comparable running times. Our method is a technical improvement over the split-Bregman algorithm. We report experimental results for the aforementioned method, for various parameters of the noise distribution. The quality of denoising is measured with the SSIM and PSNR metrics. If we tune the time-step parameter to obtain results of quality similar to the original split-Bregman method, we get faster running times.
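For illustration, the edge indicator α(x) can be computed as below. This is a Python/NumPy sketch (our own interface, not the paper's MATLAB code), assuming the common Barcelos-Chen-type choice α(x) = 1/(1 + (v(x)/l)²) with v(x) = |∇(G_σ * f)|:

```python
import numpy as np

def gaussian_blur(f, sigma):
    # Separable Gaussian convolution, kernel truncated at 3*sigma.
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    g /= g.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, g, mode='same'), 0, f)
    return np.apply_along_axis(lambda m: np.convolve(m, g, mode='same'), 1, out)

def edge_indicator(f, sigma=1.0, l=1.0):
    # v(x) = |grad (G_sigma * f)|: gradient magnitude of the smoothed image.
    gy, gx = np.gradient(gaussian_blur(f, sigma))
    v = np.hypot(gx, gy)
    # Assumed Barcelos-Chen-type form: close to 1 in flat regions
    # (full diffusion), close to 0 near edges (diffusion suppressed).
    return 1.0 / (1.0 + (v / l) ** 2)
```

Smoothing f with G_σ before differentiating makes the indicator robust to the noise itself, so α(x) slows the diffusion only at genuine edges.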
The rest of the paper is organized as follows. In Section 2, we describe the Poisson-Gaussian model and introduce the notation used in this work. In Section 3, we prove the existence and uniqueness of the solution. In Section 4, using the split-Bregman algorithm, we present the proposed optimization framework. Next, in Section 5, we show some numerical results of our proposed method and we compare them with the results obtained with other existing methods. Finally, some conclusions are drawn in Section 6.

Preliminaries
We recall the principle behind equation (2). Note that the contents of this section are not a rigorous proof; we simply provide a bit of context around the equation, why it was considered in the first place, and one possible reason for its practical efficiency. We also state our assumptions on both the initial image and the noise along the way.
Our goal is to recover the original image u, knowing the noisy image f. Our strategy is to find the image u which maximizes the conditional probability P(u|f). Bayes's rule gives:

P(u|f) = P(f|u) P(u) / P(f).     (4)

The probability density function of the observed image f corrupted by Gaussian noise P_N (respectively, by Poisson noise P_P) is:

P_N(f|u) = ∏_i (1/√(2πσ²)) exp(−(f_i − u_i)²/(2σ²)),     P_P(f|u) = ∏_i u_i^{f_i} e^{−u_i} / f_i!,

where σ² is the variance of the Gaussian noise. As we explained in the introduction, the two sources of noise are independent of each other, so the distribution of the mixed noise may be expressed as:

P_mixed(f|u) = P_P(f|u) · P_N(f|u).

We assume that the values of the pixels in the original image are independent, and that the noise is also independent on each pixel.
Maximizing P_mixed is equivalent to minimizing − log P_mixed, so let us compute the quantity − log(P_mixed(f|u)):

− log(P_mixed(f|u)) = Σ_i (u_i − f_i log u_i + log(f_i!)) + (1/(2y)) Σ_i (f_i − u_i)² + C,     (5)

for some constant y > 0 (here y = σ²) and some additive constant C. In the above equation, u varies but f is constant. Since our goal is to minimize the whole expression, we can ignore the term log(f_i!) altogether. Now we assume that P(u) follows a Gibbs prior (Le et al., 2007):

P(u) = (1/Z) exp( −(1/(2yz)) ∫_Ω |∇u| dx ),     (6)

where Z is a normalization factor and z > 0 is a parameter. We need to make a couple of comments here. First, u is not a function R² → R, but rather a discrete array of pixels; thus the integral in that expression is translated into a sum, while ∇u is translated into a finite-difference approximation. Second, this assumption appears to contradict the previous one, that the pixels of the original image are independent of one another. However, the assumption on P(u) is local: each pixel depends (weakly) on the neighbouring pixels only, so we do not lose much by assuming independence. This turns out to yield good results in practice (Chan and Shen, 2005). We now have all the ingredients to maximize P(u|f). By equation (4), this amounts to minimizing the expression − log(P(f|u)) − log(P(u)), so we can plug in equations (5) and (6) to get:

Σ_i (u_i − f_i log u_i) + (1/(2y)) Σ_i (f_i − u_i)² + (1/(2yz)) Σ_i |∇u_i|,     (7)

and we can view this expression as a discrete approximation of the functional E(·) defined as:

E(u) = ∫_Ω |∇u| dx + λ1 ∫_Ω (u − f log u) dx + λ2 ∫_Ω (u − f)² dx,     (8)

with λ1 = 2yz and λ2 = z. (We multiplied by 2yz, which is positive and constant, so the minimizer is the same.) Intuitively, the last two terms are data fidelity terms, which ensure that the restored image u is not "too far" from the observed image f (taking the distribution of the noise into account). By contrast, ∫_Ω |∇u| is a smoothness term, which guarantees that the reconstructed image is not too irregular (this is where our a priori knowledge of the original picture lies). The parameters λ1 and λ2 will have to be determined experimentally later on. In the following sections, we introduce some theoretical results about the existence and uniqueness of the solution of (3).
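For concreteness, the discrete counterpart of the functional above can be evaluated directly. The following Python sketch (an illustration with our own naming, using forward differences and an ε-smoothed TV term) computes the discrete energy:

```python
import numpy as np

def energy(u, f, lam1, lam2, eps=1e-8):
    """Discrete E(u) = sum |grad u| + lam1 * sum (u - f*log u)
                     + lam2 * sum (u - f)^2.

    Forward differences with a zero difference at the last row/column
    (Neumann-type boundary); eps avoids a non-differentiable TV term."""
    g1 = np.zeros_like(u); g1[:-1, :] = u[1:, :] - u[:-1, :]
    g2 = np.zeros_like(u); g2[:, :-1] = u[:, 1:] - u[:, :-1]
    tv = np.sqrt(g1**2 + g2**2 + eps).sum()          # smoothness term
    poisson = (u - f * np.log(u)).sum()              # Poisson fidelity
    gauss = ((u - f)**2).sum()                       # Gaussian fidelity
    return tv + lam1 * poisson + lam2 * gauss
```

Both fidelity terms are minimized pointwise at u = f, which matches the intuition that they keep the restored image close to the observation.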

Existence and Unicity of the Solution
Motivated by Aubert and Aujol (2008) and Dong and Zeng (2013), we have the following existence and uniqueness results for the optimization problem (3). We prove that (3) has a unique solution in two steps: first, we show that E(·) is a strictly convex functional; then, we show that E(·) has a lower bound. These two facts together imply the existence and uniqueness of the minimizer of E(·).
Theorem 1. The functional E(·) in (3) is strictly convex on S(Ω).

Proof. Write the pointwise data fidelity term as h(u) = λ1(u − f log u) + λ2(u − f)². The first and second order derivatives of h are:

h′(u) = λ1(1 − f/u) + 2λ2(u − f),     h″(u) = λ1 f/u² + 2λ2.

Since f is positive and u ∈ S(Ω), we have h″(u) > 0, i.e. h(u) is strictly convex. Moreover, the TV regularization term is convex; hence E(u) is strictly convex.
Theorem 2. Let f ∈ S(Ω) ∩ L∞(Ω). Then the problem (3) has exactly one solution u ∈ BV(Ω), satisfying

inf_Ω f ≤ u ≤ sup_Ω f.

Proof. Fix x ∈ Ω and consider the data fidelity term as a function of t on R+:

g(t) = λ1(t − f(x) log t) + λ2(t − f(x))².

The first order derivative of g satisfies:

g′(t) = λ1(1 − f(x)/t) + 2λ2(t − f(x)),

so g′(t) < 0 for t ∈ (0, f(x)) and g′(t) > 0 for t ∈ (f(x), +∞). The function g decreases if t ∈ (0, f(x)) and increases if t ∈ (f(x), +∞); therefore g attains its minimum at t = f(x).
Furthermore, from Kornprobst et al. (1999), we have E(inf(u, b)) ≤ E(u), where b = sup(f). In the same way, we have E(sup(u, a)) ≤ E(u), where a = inf(f). Hence, for a minimizing sequence {u_n}, we can assume a ≤ u_n ≤ b, so the sequence {u_n} is bounded in L¹(Ω).
Since {u_n} is a minimizing sequence, E(u_n) is bounded. Hence the regularization term ∫_Ω α(x)|∇u_n| is also bounded, and {u_n} is bounded in BV(Ω).
Therefore there exists u* ∈ BV(Ω) such that, up to a subsequence, u_n converges weakly to u* in BV(Ω) and strongly to u* in L¹(Ω). Recall that S(Ω) is closed and convex. Using 0 < a ≤ u* ≤ b, the lower semicontinuity of the total variation and Fatou's lemma, we get that u* is a minimizer of the problem (3).

Discretization Scheme
Our scheme allows us to perform both deblurring and denoising simultaneously. To do so, we need to compute:

min_{u ∈ S(Ω)} E(u) = ∫_Ω α(x)|∇u| dx + λ1 ∫_Ω (Ku − f log Ku) dx + λ2 ∫_Ω (Ku − f)² dx,     (9)

where K is a linear blurring operator, f is the observed image, S(Ω) is the set of positive functions defined over Ω with bounded total variation, and λ1, λ2 are positive regularization parameters. The functional u → E(u) is still strictly convex.

The images we are handling are discrete, i.e. matrices of pixel values rather than functions from R² to R. Therefore we have to choose a discretization scheme for numerical computations. If u is an image, we write u_{j,k} for the pixel at coordinates (j, k) in u. We define the following quantities:

∇1 u_{j,k} = u_{j+1,k} − u_{j,k},     ∇2 u_{j,k} = u_{j,k+1} − u_{j,k},     |∇u_{j,k}| = √((∇1 u_{j,k})² + (∇2 u_{j,k})² + ε),

where ε is a small positive number, added to avoid divisions by 0 in the implementation of the algorithms. Finding a minimum for the problem (2) can be achieved via the steepest gradient descent method:

∂u/∂t = div(∇u/|∇u|) − λ1(1 − f/u) − 2λ2(u − f).

The divergence term div(∇u/|∇u|) is discretized by

div(∇u/|∇u|)_{j,k} = (∇11 u_{j,k} (∇2 u_{j,k})² − 2 ∇1 u_{j,k} ∇2 u_{j,k} ∇12 u_{j,k} + ∇22 u_{j,k} (∇1 u_{j,k})²) / |∇u_{j,k}|³,

where ∇11 u_{j,k} = ∇1(∇1 u_{j,k}) = u_{j+1,k} − 2u_{j,k} + u_{j−1,k}, ∇22 u_{j,k} = ∇2(∇2 u_{j,k}) = u_{j,k+1} − 2u_{j,k} + u_{j,k−1}, and ∇12 u_{j,k} = ∇2(∇1 u_{j,k}). Thus, for a small parameter δt > 0, a solution of the minimization problem (2) may be computed by

u^{n+1}_{j,k} = u^n_{j,k} + δt ( div(∇u^n/|∇u^n|)_{j,k} − λ1(1 − f_{j,k}/u^n_{j,k}) − 2λ2(u^n_{j,k} − f_{j,k}) ).

When the time-step parameter δt is small, the convergence speed becomes so slow that large images are processed with poor efficiency. There are many methods (Chambolle, 2004; Micchelli et al., 2011; Boyd et al., 2010) which can be used for the minimization problem in (2). In this paper, we extend the split-Bregman algorithm (Goldstein and Osher, 2009) to solve the minimization problem.
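One explicit descent step for the denoising problem (2) can be sketched as follows. This Python version is an illustration with our own naming; instead of expanding the curvature term, it uses the equivalent adjoint-based divergence div p = ∇1ᵀp1 + ∇2ᵀp2 applied to p = ∇u/|∇u|:

```python
import numpy as np

def grad_descent_step(u, f, lam1, lam2, dt, eps=1e-8):
    """One explicit step:
    u <- u + dt*( div(grad u/|grad u|) - lam1*(1 - f/u) - 2*lam2*(u - f) )."""
    # Forward differences (zero at the last row/column).
    g1 = np.zeros_like(u); g1[:-1, :] = u[1:, :] - u[:-1, :]
    g2 = np.zeros_like(u); g2[:, :-1] = u[:, 1:] - u[:, :-1]
    norm = np.sqrt(g1**2 + g2**2 + eps)              # eps avoids division by 0
    p1, p2 = g1 / norm, g2 / norm
    # Backward-difference divergence, the negative adjoint of the gradient.
    d1 = np.zeros_like(u); d1[0] = p1[0]; d1[1:] = p1[1:] - p1[:-1]
    d2 = np.zeros_like(u); d2[:, 0] = p2[:, 0]; d2[:, 1:] = p2[:, 1:] - p2[:, :-1]
    div = d1 + d2
    return u + dt * (div - lam1 * (1 - f / u) - 2 * lam2 * (u - f))
```

Iterating this map drives u toward the minimizer, but the step size dt must stay small for stability, which is exactly the slow-convergence issue motivating the split-Bregman approach below.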

Proposed Algorithm
First, let us review the split-Bregman algorithm (Goldstein and Osher, 2009). Suppose that we have a scalar γ and two convex functionals Φ(·) and G(·), and that we need to solve the following constrained optimization problem:

arg min_{u,d} |d|₁ + γ G(u)     subject to d = Φ(u).     (10)

We convert (10) into an unconstrained problem:

arg min_{u,d} |d|₁ + γ G(u) + (ρ/2) ‖d − Φ(u) − b‖²₂,     (11)

where ρ is a penalty parameter (a positive constant) and b is a variable related to the split-Bregman iteration algorithm (to be explicited later). The solution to problem (11) can be approximated by the split-Bregman algorithm (Goldstein and Osher, 2009): minimize (11) alternately over u and d, then update

b^{k+1} = b^k + Φ(u^{k+1}) − d^{k+1}.

Now we return to the problem (9). We set ν = Ku and d = (d1, d2) = ∇u = (∇1 u, ∇2 u); then, based on equation (11), the split-Bregman formulation for (9) is defined as:

arg min_{u,ν,d} Σ_{j,k} α_{j,k} |d_{j,k}| + λ1 Σ_{j,k} (ν_{j,k} − f_{j,k} log ν_{j,k}) + λ2 Σ_{j,k} (ν_{j,k} − f_{j,k})² + (ρ1/2) ‖d − ∇u − b‖²₂ + (ρ2/2) ‖ν − Ku − b3‖²₂,     (12)

where the parameters ρ1 and ρ2 are positive, b = (b1, b2) is the Bregman variable associated with the constraint d = ∇u, and b3 is the Bregman variable associated with the constraint ν = Ku. The split-Bregman method for solving (12) alternates minimization over u, ν and d with the Bregman updates

b^{k+1} = b^k + ∇u^{k+1} − d^{k+1},     b3^{k+1} = b3^k + Ku^{k+1} − ν^{k+1}.

There are three subproblems to solve: u, ν and d.
Subproblem 1. The u subproblem is a quadratic optimization problem, whose optimality condition reads:

(ρ1(∇1ᵀ∇1 + ∇2ᵀ∇2) + ρ2 KᵀK) u = ρ1(∇1ᵀ(d1 − b1) + ∇2ᵀ(d2 − b2)) + ρ2 Kᵀ(ν − b3),     (13)

assuming periodic boundary conditions, where b3 denotes the Bregman variable of the constraint ν = Ku. Note that the left-hand-side matrix in (13) includes a Laplacian matrix (∇1ᵀ∇1 + ∇2ᵀ∇2 = −Δ) and is strictly diagonally dominant. Following Wang et al. (2008), equation (13) can be solved efficiently with one fast Fourier transform (FFT) operation and one inverse FFT operation as:

u = F⁻¹( F(ρ1(∇1ᵀ(d1 − b1) + ∇2ᵀ(d2 − b2)) + ρ2 Kᵀ(ν − b3)) / F(ρ1(∇1ᵀ∇1 + ∇2ᵀ∇2) + ρ2 KᵀK) ),     (14)

where F and F⁻¹ are the forward and inverse Fourier transform operators, and the division is componentwise.
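In the pure denoising case (K = I, so KᵀK has Fourier symbol 1), the linear solve reduces to one FFT and one inverse FFT. A Python sketch under periodic boundary conditions (illustrative naming, not the paper's implementation):

```python
import numpy as np

def solve_u_fft(rhs, rho1, rho2):
    """Solve (rho2*I + rho1*(grad^T grad)) u = rhs under periodic BC.

    grad^T grad = -Laplacian is diagonalized by the 2D DFT, with eigenvalues
    (2 - 2cos(2*pi*j/M)) + (2 - 2cos(2*pi*k/N)) at frequency (j, k)."""
    M, N = rhs.shape
    j = np.arange(M)[:, None]
    k = np.arange(N)[None, :]
    lap = (2 - 2 * np.cos(2 * np.pi * j / M)) + (2 - 2 * np.cos(2 * np.pi * k / N))
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / (rho2 + rho1 * lap)))
```

For deblurring with a circular convolution K, the scalar ρ2 in the denominator is replaced by ρ2 |F(K)|², evaluated frequency by frequency.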
Subproblem 2. The optimality condition for the ν subproblem is given by

λ1(1 − f/ν) + 2λ2(ν − f) + ρ2(ν − Ku − b3) = 0,

where b3 is the Bregman variable of the constraint ν = Ku. This equation can be rewritten as:

(2λ2 + ρ2) ν² + (λ1 − 2λ2 f − ρ2(Ku + b3)) ν − λ1 f = 0.

The positive solution is given by

ν = (−B + √(B² + 4AC)) / (2A),     (15)

where A = 2λ2 + ρ2, B = λ1 − 2λ2 f − ρ2(Ku + b3) and C = λ1 f, with all operations taken componentwise.

Subproblem 3. The solution of the d subproblem can readily be obtained by applying the soft-thresholding operator (see Micchelli et al., 2011). We can use shrinkage operators to compute the optimal values of d1 and d2 separately:

d_i = arg min_{d_i} Σ_{j,k} α_{j,k} |d_{i,j,k}| + (ρ1/2) ‖d_i − ∇_i u − b_i‖²₂,  i = 1, 2.     (16)

The problem (16) is solved as:

d_i = shrink(∇_i u + b_i, α/ρ1),  where  shrink(x, t) = (x/|x|) · max(|x| − t, 0),     (17)

for i = 1, 2.

The algorithm. The complete method is summarized in Algorithm 1. We need a stopping criterion for the iteration: we end the loop if the maximum number N of allowed outer iterations has been reached (to guarantee an upper bound on the running time) or if the following condition is satisfied:

‖u^(k) − u^(k−1)‖₂ / ‖u^(k)‖₂ < ς,

where ς is a small positive tolerance. For our experiments, we set ς = 0.0005 and N = 500.
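The shrinkage step is a one-line operation. Below is a Python sketch (our own naming) of both the componentwise soft threshold used in (17) and the isotropic, coupled variant sometimes applied to the pair (d1, d2):

```python
import numpy as np

def shrink(x, t):
    # Soft threshold: sign(x) * max(|x| - t, 0). The threshold t may be a
    # scalar or an array, e.g. the spatially varying weight alpha/rho1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def shrink_iso(x1, x2, t, eps=1e-12):
    # Isotropic variant: shrink the magnitude of the vector field (x1, x2)
    # while preserving its direction; eps avoids division by zero.
    s = np.sqrt(x1**2 + x2**2)
    factor = np.maximum(s - t, 0.0) / (s + eps)
    return factor * x1, factor * x2
```

The spatially varying threshold α/ρ1 is what makes the scheme adaptive: near edges α is small, so the gradients there are barely shrunk and edges are preserved.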

Implementation Issues
In this section, we show some numerical reconstructions obtained by applying our proposed method to mixed Poisson-Gaussian noise. We compare our reconstructions with images obtained by other well-known methods, such as TV-L¹ (Chambolle et al., 2010), the modified scheme for the mixed Poisson-Gaussian model (MS-MPG) (Pham et al., 2018) and the Bregman method (Goldstein and Osher, 2009). All of the compared methods perform image denoising with their optimal parameters. For a fair comparison, the regularization parameters of both MS-MPG and our proposed method are the same: λ1 = 0.4, λ2 = 0.6. We set ρ1 = 1, ρ2 = 1. The parameter σ in α(x) is set to 1. The threshold value l in the function α(x) and the parameter γ are chosen to keep the balance between noise removal and detail preservation.

Algorithm 1. Adaptive split-Bregman algorithm for solving the model (9).
Initialize u^(0), ν^(0), d^(0) and the Bregman variables.
Repeat until the stopping criterion is met:
  compute u^(k) using (14);
  compute ν^(k) using (15);
  compute d_i^(k) for i = 1, 2 using (17);
  update the Bregman variables.
The test images are 8-bit grayscale standard images of size 256 × 256, shown in Fig. 1.
All the experiments were run on a machine with a Core i7 CPU at 2 GHz, 4 GB DDR3 SDRAM at 2 GHz, Windows 10 (64 bit), and were implemented in MATLAB. To compare the efficiency of the algorithms, we use the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM) (Wang and Bovik, 2006).

The first metric, PSNR (dB), is defined by

PSNR = 10 log₁₀( M N I²_max / Σ_{j,k} (u_{j,k} − u*_{j,k})² ),

where u, u* are, respectively, the original image and the reconstructed (or noisy) image, I_max is the maximum intensity of the original image, and M and N are the numbers of image pixels in rows and columns. The second metric, SSIM, is defined by

SSIM(u, u*) = ((2 μ_u μ_u* + c1)(2 σ_{u,u*} + c2)) / ((μ²_u + μ²_u* + c1)(σ²_u + σ²_u* + c2)),

where μ_u, μ_u* are the means of u, u*, respectively; σ_u, σ_u*, their standard deviations; σ_{u,u*}, the covariance of the two images u and u*; c1 = (K1 L)²; c2 = (K2 L)²; L is the dynamic range of the pixel values (255 for 8-bit grayscale images); and finally K1 ≪ 1, K2 ≪ 1 are small constants.
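Both metrics are simple to compute. The Python sketch below is illustrative (our own naming); note that the SSIM here uses global image statistics in a single window, a simplification of the usual locally windowed SSIM, which is enough to exercise the formula:

```python
import numpy as np

def psnr(u, u_star, i_max=255.0):
    # PSNR = 10 log10(I_max^2 / MSE), in dB.
    mse = np.mean((u.astype(float) - u_star.astype(float))**2)
    return 10 * np.log10(i_max**2 / mse)

def ssim_global(u, u_star, L=255.0, K1=0.01, K2=0.03):
    # Single-window SSIM with global statistics (a simplification of the
    # standard locally windowed SSIM).
    u, v = u.astype(float), u_star.astype(float)
    mu_u, mu_v = u.mean(), v.mean()
    var_u, var_v = u.var(), v.var()
    cov = ((u - mu_u) * (v - mu_v)).mean()
    c1, c2 = (K1 * L)**2, (K2 * L)**2
    return ((2*mu_u*mu_v + c1) * (2*cov + c2)) / \
           ((mu_u**2 + mu_v**2 + c1) * (var_u + var_v + c2))
```

SSIM equals 1 only for identical images and decreases as structure diverges, whereas PSNR grows without bound as the mean squared error tends to 0.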

Image Denoising
Our method can perform image deblurring and denoising simultaneously. In this section, we perform only image denoising. Noisy observations are generated by applying Poisson noise with peak I_max and adding Gaussian noise with standard deviation σ_G to the test images. In Figs. 2, 4 and 5, we give the results of denoising the corrupted images for different noise levels I_max, with σ_G = 10.
For a better visual comparison, we have enlarged some details of the restored images in Figs. 3, 6 and 7 (we include the original images in the first column). It can be seen that our method gives better visual quality than the other methods. Table 1 shows the computation times in seconds of the compared methods for Fig. 2. We see from Table 1 that the computation times of the proposed method and of the Bregman method are about the same, while the proposed method requires less time than MS-MPG and TV-L¹. The comparison metrics PSNR and SSIM are also computed for various noise levels and shown in Tables 2 and 3; the best values among all the methods are shown in bold. We give the values of the PSNR and SSIM for the noisy and recovered images. The results shown in Tables 1, 2 and 3 show that the proposed method converges and achieves higher PSNR and SSIM values than the others.

Image Deblurring and Denoising
In this section, we perform image denoising and deblurring simultaneously. In our simulation, we use a Gaussian blur with a window size of 9 × 9 and a standard deviation of 1. After the blurring operation, we corrupt the images with Poisson noise (I_max = 120) and Gaussian noise (σ_G = 15). As in the previous experiment, we compare our results with those obtained by the other methods. In Table 4, we give the values of the PSNR and SSIM for different images and different variational methods. The best values among all the methods are shown in bold. Comparing the values of the PSNR and SSIM, we can clearly see that our method outperforms the others even in the presence of blur.

Conclusion
In this paper, we have studied a fast total variation minimization method for image restoration. We proposed an adaptive model for mixed Poisson-Gaussian noise removal and proved that the adaptive model is strictly convex. We then employed the split-Bregman method to solve the proposed minimization problem. Our experimental results show that the quality of the images restored by the proposed method is competitive with that of existing total variation restoration methods. The most important contribution is that the proposed algorithm is very efficient.
C.T. Pham received his PhD degree in engineering sciences from Tula State University, Tula, Russia, in 2016. He did a postdoctoral fellowship at the Centre of Deep Learning and Bayesian Methods, National Research University Higher School of Economics, Moscow, Russia. He is currently a lecturer at the Faculty of Information Technology, The University of Danang - University of Science and Technology, Danang, Vietnam. His research interests include image processing and machine learning.
T.T.T. Tran received an MS degree in applied mathematics and computer science from Tula State University, Tula, Russia, in 2018. She is currently a lecturer at the Faculty of Statistics and Informatics, The University of Danang - University of Economics. Her research interests include fractals, percolation, image processing and machine learning.
G. Gamard received his PhD degree in computer science from University of Montpellier, in 2017. He is currently a member of the ENS of Lyon. His main research interests are discrete mathematics, dynamical systems, sampled signals, data and image processing.