This paper presents a non-iterative deep learning approach to compressive sensing (CS) image reconstruction using a convolutional autoencoder and a residual learning network. An efficient measurement design is proposed in order to enable training of the compressive sensing models on normalized and mean-centred measurements, along with a practical network initialization method based on principal component analysis (PCA). Finally, perceptual residual learning is proposed in order to obtain semantically informative image reconstructions along with high pixel-wise reconstruction accuracy at low measurement rates.

Compressive sensing (CS) is a signal processing technique that enables accurate signal recovery from an incomplete set of measurements (Candes and Tao,

The CS reconstruction process can be viewed as a linear inverse problem that occurs in numerous image processing tasks such as inpainting (Bertalmio

After being successfully applied to numerous previously mentioned image processing tasks, machine learning methods started to gain more interest in the area of CS (Mousavi

A deep learning framework based on the stacked denoising autoencoder (SDA) has been proposed in Mousavi

In this paper, we propose an efficient deep learning model for CS acquisition and reconstruction. Our model is based on a fully convolutional autoencoder with a residual network. The fully convolutional architecture alleviates the signal dimensionality problems that occur in the fully-connected network design (Mousavi

Although it is well known that normalization of the training data significantly speeds up the training procedure (Ioffe and Szegedy,

Furthermore, we discuss the connection between the linear autoencoder network and principal component analysis (PCA). Based on our observations, an efficient method for initialization of the network weights is proposed. The proposed method serves as a bootstrapping step in the network training procedure. Instead of initializing the model using random weights, we propose to use an educated guess for the initial weights by using the PCA initialization method.

Finally, we introduce perceptual loss in the residual network training in order to improve the reconstructions at extremely low measurement rates. Experimental results obtained using the proposed model show improvements in terms of the reconstruction quality.

The paper is organized as follows: in Section

The encoding part of the proposed shallow autoencoder network performs the CS measurement process on an input image, while the decoding part models the CS reconstruction process and reconstructs the input image from the low-dimensional measurement space (Fig.

Proposed design of the CS image reconstruction model. The convolutional autoencoder learns the end-to-end CS mapping. The encoder performs synthetic measurements on the input image, transforming it into the low-dimensional measurement space. The decoding part learns the optimal inverse mapping from the low-dimensional measurements into the intermediate image reconstruction. The residual network additionally improves the initial image reconstruction.

In the traditional CS measurement process, an image is vectorized to form a one-dimensional vector

In this paper, a linear convolutional layer performs decimated convolution, as in Eq. (
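The equivalence between block-wise application of a measurement matrix and a decimated (strided) convolution can be sketched as follows. This is a minimal NumPy illustration, assuming a block size of 8, 16 measurements per block, and a random Gaussian matrix as stand-ins for the paper's learned measurement operator:

```python
import numpy as np

B = 8                                   # block size (assumed)
m = 16                                  # measurements per block (assumed)
rng = np.random.default_rng(0)

Phi = rng.standard_normal((m, B * B))   # stand-in measurement matrix
image = rng.standard_normal((32, 32))   # toy image, side a multiple of B

# Each row of Phi, reshaped to a BxB kernel, becomes one measurement filter
filters = Phi.reshape(m, B, B)

# Decimated convolution with stride B == block-wise application of Phi
H, W = image.shape
meas = np.empty((m, H // B, W // B))
for i in range(0, H, B):
    for j in range(0, W, B):
        block = image[i:i + B, j:j + B]
        meas[:, i // B, j // B] = (filters * block).sum(axis=(1, 2))

# Sanity check: the first block's measurements equal Phi applied to the
# row-major vectorized block (vectorization order matches the reshape above)
assert np.allclose(meas[:, 0, 0], Phi @ image[:B, :B].reshape(-1))
```

Because the stride equals the kernel size, the convolution windows do not overlap, so the operation is exactly the traditional vectorized measurement applied block by block.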

The CS reconstruction process is modelled using a transposed convolution (Dumoulin and Visin,
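The transposed-convolution decoding step can likewise be sketched in NumPy. In this hypothetical example (block size 8, 16 measurement channels, random decoder kernels), the m-dimensional measurement vector at each spatial location is expanded back into a BxB image block:

```python
import numpy as np

B, m = 8, 16
rng = np.random.default_rng(1)
W = rng.standard_normal((m, B, B))   # decoder kernels, one per measurement channel
y = rng.standard_normal((m, 4, 4))   # measurement tensor for a 32x32 image

# Transposed convolution with stride B: the m-vector at each measurement
# location is mapped to a BxB block as a weighted sum of the decoder kernels
recon = np.zeros((4 * B, 4 * B))
for i in range(4):
    for j in range(4):
        recon[i * B:(i + 1) * B, j * B:(j + 1) * B] = \
            np.tensordot(y[:, i, j], W, axes=1)
```

With stride equal to the kernel size, the reconstructed blocks tile the image without overlap, mirroring the encoder.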

Creating a set of measurement filters from the measurement matrix. Row vector

Visualization of the measurement process using decimated 2D convolution. Block

There are two basic approaches for the measurement matrix design. An arbitrary measurement matrix

Alternatively, the optimal measurement matrix can be inferred from the training data. Such a matrix better adapts to the dataset and preserves more information in the measurements, resulting in better reconstruction results. In our proposal, we optimize the encoding part of the autoencoder to learn the optimal linear measurement matrix

Training neural networks on normalized, mean-centred data has become standard in all areas of machine learning (Ioffe and Szegedy,

In order to measure the mean value

Training the neural network on non-mean-centred data has undesirable consequences. If the data coming into a neuron is always positive (e.g.
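A minimal sketch of the normalization step, assuming the statistics are estimated on a toy set of raw measurements with a strong positive offset (the situation that causes the zig-zagging weight updates discussed above):

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy raw measurements with an all-positive offset (illustrative only)
y_train = rng.standard_normal((1000, 16)) * 5.0 + 3.0

# Normalization statistics are estimated once on the training set ...
mu = y_train.mean(axis=0)
sigma = y_train.std(axis=0)

# ... and reused to centre and scale any measurement vector at test time
def normalize(y):
    return (y - mu) / sigma

y_norm = normalize(y_train)
assert np.allclose(y_norm.mean(axis=0), 0.0, atol=1e-10)
assert np.allclose(y_norm.std(axis=0), 1.0)
```

The decoder is then trained on `y_norm` rather than the raw measurements, so the data entering each neuron is zero-mean.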

Training loss function. Normalized mean-centred measurements vs. original measurements. Notice the zig-zagging in the loss function when using non-centred measurement data. Loss functions are visualized on the log scale.

In Lohit

Instead, we propose an efficient initialization method for the deep learning CS models based on the observation from Baldi and Hornik (

Thus, we propose to use the reduced eigenvector matrix

Furthermore, we propose to initialize the reconstruction part of the network using the PCA as well. The eigenvector matrix
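The PCA initialization of both network halves can be sketched as follows. This toy NumPy example assumes 8x8 blocks, 16 measurements, and synthetic correlated data standing in for natural-image patches; the encoder is initialized with the top-m eigenvectors of the block covariance and the decoder with their transpose:

```python
import numpy as np

B, m = 8, 16
D = B * B
rng = np.random.default_rng(3)

# Toy correlated training blocks (stand-in for natural-image patches)
A = rng.standard_normal((D, D))
blocks = rng.standard_normal((5000, D)) @ A

mean = blocks.mean(axis=0)
cov = np.cov(blocks - mean, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
V = eigvecs[:, ::-1][:, :m]                  # top-m eigenvectors, D x m

Phi_init = V.T        # encoder (measurement) weight initialization, m x D
W_dec_init = V        # decoder weight initialization: encoder transpose

x = blocks[0]
y = Phi_init @ (x - mean)        # mean-centred measurements
x_hat = W_dec_init @ y + mean    # initial PCA reconstruction

# The principal directions are orthonormal, and projecting onto them can
# only shrink the centred reconstruction error
assert np.allclose(Phi_init @ W_dec_init, np.eye(m))
assert np.linalg.norm(x - x_hat) <= np.linalg.norm(x - mean) + 1e-9
```

From this starting point, gradient descent only needs to refine the weights rather than discover the measurement subspace from random noise.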

The proposed initialization method for the network weights has several advantages. While a neural network has to be retrained in order to obtain the measurement matrix

As previously mentioned, the first part of the proposed network consists of a linear autoencoder. Non-linearities can be easily introduced into the measurement and reconstruction part of the network to further improve the initial reconstruction obtained by the autoencoder. In our proposal, non-linearities are only introduced into the decoding part of the network. Although there are some methods that learn a non-linear measurement operator from the data (Mousavi

Contrast-adjusted visualization of the learned residual for several test images and for the measurement ratio

The output of the proposed convolutional autoencoder represents a preliminary reconstruction of the input image from its low-dimensional measurements. We feed the preliminary reconstruction to a residual network (He

Figure

Reconstructing the high-frequency content in the original image (i.e. edges, texture) is problematic for the linear autoencoder, and the residual network helps to alleviate this problem. Problems occur partly due to the fact that the lower frequency content is dominant in natural images and the learned measurement filters have a low-pass character, and partly due to the choice of the loss function used for training the network. It is known that the MSE loss function yields blurry images (Kristiadi,

In this paper, we fuse the per-pixel reconstruction loss in the autoencoder with the perceptual loss in latent space in the residual network. This is in contrast with Du

Residual learning block. The residual learning block consists of 3 convolutional layers.

The pixel-wise Euclidean loss function for the autoencoder is defined as the squared l2 distance between the input image x and its intermediate reconstruction x̂, i.e. L(x, x̂) = ‖x − x̂‖₂².

The residual part of the proposed network is trained separately from the autoencoder part using perceptual loss function
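A perceptual loss of this kind, i.e. a Euclidean loss computed in a latent space rather than pixel space, can be sketched as follows. The linear encoder used here is a random stand-in for the paper's learned measurement operator:

```python
import numpy as np

def perceptual_loss(x, x_hat, Phi):
    """Euclidean loss in the latent space: the reconstruction is compared
    to the original through the encoder Phi rather than pixel by pixel."""
    return float(np.sum((Phi @ x - Phi @ x_hat) ** 2))

rng = np.random.default_rng(4)
Phi = rng.standard_normal((16, 64))   # stand-in for the learned encoder
x = rng.standard_normal(64)

# Identical signals give zero loss; a perturbation increases it
assert perceptual_loss(x, x.copy(), Phi) == 0.0
assert perceptual_loss(x, x + 0.1, Phi) > 0.0
```

Because the loss is measured after encoding, reconstructions are penalized for losing semantically informative structure rather than for small per-pixel deviations, which is what drives the sharper results at very low measurement rates.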

In this section, we discuss the details of our network training procedure. We use

Adam optimizer (Kingma and Ba,

We perform a series of experiments to corroborate previous discussions and observations. In order to achieve a fair comparison framework, a set of 11 images (

In Section

Comparison of linear autoencoder and PCA in terms of reconstruction PSNR [dB].

Method | PSNR [dB] at four measurement rates
PCA | 31.45 | 27.11 | 23.95 | 20.56
Linear autoencoder | 31.39 | 27.06 | 23.92 | 20.55

Table

In Fig.

Reconstruction results obtained using linear autoencoder for “

In this section, we compare the proposed CS model to other state-of-the-art learning-based CS methods. To provide a fair comparison, we compare our method only to similar methods which use an adaptive linear encoding part.

We compare our method to the ImpReconNet (Lohit

Reconstruction results obtained using the learned measurement matrix. Table contains mean PSNR reconstruction results for the standard test images at different measurement rates

Method | Mean PSNR [dB] at four measurement rates
ImpReconNet (Euc) (Lohit | 26.59 | 25.51 | 23.14 | 19.44
ImpReconNet (Euc + Adv) (Lohit | 30.53 | 26.47 | 22.98 | 19.06
Adp-Rec (Xie | 30.80 | 27.53 | – | 20.33
FCMN (Du | 32.67 | 28.30 | 23.87 | 21.27
| – | – | 19.38 | 18.30
| – | – | 16.72 | 16.80
| 32.00 | 26.36 | 23.67 | 20.51

Reconstruction results for “

On the one hand, FCMN and ImpReconNet yield results similar to our method in terms of PSNR (see Table

Reconstruction results for “

The iterative nature and high computational complexity of traditional CS reconstruction algorithms are their main drawbacks. Learning-based methods for CS image reconstruction offer an efficient alternative to the traditional approach. Average per-image reconstruction time for a set of images with size

The better performance of learning-based methods in the reconstruction phase comes at an increased cost in the training phase. In order to learn the optimal measurement and reconstruction operators, learning-based methods require an offline training procedure with a relatively large training dataset. Since learning-based methods are data-driven, they are also data-dependent: if the statistical distribution of the training dataset differs significantly from the testing data, the performance of the learned model will degrade. Finally, convolutional block image processing is not applicable in imaging modalities where the measurements correspond to the whole signal and the signal cannot be divided into smaller blocks.

In this paper, we proposed a convolutional autoencoder architecture for image compressive sensing reconstruction, which represents a non-iterative and extremely fast alternative to the traditional sparse optimization algorithms. In contrast with other learning-based methods, we designed a measurement process which enables the model to be trained on normalized, mean-centred measurements, resulting in a significant speedup of the neural network convergence. Moreover, we proposed an efficient initialization method for the autoencoder network weights based on the connection between the learning-based CS approach and principal component analysis. The residual learning network was used to further improve the initial reconstruction obtained by the autoencoder.

A combination of a pixel-wise Euclidean loss function for the autoencoder network training along with a Euclidean loss function in the latent space of the