A Study on Impulse Noise Reduction Using CNN Learned by Divided Images

Random noise, which is a type of impulse noise, and fixed pattern noise are known to be generated during imaging in a CMOS image sensor. In recent years, fixed pattern noise has decreased owing to improvements in CMOS image sensor performance. However, the random noise generated by photon fluctuation during photon detection by the photodiode remains a problem. Many denoising methods have been proposed to remove random noise from images, and we have previously proposed one such method. However, our method requires an appropriate threshold to obtain superior image quality. In this paper, we propose a method that effectively removes the noise superimposed on digital images using Deep Learning, which has attracted attention in the field of image recognition and is applied in various fields. We also report the results of comparing our method with conventional denoising methods.


Introduction
Various kinds of noise, such as fixed pattern noise and random noise, may be superimposed on the image obtained by a CMOS image sensor (1). When an image is affected by noise, pixels with incorrect density values are generated, which lowers image quality. Fixed pattern noise appears at spatially fixed positions and is caused by pixel defects in the light-receiving portion, dark current of the photodiode, and variation in transistor sensitivity arising in the manufacturing process.
Random noise is caused by dark current shot noise, 1/f noise, and photon shot noise, and it appears at random positions in the output image. In recent years, fixed pattern noise has tended to decrease owing to improvements in image sensors (2). However, random noise, which is generated randomly in each pixel by the fluctuation of photons detected by the photodiode, is still a problem. Recently, the amount of incoming light received per unit area by the light-receiving part has tended to decrease in CMOS image sensors, which are required to have high resolution. As a result, the influence of random noise at low illuminance increases.
Much research has been carried out on removing random noise, and one effective method is the Median Filter (MF) (3). This method replaces the pixel of interest with the median density value of the pixels in its peripheral region. However, since this process is applied to all pixels, non-noise pixels are also processed, which degrades the image and adds unnecessary execution time. To address these problems, a denoising method called the Switching Median Filter (SMF) has been proposed (4)(5)(6)(7)(8). This method first determines whether the target pixel is noise, so that only noise pixels are denoised. However, the added complexity of the process slows down the processing speed. Therefore, we have previously proposed the Multi-directional Switching Median Filter (Multi-directional SMF), which denoises in two steps, multi-directional scanning and average processing, using a 2×2 noise detection operator (9).
A switching-type method, such as SMF or Multi-directional SMF, needs a threshold to judge whether the target pixel is noise. However, since the optimal threshold value differs from image to image, it is difficult to find the optimal value for each image, and research on how to determine it is still underway (10). In this paper, instead of a threshold-based method, we propose a novel denoising method using Deep Learning, which has been applied in various fields in recent years and can extract feature values automatically.
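The difference between MF and a switching-type filter can be illustrated with a minimal sketch. This is a generic switching median filter, assuming a 3×3 window, clamped borders, and a simple "distance from the local median" noise test; it is not the specific method of any of the cited papers, nor the Multi-directional SMF, and the function names are hypothetical.

```python
from statistics import median

def window(img, i, j):
    """Collect the 3x3 neighbourhood of (i, j), clamping indices at the border."""
    h, w = len(img), len(img[0])
    return [img[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
            for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def median_filter(img):
    """Plain MF: every pixel is replaced by the median of its neighbourhood."""
    return [[median(window(img, i, j)) for j in range(len(img[0]))]
            for i in range(len(img))]

def switching_median_filter(img, threshold):
    """Generic SMF (assumed noise test): replace a pixel only when it differs
    from the local median by more than the threshold."""
    out = [row[:] for row in img]
    for i in range(len(img)):
        for j in range(len(img[0])):
            m = median(window(img, i, j))
            if abs(img[i][j] - m) > threshold:
                out[i][j] = m
    return out

# A flat 5x5 image with one impulse in the centre.
image = [[100] * 5 for _ in range(5)]
image[2][2] = 255
restored = switching_median_filter(image, threshold=50)
```

With a threshold of 50 the impulse is replaced by the local median, whereas a threshold of 200 leaves it untouched, which illustrates why the choice of threshold matters for this family of methods.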

Proposed Method
Fig. 1 shows an overview of our proposed method. The feature of this method is that split images with superimposed noise are used as the input images for learning. The advantages of this approach are as follows: it can be adapted to an arbitrary image size, it increases the number of learning images, and it denoises with high efficiency. The network consists of two Convolution-layers and two Deconvolution-layers, and the mean square error is used as the loss function.

Convolution-layer
First, convolution on an image is described (11). The subject of this examination is a grayscale image, whose gray value is stored in each pixel. The image size is $W \times W$ pixels, and the pixels are indexed by $(i, j)$ $(i = 0, \ldots, W-1,\ j = 0, \ldots, W-1)$. The value of pixel $(i, j)$ is $x_{ij}$, which is a real value and may be negative. Learning images are filtered with a filter of size $H \times H$. The filter pixels are indexed by $(p, q)$ $(p = 0, \ldots, H-1,\ q = 0, \ldots, H-1)$, and their values are denoted by $h_{pq}$, which takes an arbitrary real value. The convolution of an image, a product-sum calculation defined between an image and a filter, is given as:
$$u_{ij} = \sum_{p=0}^{H-1} \sum_{q=0}^{H-1} x_{i+p,\,j+q}\, h_{pq} \qquad (1)$$
The convolution-layer is a single-layer network that performs the convolution operation described above. Fig. 2 shows an overview of the convolution-layer and the details of convolving several filters with a grayscale image. For the $m$-th filter $h_{pqm}$, the convolution-layer of Fig. 2 computes
$$u_{ijm} = \sum_{p=0}^{H-1} \sum_{q=0}^{H-1} x_{i+p,\,j+q}\, h_{pqm} + b_{ijm} \qquad (2)$$
where $b_{ijm}$ is added as a bias. This bias is set as $b_{ijm} = b_m$ so that it does not depend on the position of the pixel. We obtain $z_{ijm}$ by applying the activation function $f$ to the obtained $u_{ijm}$:
$$z_{ijm} = f(u_{ijm}) \qquad (3)$$
The activation function is a nonlinear function or an identity function applied after the linear transformation in a neural network. Since the ReLU function is often used as the activation function, it was also used in this paper. The ReLU function is expressed as:
$$f(u) = \max(0, u) \qquad (4)$$
$M$ feature maps $z_{ijm}$ are output by the $M$ kinds of filters, and together they form the output of the convolution-layer. The features of the convolution-layer are as follows: the data can be compressed while taking into account the relationships among the data within a region, and the filters necessary for feature extraction can be learned automatically.
Fig. 1. Overview of Our Proposed Method.
Fig. 2. Overview of Convolution-layer (11).
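The product-sum, per-filter bias, and ReLU described in this section can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the filter is applied at "valid" positions only (no padding, stride 1), the input has a single grayscale channel, and the function names are hypothetical rather than taken from the paper or from Chainer.

```python
def conv2d(x, h):
    """Product-sum u[i][j] = sum_p sum_q x[i+p][j+q] * h[p][q] at valid positions."""
    size = len(h)
    rows, cols = len(x) - size + 1, len(x[0]) - size + 1
    return [[sum(x[i + p][j + q] * h[p][q]
                 for p in range(size) for q in range(size))
             for j in range(cols)]
            for i in range(rows)]

def relu(u):
    """ReLU activation: max(0, u)."""
    return max(0.0, u)

def conv_layer(x, filters, biases):
    """Apply each filter, add its position-independent bias, then ReLU.
    Returns one feature map per filter."""
    maps = []
    for h, b in zip(filters, biases):
        u = conv2d(x, h)
        maps.append([[relu(v + b) for v in row] for row in u])
    return maps

# A 3x3 image and a 2x2 diagonal-difference filter, used twice with different biases.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = [[1, 0], [0, -1]]
feature_maps = conv_layer(x, [h, h], [5.0, 0.0])
```

Here every pre-activation value is -4, so the first feature map (bias 5) is all ones and the second (bias 0) is clipped to zero by ReLU.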

Deconvolution-layer
The Deconvolution-layer upsamples the input feature map (12). The filter calculation itself is the same as in the convolution-layer; the layer enlarges the image while maintaining dense information over the whole image after feature extraction by the encoder.
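As a rough illustration of how a deconvolution (transposed convolution) enlarges a feature map, the sketch below scatters a scaled copy of the filter into the output for every input pixel. The stride of 2, the absence of a bias, and the function name are assumptions made for the example; the paper does not state these settings.

```python
def deconv2d(x, h, stride=2):
    """Transposed convolution: each input value adds a scaled copy of the
    filter to the output, placed at `stride` times its own position."""
    size = len(h)
    rows = (len(x) - 1) * stride + size
    cols = (len(x[0]) - 1) * stride + size
    out = [[0.0] * cols for _ in range(rows)]
    for i, row in enumerate(x):
        for j, v in enumerate(row):
            for p in range(size):
                for q in range(size):
                    out[i * stride + p][j * stride + q] += v * h[p][q]
    return out

# A 2x2 map is enlarged to 4x4 by an all-ones 2x2 filter with stride 2.
upsampled = deconv2d([[1, 2], [3, 4]], [[1, 1], [1, 1]])
```

With this particular filter the operation reduces to nearest-neighbour upsampling; a learned filter would instead produce a weighted interpolation.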

Loss Function
The loss function is a function that returns a loss with the predicted value $y$ and the correct answer $t$ as arguments (13). It is used in the final layer of the neural network, and its value decreases as the difference between the predicted value and the correct value decreases. The mean square error is used as the loss function. It is given as:
$$E = \frac{1}{N} \sum_{n=1}^{N} \sum_{i} \left( y_i^{n} - t_i^{n} \right)^2$$
where $N$ represents the number of learning samples, $y_i^{n}$ represents the $i$-th dimension of the predicted value for the $n$-th learning sample, and $t_i^{n}$ is the corresponding correct value.
In this paper, the correct answer data is the original image and the learning data is the original image with noise added.Learning is performed so that the output image as the predicted value approaches the original image as the correct value.In order to minimize the loss, the gradient method is used for Deep Learning.
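A minimal sketch of the mean square error between predicted and correct images is given below. It assumes each image is flattened to a list of pixel values and that the summed squared error is averaged over the number of samples; some frameworks additionally divide by the number of pixels, so the normalisation here is an assumption rather than the paper's exact definition.

```python
def mean_squared_error(predictions, targets):
    """Average, over samples, of the summed squared differences per sample."""
    total = 0.0
    for y, t in zip(predictions, targets):
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(predictions)

# Two flattened 2-pixel "images": network outputs and the clean originals.
outputs = [[1, 2], [3, 4]]
originals = [[1, 0], [0, 4]]
loss = mean_squared_error(outputs, originals)
```

The loss is zero only when every output pixel equals the corresponding pixel of the original image, which is the condition that learning drives the network toward.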

Experiment Environment
Table 1 shows the experimental environment.

The framework of Deep Learning used
In order to implement the proposed method, we used Chainer, one of the Deep Learning frameworks (14). Chainer is a library developed by Preferred Networks Inc. that trains neural networks with backpropagation. The version used is 1.18.0. Chainer has the following features:
- Offered as a library for the Python programming language
- Flexible support for any neural network structure
- Intuitive code through dynamic construction of the computational graph
- GPU support; learning with multiple GPUs can also be described intuitively

Model of Noise
This section describes the random noise used in this paper. The random noise $n(i, j)$ is a uniform random number that takes a value in the interval $(a, a + b)$, where $p$ is the probability that a given value is selected. The noise model is expressed as:
$$x(i, j) = \begin{cases} n(i, j) & \text{with probability } r \\ x_0(i, j) & \text{with probability } 1 - r \end{cases}$$
where $x(i, j)$ is the degraded signal, $x_0(i, j)$ is the signal of the original image, and $r$ is the ratio of the noise.
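The noise model can be simulated with the short sketch below. It assumes that each pixel is corrupted independently with probability equal to the noise ratio and that the noise value is drawn uniformly from 0 to 255 for an 8-bit image; the interval bounds, the seed parameter, and the function name are illustrative assumptions.

```python
import random

def add_random_noise(img, ratio, low=0, high=255, seed=None):
    """Replace each pixel, with probability `ratio`, by a uniform random
    integer in [low, high]; other pixels keep their original value."""
    rng = random.Random(seed)
    out = [row[:] for row in img]
    for i in range(len(img)):
        for j in range(len(img[0])):
            if rng.random() < ratio:
                out[i][j] = rng.randint(low, high)
    return out

# Corrupt roughly 30% of the pixels of a flat 100x100 image.
clean = [[128] * 100 for _ in range(100)]
noisy = add_random_noise(clean, ratio=0.3, seed=1)
```

Because the noise value can by chance equal the original pixel value, the fraction of visibly changed pixels is slightly below the nominal ratio.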

Learning Method
For learning, 5,200 8-bit grayscale images of 256×256 pixels, randomly extracted from ImageNet (a large-scale image dataset), are used. Learning is performed with five image division sizes: 8×8, 16×16, 32×32, 64×64 and 128×128 pixels. The noise added to the images during learning is random noise with five noise densities: 10%, 20%, 30%, 40% and 50%. The Adam solver is adopted to optimize the network parameters (15). The initial step size is 0.001, the minibatch size is set to 100, and learning is repeated 300 times. Two points are evaluated: the image division size suitable for image denoising, and the image quality of the output image when the noise density of the image data used for learning is changed.
In addition, the proposed method is compared with three conventional denoising methods: MF (median filter), SMF (switching median filter) and Multi-directional SMF, which was developed in our laboratory.
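The image division step can be sketched as follows. The sketch assumes non-overlapping tiles in row-major order and an image size that is an exact multiple of the tile size; whether the original implementation used overlapping tiles is not stated, and the function names are hypothetical. Under this assumption a 256×256 image yields 1024, 256, 64, 16 and 4 tiles for division sizes of 8, 16, 32, 64 and 128 pixels, respectively.

```python
def split_image(img, size):
    """Split an image into non-overlapping size x size tiles (row-major order)."""
    height, width = len(img), len(img[0])
    if height % size or width % size:
        raise ValueError("image size must be a multiple of the tile size")
    return [[row[c:c + size] for row in img[r:r + size]]
            for r in range(0, height, size)
            for c in range(0, width, size)]

def merge_tiles(tiles, height, width):
    """Reassemble tiles produced by split_image into a full image."""
    size = len(tiles[0])
    per_row = width // size
    out = [[0] * width for _ in range(height)]
    for k, tile in enumerate(tiles):
        r0, c0 = (k // per_row) * size, (k % per_row) * size
        for dr, row in enumerate(tile):
            out[r0 + dr][c0:c0 + size] = row
    return out

# Split an 8x8 test image into four 4x4 tiles and put it back together.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
tiles = split_image(image, 4)
rebuilt = merge_tiles(tiles, 8, 8)
```

Splitting at inference time and merging the denoised tiles afterwards is one way a network trained on fixed-size tiles could be applied to an image of arbitrary size.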

Evaluation Method
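PSNR (Peak Signal-to-Noise Ratio) is used as an index for quantitatively evaluating image quality in the denoising experiments (16). PSNR is expressed as:
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ [\mathrm{dB}]$$
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left( s_{ij} - \hat{s}_{ij} \right)^2$$
where MSE is the mean square error between the original image and the restored image, $M$ and $N$ are the sizes of the image, $s_{ij}$ is the density of the original image, and $\hat{s}_{ij}$ is the density of the restored image at $(i, j)$. Since PSNR increases as the error between the original image and the restored image decreases, a larger PSNR value indicates better image quality.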
In this paper, twelve images from SIDBA (Standard Image Data-BAse) are used as test images. Fig. 3 shows the test images. Random noise is added to the test images at densities of 10%, 20%, 30%, 40% and 50%. The test images are then denoised using the model learned for each image division size and each added noise density. Finally, the PSNR of each resulting image is measured and the average PSNR is calculated.
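The evaluation procedure can be sketched as below: PSNR is computed for each restored test image and then averaged. The peak value of 255 is an assumption based on the 8-bit grayscale images used in this paper, and the function names are illustrative.

```python
import math

def mse(original, restored):
    """Mean square error between two images of identical size."""
    m, n = len(original), len(original[0])
    return sum((original[i][j] - restored[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)

def psnr(original, restored, peak=255):
    """PSNR in dB; infinite when the two images are identical."""
    err = mse(original, restored)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

def average_psnr(pairs):
    """Average PSNR over (original, restored) image pairs."""
    values = [psnr(o, r) for o, r in pairs]
    return sum(values) / len(values)

# Every pixel is off by 10, so MSE = 100 and PSNR = 10 * log10(255^2 / 100).
original = [[100, 100], [100, 100]]
restored = [[110, 110], [110, 110]]
score = psnr(original, restored)
```

For this example the PSNR is about 28.13 dB; a uniform error of 5 instead of 10 would raise it by about 6 dB, since halving the error quarters the MSE.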

Result of Proposed Method
Fig. 4 shows the results of denoising images with random noise for each division size at the above-mentioned noise densities of 10%, 20%, 30%, 40% and 50%. Each test image was denoised using the model learned from images with noise added at the same density. According to Fig. 4, the average PSNR of the resulting images is lowest for the division size of 8×8 pixels. The results for the division sizes of 32×32 and 64×64 pixels were almost the same and were the best. Therefore, the suitable division size for image denoising is considered to be 32×32 or 64×64 pixels. Fig. 5 compares the denoising performance of the proposed method with that of the conventional methods. The result for the division size of 64×64 pixels was used as the result of the proposed method.
According to Fig. 5, the proposed method gives the best result. In particular, when the noise density is 10%, the PSNR of the proposed method is about 3 dB higher than that of Multi-directional SMF, which gives the best result among the conventional methods. When the noise density is 20% to 50%, it is about 2 dB higher than that of Multi-directional SMF.
In addition, when the noise density is low, the image quality tends to be better when the division size of the images used for learning is small. On the other hand, when the noise density is high, the image quality is better with a larger division size.
To explain the reason, Fig. 6 shows the original image of Lenna divided into 16×16 pixels and the corresponding noise-added image. Similarly, Fig. 7 shows the original image divided into 128×128 pixels and the corresponding noise-added image. According to Figs. 6 and 7, at a noise density of 10%, edges can be distinguished from noise more easily with the division size of 16×16 pixels than with 128×128 pixels. However, at a noise density of 50%, it becomes difficult to distinguish edges from noise when 16×16 pixels is used as the image size for training the neural network, because the proportion of noise added to edge regions is larger than at 10% noise. Conversely, it is considered that, with the division size of 128×128 pixels, noise is easier to distinguish from the geometric characteristics present in the image than with 16×16 pixels.
Therefore, it is better to enlarge the division size of the images used for learning as the noise ratio increases.

Comparison of denoising performance
Next, the denoised images are shown. The image Lenna was used as a representative of the test images. Fig. 8 shows the original image and the images with random noise at densities of 10%, 30% and 50%. Figs. 9, 10 and 11 show the results of image denoising by the conventional methods and the proposed method (64×64 pixels).
According to Fig. 9, when the noise density is 10%, MF removes almost all the noise, but the edges of the image are blurred. SMF and Multi-directional SMF, on the other hand, keep the edges clear, but some noise is left unremoved. In contrast, the proposed method removes the noise while keeping the edges clear.
According to Fig. 10, when the noise density is 30%, Multi-directional SMF and the proposed method remove noise more effectively than MF and SMF. In addition, the proposed method keeps edges clearer than SMF.
According to Fig. 11, when the noise density is 50%, the proposed method gives the best denoising result. However, unlike at the other noise ratios, image edges are not kept clear by the proposed method, and consequently the image looks like a mosaic. This remains a future task.

Conclusions
We proposed an image denoising method using Deep Learning and compared it with conventional methods. The experimental results showed that the suitable image division sizes for image denoising are 32×32 and 64×64 pixels. Moreover, the proposed method was superior to the conventional methods at all noise densities, and it removed noise while preserving edges better than the conventional methods. However, in this paper, noise was removed using a model learned with the same noise ratio as that of the noise added to the test image.
For that reason, we would like to examine in future work the case where the noise ratio of the images used for learning differs from the noise ratio of the image that is actually denoised.
In addition, this paper focused on denoising performance with respect to the image division size. In the future, we will also investigate the relationship between learning time and image division size, and carry out experiments on the speed of the proposed method compared with other methods.

Fig. 6. Original image and noise-added image of Lenna divided into 16×16 pixels.

Fig. 7. Original image and noise-added image of Lenna divided into 128×128 pixels.

Fig. 8. Original image and noise-added images used in the experiments.

Table 1. Experiment Environment.