Toward Image Restoration Based on Neural Network and Similarity Comparison

This paper presents a method for filling in missing data in an image. A mask template window is set up, and the order in which the missing pixels are to be imputed is calculated. The data inside the template are then used to train a neural network, and the missing pixels are imputed with the values predicted by the trained network. The experimental results show that in several cases the proposed algorithms outperform traditional methods.


I. INTRODUCTION
One of the most challenging problems today is the occurrence of incomplete data in many kinds of data sets, for example image data sets, geostatistics data sets, and time-series data sets. Imputation methods for reconstructing damaged areas have been studied extensively. Most conventional methods use the nearest pixels to restore the damaged areas. In recent work, several different imputation methods have been proposed. The methods used to impute the missing area inside a damaged image can be divided into three groups: machine learning based algorithms, statistics based algorithms, and image processing based algorithms.
In the first group, machine learning based algorithms, Takahiro et al. [2], for example, used an algorithm based on a kernel principal component analysis (PCA)-based projection onto convex sets (POCS). In this algorithm, a non-linear eigenspace is constructed from each kind of texture, and the optimal subspace for the target local texture is introduced into the constraints of the POCS algorithm. A limitation of this algorithm is that the size of the local image and the number of clusters are set manually.
In the second group, statistics based algorithms are used to impute the missing values. Most methods in this group use a clustering similarity measure. The best-known method for comparison is the Wald-Wolfowitz test of randomness. This technique uses a runs test, where a run is a consecutive sequence of identical labels between two clusters, and the number of runs is used as the test statistic [1]. The Wald-Wolfowitz test of randomness was used to find the area most similar to the surroundings of the damaged area by comparing those surroundings with other areas. The most similar area was then used to impute the missing area. Hu et al. [8] proposed an eigenspace-based image imputation method for the case in which the missing area in an image is surrounded by characters. They used the BPLP method, which is based on the self-correlation in the image and uses only one image. However, parameters such as the size of the clipping window and the step width of the window were determined experimentally, and these parameters may affect the precision of the interpolation.
In the third group, image processing based algorithms, Grover et al. [3], for example, introduced a technique for filling in a missing area by texture filling based on a Laplace equation with a Dirichlet boundary condition. The main asset of this method is that it uses a bounded search window. The method fills in a missing area from the outside to the inside. Hui et al. [4] proposed a regularization based approach that recovers degraded images by enforcing an analysis-based sparsity prior of images in the tight frame domain. Telea [5] proposed an image inpainting algorithm based on the fast marching method. The limitation of this algorithm is the blurring produced when the inpainted regions are thicker than 10-15 pixels [5]. Huan et al. [13] introduced an image inpainting algorithm based on a two-step process. In the first step, the filling order of the missing pixels is determined by using the fast marching method. In the second step, a block of texture is computed by using a search process and an SSD measurement. Criminisi et al. [7] developed an exemplar-based inpainting method that uses the magnitude of the gradient and the observed pixels of the image to define the filling order in the target region. The missing region of the image is filled with source patch blocks ordered by priority. This is an efficient imputation method that can preserve the linear structure in the missing area. However, the filling order in the method is random and unreliable, and it appears to suffer from the phenomenon of growing garbage [13]. Bertalmio et al. [11] proposed an image inpainting method based on smooth propagation of information from the surrounding areas in the direction of the isophotes.
To impute the missing pixels inside a damaged image, this research uses a hybrid imputation method that combines a neural network with a similarity measurement based on the characteristics of the damaged areas. The method rests on the assumption that the properties of a missing pixel should be closer to the properties of nearby observed pixels than to those of more distant pixels. Moreover, we have found that if the missing area is large, the neural network method cannot be used for imputation in some cases. Instead, we impute the pixels in the missing area by finding a similar area in the original image with a clustering method. Clustering is a method for classifying a data set into groups of similar data. The clustering process is based on a clustering similarity measure, and several similarity measures between two clusters exist. Torres [10] proposed a similarity measure that uses the Euclidean distance between cluster centroids and the Pearson correlation. Bae [6] proposed a measure named Attribute Distribution Clustering Orthogonality (ADCO), which considers the density profile of each attribute. This method takes into account the distribution of the data points in each attribute and the shape of each cluster. Dong [14] proposed a similarity measure based on cosine similarity-based negative selection algorithms for time series detection. All of these methods measure the similarity between two clusters.
In this paper, we propose a feasible method for imputing an incomplete image. The proposed method is based on a neural network and on the similarity of each group of data. The details of our concept are discussed in Section II. The experimental setup is described in Section III. In Section IV, the effectiveness of our method is verified by experimental results. Concluding remarks are presented in Section V.

A. Problem Formulation
Let M be a matrix of the pixels inside an image, where $p_i$ is the attribute vector for pixel i, c is the number of attributes, and n is the number of pixels.
We assume that each $p_i$ is composed of two parts, called input attributes and output attributes. The vector for pixel i is $p_i = (x_i, y_i, z_i)$, $1 \le i \le n$, where $(x_i, y_i)$ gives the position of pixel i and $z_i$ gives its intensity. In this case, $(x_i, y_i)$ are called the input attributes and $z_i$ is called the output attribute. The two parts are related in that the output attribute $z_i$ of pixel i (the intensity) is regarded as a function of the input attributes $(x_i, y_i)$ (the position) of pixel i.
In this paper, we assume that missing data consist only of missing output data, i.e., the values of $z_i$ are unknown for some set of pixels but the corresponding $(x_i, y_i)$ values are always known. If the output data of pixel i are missing, its attribute vector is written as $p_i^m$ and is called a target vector for imputation. The process for imputing the missing pixels can be described as follows.
The first step is to set the mask template window, starting with 3x3 pixels. The order in which the missing pixels are to be imputed is then calculated. This order is kept in a queue built with the marching method [9], [12], [13]. Once the order has been calculated, the missing pixels are extracted from the queue one at a time. At each extracted position, the pixels surrounding the missing value are checked. If the missing pixel is surrounded by observed pixels in all directions, as shown in Fig. 1, a neural network is used to impute the missing value. If the area cannot be imputed with the neural network because the damaged area is too large, the similarity technique is used to impute the missing pixels.
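The ordering and the surrounding-pixel check described above can be sketched as follows. This is only an illustration under stated assumptions: a breadth-first sweep from the observed pixels stands in for the marching method of [9], [12], [13], the positions T_2, T_4, T_6, and T_8 are assumed to be the four axial neighbours of the window centre, and the function names `filling_order` and `surrounded_by_observed` are hypothetical.

```python
from collections import deque

def filling_order(mask):
    """Return missing pixels ordered from the edge of the hole inward.

    mask[r][c] is True where the pixel is missing. A breadth-first sweep
    is used here as a simple stand-in for the marching method.
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the sweep with every observed pixel at distance 0.
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    order = []
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                order.append((nr, nc))   # only missing pixels reach this point
                queue.append((nr, nc))
    return order

def surrounded_by_observed(mask, r, c):
    """True if the four axial neighbours (assumed T2, T4, T6, T8) are observed."""
    rows, cols = len(mask), len(mask[0])
    for dr, dc in ((-1, 0), (0, -1), (0, 1), (1, 0)):
        nr, nc = r + dr, c + dc
        if not (0 <= nr < rows and 0 <= nc < cols) or mask[nr][nc]:
            return False
    return True
```

In such a sketch, a pixel for which `surrounded_by_observed` returns True would go to the neural network step, and the remaining pixels would be left to the similarity technique.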

B. Adjustable neural network
In this section, a new method is introduced for imputing missing values based on an adjustable neural network, which serves as the primary imputation step. The neural network process for imputing missing values can be described as follows. Let I be the matrix of the image data set with n × c pixels. Let $p_i$ be the vector for pixel i at position $(x_i, y_i)$, and let $(x_i, y_i)$ be the input attributes of $p_i$. If the output data of pixel i are missing, its attribute vector is written as $p_i^m$. If a missing pixel occurs at position $(x_i, y_i)$, a mask template window of size w × w pixels, denoted by T, is set. The window size is determined experimentally so that the window covers a large number of observed data surrounding the damaged area. Next, the centre $(x_c, y_c)$ of the mask template window is superimposed on the position $(x_i, y_i)$ of the missing value $p_i^m$. After a suitable value of w has been selected, the pattern of observed positions inside the mask template window is searched.
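As an illustration of the window step, the sketch below collects the observed pixels inside a w × w window centred on a missing pixel. The growth rule (3, 5, 7, ... until at least four observed pixels are covered) and the names `window_patterns`, `grow_window`, `min_patterns`, and `w_max` are assumptions made for this example; the paper itself only states that w is chosen experimentally.

```python
def window_patterns(image, mask, r, c, w=3):
    """Collect observed pixels inside a w x w window centred on (r, c).

    Returns a list of ((x, y), z) pairs: the position is the input
    attribute and the intensity z is the output attribute.
    """
    half = w // 2
    rows, cols = len(image), len(image[0])
    patterns = []
    for y in range(r - half, r + half + 1):
        for x in range(c - half, c + half + 1):
            inside = 0 <= y < rows and 0 <= x < cols
            if inside and not mask[y][x]:
                patterns.append(((x, y), image[y][x]))
    return patterns

def grow_window(image, mask, r, c, min_patterns=4, w_max=9):
    """Enlarge the window (3, 5, 7, ...) until enough observed pixels are covered.

    The growth rule is an assumption of this sketch, not the paper's rule.
    """
    w = 3
    while w <= w_max:
        patterns = window_patterns(image, mask, r, c, w)
        if len(patterns) >= min_patterns:
            return w, patterns
        w += 2
    return None, []
```

Returning `(None, [])` when no window up to `w_max` is sufficient gives the caller a natural point at which to fall back to the similarity technique.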
Fig. 1 shows an example of the mask template window used in the imputation process, in which the data at positions $T_2$, $T_4$, $T_6$, and $T_8$ become the training samples. Once the training samples have been found, the intensities of these pixels are used as targets. Consider the following cases. If the pattern of missing pixels inside the mask template window is as shown in Fig. 1, the training data set has only four inputs. If, instead, more than four observed values are present in the window, such as when observed data occur at positions $T_6$, $T_7$, $T_8$, and $T_9$, these data are also used as training samples for the adjustable neural network. The training process is therefore not restricted to four points. The number of patterns used in the training process depends on the observed data in the mask template window, for example 4, 5, 6, 7, or 8 patterns.

1. Let $p_i$ be the vector for pixel i at position $(x_i, y_i)$, where $(x_i, y_i)$ gives the position of pixel i and $z_i$ gives its intensity. $p_i^m$ denotes that the output data of pixel i are missing.
2. Let $m_i$ be an index vector denoting whether or not the data at position $(x_i, y_i)$ are missing.
3. Let m be the number of missing pixels and n the number of pixels in the image.
4. Let k be the number of nearest neighbors of vector $p_i^m$.
5. Let $(x_i, y_i)$ be the input attributes of pixel i and $z_i$ the output attribute of pixel i.
6. for i = 1 to n do
7.     if the data $m_i$ are missing then
8.         Let K be the set of nearest neighbors of vector $p_i$ for pixel i at position $(x_i, y_i)$, denoted by $(x_j, y_j) \in K$.
9.         Find the k nearest neighbors of vector $p_i$ based on the minimum Euclidean distance.
10.        Suppose the k nearest neighbors of the missing data have been found.
11.        Use these neighbors as the training patterns, with positions $(x_t, y_t)$ as inputs and intensities $z_t$ as targets, where t is the index of a nearest neighbor of the missing pixel, $1 \le t \le k$.
12.        Train only on the training set, after setting the stopping criteria and the network parameters.
13.        Stop training as soon as the error reaches the predefined mean square error.
14.        Use the weights from the previous step for the imputation step with the following data: $(x_m, y_m)$ is the input pattern and $z_m$ is the desired output.
15.        Use $z_m$ to impute the missing value $p_i^m$.
16.    end
17. end
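The per-pixel part of the procedure above (steps 8 to 15) can be illustrated with a very small network written in plain Python. Everything about the network here is an assumption of the sketch rather than the configuration used in the paper: one hidden layer of three tanh units, full-batch gradient descent, positions centred on the missing pixel, intensities scaled by 255, and the hypothetical names `k_nearest` and `train_and_impute`.

```python
import math
import random

def k_nearest(observed, target, k):
    """observed: list of ((x, y), z); target: (x, y) of the missing pixel."""
    tx, ty = target
    return sorted(observed,
                  key=lambda p: math.hypot(p[0][0] - tx, p[0][1] - ty))[:k]

def train_and_impute(neighbors, target, hidden=3, lr=0.05, epochs=3000,
                     goal_mse=1e-4, seed=0):
    """Fit z = f(x, y) on the neighbours, then evaluate f at the missing pixel.

    Returns (imputed intensity, list of per-epoch mean square errors).
    """
    rng = random.Random(seed)
    tx, ty = target
    # Centre positions on the missing pixel and scale them to about [-1, 1].
    scale = max(max(abs(x - tx), abs(y - ty)) for (x, y), _ in neighbors) or 1
    data = [(((x - tx) / scale, (y - ty) / scale), z / 255.0)
            for (x, y), z in neighbors]
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]  # last entry: bias

    def forward(u, v):
        h = [math.tanh(w[0] * u + w[1] * v + w[2]) for w in w1]
        return h, sum(a * b for a, b in zip(w2, h)) + w2[-1]

    history = []
    for _ in range(epochs):
        g1 = [[0.0] * 3 for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        mse = 0.0
        for (u, v), t in data:
            h, out = forward(u, v)
            err = out - t
            mse += err * err / len(data)
            for j in range(hidden):
                g2[j] += err * h[j]
                back = err * w2[j] * (1.0 - h[j] ** 2)
                g1[j][0] += back * u
                g1[j][1] += back * v
                g1[j][2] += back
            g2[-1] += err
        history.append(mse)
        if mse <= goal_mse:              # step 13: predefined MSE reached
            break
        # Gradient step on half the mean square error.
        for j in range(hidden):
            for i in range(3):
                w1[j][i] -= lr * g1[j][i] / len(data)
            w2[j] -= lr * g2[j] / len(data)
        w2[-1] -= lr * g2[-1] / len(data)

    _, out = forward(0.0, 0.0)           # the missing pixel sits at the origin
    return min(255.0, max(0.0, out * 255.0)), history
```

Because the network is retrained for every missing pixel, the number of training patterns simply follows the number of observed neighbours, which is one way to read the term "adjustable" used above.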

C. Similarity measurement between two clusters.
The similarity measurement can be performed by measuring the similarity of the shape and the distribution of the two clusters. This technique is used either when the damaged area is large or when the neural network cannot be used to impute the damaged image.
To measure the similarity between two sub-images, the density properties of these sub-images are used. In general, a sub-image is called a cluster. The target sub-image, or target cluster, is denoted by $\Omega_T^m$. The reference sub-image, or reference cluster, which is compared with the target cluster, is denoted by $\psi_R^i$. The similarity between two clusters can be assessed by comparing the similarity of the manifolds of the two images, and it can be checked by measuring the distribution of the pixels in the two clusters as follows. The Wald-Wolfowitz test of randomness, which is a two-sample test, was adapted for this purpose. The idea is to compare the shape and the distribution of the two relevant clusters. The main concept is to check the probability that a candidate cluster originally came from the same distribution as the target cluster. Let $\Omega_T^m = \{t_1, t_2, t_3, \ldots, t_{n_1}\}$ be a target cluster, where $t_i$ is the vector for pixel i in the target cluster and the output attribute of some $t_i$ contains a missing value. Let $\psi_R^i = \{r_1, r_2, r_3, \ldots, r_{n_2}\}$ be a reference cluster, which is compared with the target cluster to check the chance that the two clusters came from the same distribution; $r_i$ is the vector for pixel i in the reference cluster.
The test uses runs, where a run is defined as a consecutive sequence of identical labels. The number of runs is used as the test statistic. The distribution similarity measure between two clusters is calculated by applying the Wald-Wolfowitz test for multi-dimensional data sets as follows.

1) Merge the two clusters into one cluster, denoted by matrix A.
2) Calculate the minimum spanning tree over the points of the two clusters in matrix A.
3) From the minimum spanning tree of step 2, calculate the test statistic R as the number of connections between the two different groups in matrix A. If an edge connects two different groups, increase R by one. Otherwise, if the edge connects points within the same group, do nothing. Continue until all edges have been processed. Finally, increase R by one.
4) Calculate the test statistic W from the value of R.
5) Calculate the p-value of W in the same way as for a Z-statistic.
6) Compare the p-value of W with α (95% confidence level) under the following hypotheses:
   a) $H_0$: the two clusters come from the same distribution.
   b) $H_1$: the two clusters come from different distributions.
   If the p-value of W is greater than α, the null hypothesis $H_0$ is accepted. Otherwise, the null hypothesis is rejected.
7) Add each reference cluster that is accepted by the test, and therefore regarded as coming from the same distribution as the target cluster, to the similarity clustering list ($C_{list}$).

After the cluster most similar to the target cluster has been selected, this reference cluster is used to impute the missing values.
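Steps 1) to 6) can be sketched in plain Python as follows. The standardization used for W is the usual conditional mean and variance of the multivariate runs statistic (the Friedman-Rafsky form), and the p-value is taken from the left tail of the normal distribution; both choices are assumptions of this sketch, since the exact expression used in the paper is not reproduced here. The names `mst_edges` and `runs_test` are hypothetical, and the sketch requires at least four points in total.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns index pairs."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def runs_test(target, reference, alpha=0.05):
    """Multivariate runs test; returns (R, W, p_value, accept_h0)."""
    merged = list(target) + list(reference)           # step 1
    n1, n2 = len(target), len(reference)
    n = n1 + n2
    edges = mst_edges(merged)                         # step 2
    # Step 3: count edges joining different clusters, then add one.
    r = 1 + sum(1 for i, j in edges if (i < n1) != (j < n1))
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    c = sum(d * (d - 1) // 2 for d in degree)         # edge pairs sharing a node
    mean = 2.0 * n1 * n2 / n + 1.0
    var = (2.0 * n1 * n2 / (n * (n - 1))) * (
        (2.0 * n1 * n2 - n) / n
        + (c - n + 2.0) / ((n - 2) * (n - 3))
        * (n * (n - 1) - 4.0 * n1 * n2 + 2.0)
    )
    w = (r - mean) / math.sqrt(var)                   # step 4
    p = 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))    # step 5: left-tail p-value
    return r, w, p, p > alpha                         # step 6
```

Few edges between the two groups (a small R) mean that the clusters are well separated, which gives a small p-value and a rejection of $H_0$; well-mixed clusters give a large R and are accepted into $C_{list}$.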

III. EXPERIMENTAL SETUP
The following data, parameters, and comparison results were considered in the experiments. Test data with different characteristics of damaged images were considered. In this paper, non-randomly missing pixels were studied.

A. Selection of algorithms
The following algorithms were used to impute missing values in the damaged area.
1) Criminisi's algorithm [7].
2) Huan's algorithm [13].

B. Data set descriptions
The data sets used in the experiments differ in data density, size, and characteristics. Three damaged standard image data sets were used: the Lena image, the two circles image, and the window image [13]. The characteristics of these damaged images are shown in Fig. 2(a), Fig. 3(a), and Fig. 4(a).

C. Performance of algorithms
The performance of the proposed algorithms was compared with that of the other algorithms by using the Mean Absolute Percentage Error (MAPE) and the Peak Signal-to-Noise Ratio (PSNR). The MAPE is computed from equation (7), where $\hat{p}_i$ is the pixel intensity imputed by each algorithm, $p_i$ is the original pixel intensity, and n is the number of missing data.
The accuracy of the imputed image is then calculated from equation (8). To evaluate the performance of the proposed imputation algorithms, the PSNR was also used as an indicator. In the definition of the PSNR, m and n are the width and height of the image, and $p_i$ and $\hat{p}_i$ are the intensity values of the original image and of the restored image, respectively. Typical values for the PSNR are between 30 and 50 dB, where higher is better.
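For reference, the two measures can be computed as in the sketch below, which uses the standard textbook definitions of MAPE and PSNR. It is assumed, not confirmed by the text, that these coincide with the paper's equations; the peak value of 255 for 8-bit images and the function names are also assumptions. The accuracy measure of equation (8) is not reproduced.

```python
import math

def mape(original, imputed):
    """Mean absolute percentage error over the missing pixels.

    original and imputed are flat lists of intensities; original values
    must be non-zero.
    """
    n = len(original)
    return 100.0 / n * sum(abs(p - q) / abs(p)
                           for p, q in zip(original, imputed))

def psnr(original, restored, peak=255.0):
    """PSNR in dB for two equal-size images given as lists of rows."""
    m, n = len(original[0]), len(original)    # width, height
    mse = sum((p - q) ** 2
              for row_o, row_r in zip(original, restored)
              for p, q in zip(row_o, row_r)) / (m * n)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

Note that MAPE is evaluated only over the n missing pixels, whereas PSNR is evaluated over the whole m × n image, so the two measures are not directly comparable.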

A. Experimental results
The experimental results obtained with the proposed algorithms and with the traditional methods are presented in this section.
The comparison of the imputed two circles image between the competing algorithms and the proposed algorithms is shown in Fig. 2. The missing image and the images imputed with the proposed algorithms, with Criminisi's algorithm, and with Huan's algorithm are shown in Fig. 2(a)-Fig. 2(d).
The comparison of the imputed window image between the competing algorithms and the proposed algorithms is shown in Fig. 3. The missing image and the images imputed with the proposed algorithms, with Criminisi's algorithm, and with Huan's algorithm are shown in Fig. 3(a)-Fig. 3(d). The difficulty in imputing this image lies in restoring the shapes inside each missing area, for example the straight lines and the shadow from outside. If a conventional method is used, the pattern inside the missing area cannot be restored, as shown in Fig. 3. On the other hand, the proposed method uses the ordering calculated by the marching method before applying the neural network, and it compares the similarity between the observed area and the missing area to re-impute the missing area. This technique can restore the complex structure inside an image, as shown in Fig. 3(b).
The comparison of the imputed Lena image between the competing algorithms and the proposed algorithms is shown in Fig. 4. The original image, the missing image, and the images imputed with the proposed algorithms, with Criminisi's algorithm, and with Huan's algorithm are shown in Fig. 4(a)-Fig. 4(e). The experimental results show that, to restore the shape of the original image when imputing a missing area, the order in which the missing pixels are imputed must be considered. The results also show that the proposed algorithms give a high accuracy compared with the other competing methods. The proposed algorithms can restore the surface structure of an eye, as shown in Fig. 4.
The results above show the visual quality obtained with the proposed method and with the comparison methods. The accuracy of the algorithms is reported next by using the PSNR and the accuracy measure. The PSNR for each image pattern is shown in Table I. The accuracy of the restored image for each pattern is shown in Table II. The experimental results show that the proposed algorithms give a high PSNR for several missing patterns and several images.

B. Discussions
The experimental results show that the proposed algorithms can be used to impute missing pixels and to reconstruct a damaged image. The reconstructed images give high PSNR values, as shown in Table I. An apparent problem of the proposed method, as of any method, is the edge effect. If a missing pixel is located at the edge of a region of available data, the imputed value may be wrong, which may lead to a low PSNR. This problem will be addressed in future work. Although the proposed algorithms give a satisfactory accuracy for an imputed image, the performance of the restoration is affected by the size of the window of missing pixels. If the gaps are too large, the desired accuracy cannot be reached. Moreover, the method for selecting the nearest neighbors is another problem that will be considered in future work.

V. CONCLUSIONS
This research focused on an imputation technique for damaged images. The proposed ideas are based on using a mask template and training a neural network on the remaining pixels to reconstruct the damaged areas. The accuracy of the imputed image was ensured by a similarity comparison technique, namely the Wald-Wolfowitz test of randomness. The experimental results show that the proposed method gives a high accuracy of imputed data for several data sets.

Fig. 2. Comparison of the imputed two circles image between the competing algorithms and the proposed algorithms. (a) The missing image. (b) Imputed image with the proposed algorithms. (c) Imputed image with Criminisi's algorithm. (d) Imputed image with Huan's algorithm.

Fig. 3. Comparison of the imputed window image between the competing algorithms and the proposed algorithms. (a) The missing image. (b) Imputed image with the proposed algorithms. (c) Imputed image with Criminisi's algorithm. (d) Imputed image with Huan's algorithm.

Fig. 4. Comparison of the imputed Lena image between the competing algorithms and the proposed algorithms. (a) The original image. (b) The missing image. (c) Imputed image with the proposed algorithms. (d) Imputed image with Criminisi's algorithm. (e) Imputed image with Huan's algorithm.

TABLE I. THE PSNR IN EACH IMAGE PATTERN.

TABLE II. THE ACCURACY OF RESTORATION IMAGES.