Spatial resolution improvement of thermal infrared images by learning of real patch pairs

We propose a method that improves the spatial resolution of infrared images by example-based learning using patch pairs taken from real low- and high-resolution images. Applying a super-resolution method to an image requires an assumed image degradation model. In conventional super-resolution techniques for color images, a common degradation model is down-sampling with bicubic interpolation. However, because infrared and color images have different physical characteristics, a color image degradation model is unsuitable for super-resolution of infrared images. That is, the degradation model for infrared images is difficult to express as an explicit function. The proposed method first corrects the positions of the raw infrared images obtained by high- and low-resolution thermal cameras. It then creates a dictionary of patch pairs at corresponding locations in the low- and high-resolution images, and computes the super-resolution image using the sparse modeling principle. When tested on actual data, our method reproduced the temperature distribution more accurately than a method that uses a degradation function model as the down-sampling representation.


Introduction
Thermal cameras, which visualize the temperature distributions of target scenes, can reveal the spatial details of temperature, internal abnormalities in machines, and the presence of humans in dark environments (1)(2). Recently, research has been conducted on a pedestrian separation algorithm that uses infrared images for car-driving assistance (3). Resolution is important when examining the temperature distribution of an object using an infrared camera. An infrared camera with high spatial and temperature resolution yields a temperature distribution that accurately outlines the shape and properties of the target scenes and objects, but such cameras are expensive. The present study aims to improve the resolution of low-resolution infrared images taken by inexpensive infrared cameras.

Super-resolution technique

Methods of increasing the resolution of the image
The spatial resolution of images can be improved by three types of super-resolution (SR) methods: interpolation-based SR, reconstruction-based SR, and example-based SR. Interpolation-based methods estimate the high-resolution pixels by interpolation (for example, linear interpolation), which is suitable for smooth edges and homogeneous regions but not for sharp edges. To overcome this limitation, reconstruction-based methods impose the likelihoods of high-resolution images as regularization conditions, and therefore recover sharper edges than interpolation-based methods (4)(5). Example-based methods are also suitable for images with sharp edges, as they recover the high-frequency components on each patch (rather than on the entire image) using external data (6)(7)(8)(9)(10)(11)(12)(13).
In contrast, the interpolation- and reconstruction-based methods use only the input image. The external data of example-based methods are a set of low- and high-resolution image pairs. First, the relationship between the low- and high-resolution images (called the dictionary) is learned by a training method. Next, the high-resolution images are recovered by consulting the trained dictionary. These methods function well when the trained dictionary contains sufficient information for the image recovery, but may introduce unexpected high-frequency components.
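The limitation of interpolation-based SR at sharp edges can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional temperature profile (a 25 degC background next to a 35 degC object) and uses NumPy's linear interpolation; the function name `upsample_profile` and the values are illustrative only and are not taken from the experiments in this paper.

```python
import numpy as np

def upsample_profile(profile, factor):
    """Up-sample a 1-D temperature profile by linear interpolation."""
    profile = np.asarray(profile, dtype=float)
    n = profile.size
    # Positions of the new samples, expressed in low-resolution pixel units.
    x_new = np.linspace(0, n - 1, (n - 1) * factor + 1)
    return np.interp(x_new, np.arange(n), profile)

# Hypothetical profile with a sharp edge between background and object.
low = np.array([25.0, 25.0, 25.0, 35.0, 35.0, 35.0])
high = upsample_profile(low, factor=4)

# Largest temperature jump between neighbouring samples, before and after.
max_jump_low = np.abs(np.diff(low)).max()
max_jump_high = np.abs(np.diff(high)).max()
print(max_jump_low, max_jump_high)
```

In this sketch the edge is spread over several interpolated samples rather than sharpened: the interpolated profile contains no temperature that was not already implied by the neighbouring low-resolution pixels, which is why external data are needed to restore abrupt boundaries.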

Example-based super-resolution
Example-based methods mainly consist of a learning process and a reconstruction process. The learning process creates a dictionary that captures the correspondence between patches in the low- and high-resolution images. The reconstruction process improves the resolution of the input image by referring to the created dictionary. The advantage of this approach is that high-frequency components not present in the input image can be restored. Further, because high-resolution images are used for learning, a clear image can be output even when the magnification is high. The disadvantage is that the output depends on the amount of learning data and the type of learning images. One method that addresses these disadvantages of learning-based super-resolution is the method of Yang (14). Its learning process obtains bases that abstract the image features from learning images generated by a degradation model using bicubic interpolation, and creates a dictionary composed of pairs of low-resolution and high-resolution bases. Its reconstruction process improves the resolution of the input image by sparse modeling with the created dictionary.

Features of infrared image
Low- and high-resolution infrared images of the same scene are presented in Fig. 1(a) and (b), respectively. The smooth temperature change in the low-resolution image is attributable to the influence of the ambient temperature at the object boundary. In contrast, the temperature change in the high-resolution image naturally follows the shape and properties of the object, and is abrupt at the object boundary. Therefore, to accurately visualize the temperature distribution of the object and scene, the influence of the resolution difference must be taken into account. Figure 1(c) shows the temperature profiles across the subject's head imaged in panels (a) and (b), together with the profile of a degraded image. Here, the degraded image was obtained by down-sampling the high-resolution image and restoring it by bicubic interpolation. The profiles of the degraded image and the low-resolution image do not coincide, even approximately, confirming that the degradation model for color images is inapplicable to infrared images.

Super-resolution for infrared images
The physical properties of infrared images differ from those of visible color images. Therefore, super-resolution techniques designed for color images are unsuitable for processing infrared images. Kouda (15) improved the resolution of infrared images using the displacement between multiple low-resolution infrared images. This method requires multiple infrared images and their corresponding color images as input, and is therefore unsuitable when only one infrared image is available.

Problems when Yang's method is applied to infrared images
Yang employed a degradation model that generates the low-resolution color images by down-sampling the high-resolution images with bicubic interpolation. The image obtained by applying Yang's method to Fig. 1(a) is shown in Fig. 2(a). Figure 2(b) shows the temperature profiles across the subject's head in Fig. 1(b) and Fig. 2(a). The results show that the super-resolution image cannot reproduce the temperature of the raw high-resolution image. Therefore, Yang's method is not suitable for increasing the resolution of infrared images. The reason may be that the high-resolution and low-resolution infrared cameras differ not only in spatial resolution but also in temperature resolution. Although a high temperature resolution is required to capture the temperature of the object in detail, temperature resolution is not taken into account in the degradation process of color images. The degradation process of infrared images must therefore consider both the spatial resolution and the temperature resolution.

Method
The purpose of this method is to improve the resolution by super-resolution that takes the degradation process of the infrared image into account. First, using images obtained by a low-resolution infrared camera and a high-resolution infrared camera, we create a dictionary that reflects the degradation process of the infrared image. Then, the resolution of the input image is improved by super-resolution based on sparse modeling with the created dictionary.

Infrared image position correction by alignment using affine transformation
First, the position shift between the infrared image and the color image captured in one shot by the same infrared camera is corrected by an affine transformation. The output of this step is a position-corrected infrared image. Next, we correct the misalignment caused by the different shooting positions of the high- and low-resolution cameras. The corresponding points for the affine transformation are obtained on the color images, either manually or using the speeded-up robust features (SURF) detector. Every affine transformation applied to a color image is also applied to the corresponding infrared image. As the final output, a pair of position-corrected low-resolution and high-resolution infrared images is obtained.
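The estimation of an affine transformation from corresponding points can be sketched as a least-squares problem. This is a minimal illustration, not the implementation used in this work: the function names `estimate_affine` and `apply_affine`, the point coordinates, and the transformation values are hypothetical, and the corresponding points are assumed to be already available (picked manually or by a feature detector).

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine matrix A such that dst ~= A @ [x, y, 1]."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape[0] < 3:
        raise ValueError("need at least three non-collinear point pairs")
    # Homogeneous source coordinates, one row per point: [x, y, 1].
    X = np.hstack([src, np.ones((src.shape[0], 1))])
    # Solve X @ A.T = dst in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A_T.T

def apply_affine(A, pts):
    """Map (n, 2) points with the 2x3 affine matrix A."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

# Hypothetical transformation: scale by 3.2 and shift by (4, -2) pixels.
A_true = np.array([[3.2, 0.0, 4.0],
                   [0.0, 3.2, -2.0]])
src_pts = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 80.0], [100.0, 80.0]])
dst_pts = apply_affine(A_true, src_pts)

A_est = estimate_affine(src_pts, dst_pts)
print(np.round(A_est, 3))
```

With more than three point pairs the system is over-determined, so small errors in individual correspondences are averaged out. Warping the image itself additionally requires resampling the pixel values, which this sketch does not cover.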

Creating a patch pair dictionary based on real data
The position-corrected low- and high-resolution infrared images are divided into patches, from which the patch-pair dictionary is created. Figure 3 shows the low-resolution patches used in the created dictionary and their corresponding high-resolution patches.
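The collection of patch pairs at corresponding locations can be sketched as follows. The sketch assumes that the two images are already aligned, that a low-resolution patch position maps to the high-resolution image by the ratio of the image sizes, and that raw patches are stored directly as dictionary columns. The function name `extract_patch_pairs`, the stride, and the random test images are hypothetical; how the dictionary entries are selected or learned from such pairs is not shown here.

```python
import numpy as np

def extract_patch_pairs(low_img, high_img, low_size=5, high_size=16, stride=5):
    """Collect flattened low/high-resolution patches at corresponding locations.

    Returns (D_L, D_H) whose i-th columns form one patch pair.
    """
    low_img = np.asarray(low_img, dtype=float)
    high_img = np.asarray(high_img, dtype=float)
    # Ratio between the two image grids (assumed to cover the same scene).
    scale_r = high_img.shape[0] / low_img.shape[0]
    scale_c = high_img.shape[1] / low_img.shape[1]
    low_patches, high_patches = [], []
    for r in range(0, low_img.shape[0] - low_size + 1, stride):
        for c in range(0, low_img.shape[1] - low_size + 1, stride):
            # Top-left corner of the corresponding high-resolution patch.
            hr = int(round(r * scale_r))
            hc = int(round(c * scale_c))
            if hr + high_size > high_img.shape[0] or hc + high_size > high_img.shape[1]:
                continue  # skip pairs whose high-resolution patch leaves the image
            low_patches.append(low_img[r:r + low_size, c:c + low_size].ravel())
            high_patches.append(high_img[hr:hr + high_size, hc:hc + high_size].ravel())
    return np.array(low_patches).T, np.array(high_patches).T

# Hypothetical aligned image pair with a size ratio of 3.2.
rng = np.random.default_rng(0)
low_img = rng.random((30, 40))
high_img = rng.random((96, 128))
D_L, D_H = extract_patch_pairs(low_img, high_img)
print(D_L.shape, D_H.shape)
```

Keeping the pairs as matching columns of two matrices means that a weight vector computed against the low-resolution patches can be applied unchanged to the high-resolution ones.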

Thermal super-resolution processing
The proposed SR method for infrared images improves the resolution of the input image by the sparse modeling of Yang, using the created dictionary. The SR process is illustrated in Figure 4. The input image patch $P_L$ is expressed as a linear sum of $k$ low-resolution patches:

$$P_L = \sum_{i=1}^{k} w_i D_{L,i} \tag{1}$$

Here, $D_{L,i}$ is the $i$-th low-resolution patch selected from the dictionary and $w_i$ is its weight; $k$ is determined by the number of nonzero weights in the sparse solution. The high-resolution patch $P_H$ is estimated as the linear sum, with the same weights, of the high-resolution patches $D_{H,i}$ corresponding to the selected low-resolution patches:

$$P_H = \sum_{i=1}^{k} w_i D_{H,i} \tag{2}$$

The weights $w$ of the selected low-resolution patches are determined as a sparse solution using the Lagrange multiplier $\lambda$:

$$\hat{w} = \arg\min_{w} \; \| x - D_L w \|_2^2 + \lambda \| w \|_1 \tag{3}$$

where $x$ is the input image patch in vector form and $D_L$ denotes the matrix whose columns are the low-resolution dictionary patches. By performing this process over the entire image, a high-resolution image is obtained.
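The weight estimation and patch reconstruction described above can be sketched in a few lines. The sketch assumes that the sparse solution is the minimizer of the L1-regularized least-squares objective ||x - D_L w||^2 + lam * ||w||_1, and it solves this with the iterative shrinkage-thresholding algorithm (ISTA), chosen here only for brevity; the solver, the value of `lam`, the iteration count, and the random dictionaries are assumptions and not the settings of the experiments.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_weights(x, D_L, lam=0.05, n_iter=500):
    """Approximately minimise ||x - D_L w||^2 + lam * ||w||_1 with ISTA."""
    x = np.asarray(x, dtype=float)
    D_L = np.asarray(D_L, dtype=float)
    # Lipschitz constant of the gradient of the quadratic term.
    L = 2.0 * np.linalg.norm(D_L, 2) ** 2
    w = np.zeros(D_L.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D_L.T @ (D_L @ w - x)
        w = soft_threshold(w - grad / L, lam / L)
    return w

def reconstruct_patch(x, D_L, D_H, lam=0.05):
    """Estimate the high-resolution patch with the low-resolution weights."""
    w = sparse_weights(x, D_L, lam)
    return D_H @ w, w

# Hypothetical paired dictionaries: 40 atoms, 5x5 low and 16x16 high patches.
rng = np.random.default_rng(0)
D_L = rng.standard_normal((25, 40))
D_L /= np.linalg.norm(D_L, axis=0)       # unit-norm low-resolution atoms
D_H = rng.standard_normal((256, 40))

x = 2.0 * D_L[:, 7]                      # input patch built from a single atom
p_H, w = reconstruct_patch(x, D_L, D_H)
print(int(np.argmax(np.abs(w))), p_H.shape)
```

In this toy case the input patch is a scaled copy of one dictionary entry, so the largest weight should fall on that entry and the high-resolution estimate is dominated by its paired high-resolution patch. For real images the patches overlap and the per-patch estimates must be merged, which the sketch omits.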

Experimental conditions
The proposed method was compared with a conventional method based on the degradation process of color images and with an interpolation method. The output images of the three methods were compared to confirm the effectiveness of the proposed method. In addition, to confirm the accuracy of the resolution improvement, we prepared a reference image containing the same object as the input image, but photographed by a high-resolution camera. The input image was captured at a distance of 3 m between the camera and the object. The infrared cameras were a FLIR C2 (320 × 240 image pixels) and a FLIR T1050 (1024 × 768 image pixels). The learned dictionary size was 512, and the low- and high-resolution patches were 5 × 5 pixels and 16 × 16 pixels, respectively. Figure 5 shows the experimental results. Whereas the conventional method could not accurately express the temperature change around the edge, the proposed method obtained a smooth temperature change. In addition to the average change with ambient temperature, the proposed method captured the temperature change that follows the shape of the object (as evidenced by the temperature line profile in Figure 5). The temperature of the nose was clearly resolved in the raw high-resolution image, where it was lower than in the input image. The proposed method reproduced this feature, whereas the bicubic interpolation and conventional methods failed to capture it.
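A line-profile comparison of this kind can also be summarized numerically. The following sketch is an assumption about how one might do so: the profiles in this work are compared in Figure 5, and the root-mean-square error (RMSE) used here, the function names, and the temperature values are hypothetical illustrations rather than reported results.

```python
import numpy as np

def line_profile(img, row):
    """Temperature values along one image row."""
    return np.asarray(img, dtype=float)[row, :]

def profile_rmse(estimate, reference):
    """Root-mean-square error between two profiles of equal length."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if estimate.shape != reference.shape:
        raise ValueError("profiles must have the same length")
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))

# Hypothetical single-row images (degC): a reference and two SR outputs.
reference = np.array([[30.0, 34.0, 36.0, 34.0, 30.0]])
sr_a = reference + 0.5                                  # constant offset
sr_b = np.array([[31.0, 33.0, 35.0, 33.0, 31.0]])       # flattened peak

rmse_a = profile_rmse(line_profile(sr_a, 0), line_profile(reference, 0))
rmse_b = profile_rmse(line_profile(sr_b, 0), line_profile(reference, 0))
print(rmse_a, rmse_b)
```

The super-resolved image and the reference must share the same pixel grid before such a comparison, which is why the position correction step precedes it.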

Results and discussion
The proposed method reproduced the temperature distribution in line with the raw high-resolution image. In this experiment, the learned dictionary was formed from six pairs of raw low- and high-resolution images. When the dictionary is built from a sufficient number of paired images, the results are expected to be independent of the type of input image.

Conclusions
We have proposed a method that improves the spatial resolution of thermal infrared images. The method uses patch pairs of actual image data to learn the dictionary of an example-based SR. The positions of the raw infrared images are corrected by an alignment process, and the dictionary includes the learned patch pairs at corresponding locations in the images. In the super-resolution processing, the resolution of the input image is improved by sparse modeling with the created dictionary. The proposed method and the conventional method were tested on raw infrared images captured by a low-resolution infrared camera. The experimental results confirmed that the proposed method improves the spatial resolution of the infrared image and captures the temperature change in the actual data. Therefore, it can represent the detailed temperature changes in target scenes.