Developing a Video Synchronization Framework for Wind Turbine Inspection Videos

Robotic platforms have advanced continuously and rapidly, especially over the last two decades. Apart from automating repetitive tasks in controlled environments, they have also been employed for exploration, mapping, inspection, and monitoring purposes, particularly in areas that are not easily accessible to humans (e.g., underwater, space, and many others). One of the leading research directions in this field is Unmanned Aerial Vehicles (UAVs). Visual inspection is one of the most important application areas in which such robotic platforms are employed, typically in open areas and at a certain distance from the area of interest, for inspection, monitoring, mapping, and similar tasks. Wind turbines are among the most widely used renewable energy sources, and their sustainable operation depends heavily on periodic inspection and maintenance, for which UAVs have been actively deployed to collect optical data. The optical data gathered by UAVs need to be processed for visual inspection of wind turbine blades. In this paper, we discuss a methodology for identifying images, acquired at different times, in which the blades are in nearly the same position, as a step toward fully automatic visual inspection and temporal change detection. We present preliminary experimental results obtained on synthetically generated data.


Introduction
With the recent development of camera-carrying robotic platforms, gathering optical data from areas beyond human reach, or not easily accessible to humans, has become feasible. Visual inspection is one of the most important application areas in which such robotic platforms have often been employed. While existing approaches are able to cope with mapping/inspecting static areas or surfaces (e.g., infrastructure (1), tunnels (2), pipelines (3,4), hulls (5,6), dams (7,8), bridges (9,10), and many others), there is an increasing demand for inspecting constantly moving areas/parts. This has recently been addressed in the context of dynamic scene mapping (11,12).
Wind turbines have been one of the leading renewable energy sources, converting the kinetic energy of wind into electrical energy. According to statistics announced by the World Wind Energy Association (WWEA), worldwide wind power capacity has reached 597 GW, covering approximately 6% of the global electricity demand 1 . Inspection and maintenance are essential for keeping turbines in service in an efficient and long-lasting manner (13,14). Their inspection is labor-intensive and, given their size, presents a dangerous environment for human inspectors. Inspection may also take a long time, leading to long periods out of operation and thus energy loss. Unmanned Aerial Vehicles (UAVs) have been shown to be a useful tool for making the visual inspection of surfaces easier, faster, and safer (14). Most recent studies focus on developing methods for detecting cracks, debris, and/or imperfections in blade images using machine learning/deep learning (13,15,16). Although these methods are effective, they require a relatively large training dataset, considerable computational resources, and a certain amount of expertise, even when testing with pre-trained models. Moreover, their performance relies on the quality of the training data. Inspection systems are usually composed of two main steps, namely, mapping and automatic comparison, to detect changes over time. The mapping part requires efficient methods to synchronize the videos acquired by a UAV from different viewpoints, considering the dynamic scene. Existing methods assume a synchronized multi-camera setting (e.g., surveillance) (12).
In this paper, we discuss an end-to-end combination of image processing methods to automatically synchronize video files for the visual inspection of wind turbines while they are operating, using only image information obtained by a low-cost UAV. In our context, the term synchronization refers to identifying frames among video file(s) in which the wind turbine blades are in the same angular configuration with respect to the camera. Our proposal takes as input a video file in which the wind turbine is imaged nearly fronto-parallel to the camera and extracts the angular information of the blades, to be used later to track changes on blades observed in videos captured at different times. Unlike most of the aforementioned existing methods, our framework does not rely on any method requiring pre-training and/or learning, which would need a certain amount of well-defined training data and computational power. Therefore, the proposed framework has a low computational cost and does not use any sensor information apart from the optical data, allowing it to be used with low-cost UAVs, e.g., simple flying cameras.

Video Synchronization Framework
Our framework assumes video files in which the camera is almost fronto-parallel to the rotating wind turbine and undergoes small motions due to stabilization and environmental effects. The proposed framework uses image-only information to find similar blade positions in images from video files obtained at different times. The overview of the proposed framework is summarized in Algorithm 1. In order to eliminate the background, we adopt a well-known image stabilization approach based on image alignment/mosaicking (17,18). We extract scale-invariant features from the images and compute the transformation between them using RANSAC in order to estimate the background motion (referred to as approxH in Alg. 1). Since objects closer to the camera move more than those farther away, we apply a sampling-based search in order to register the center part of the wind turbine correctly. This is a crucial step, as image subtraction is applied afterward. We generate a small number (e.g., 5) of different motion matrices by slightly changing the translation part of approxH and test them to select the optimalH with minimum residual. We then apply morphological operations to extract regions, mainly the blades of the wind turbine. For each extracted region, we compute several properties, such as the angle between the x-axis and the major axis of the enclosing ellipse, the angle of the bounding box, and the angles of the minimum and maximum Feret diameters. The proposed framework is motivated by the observation that when an image is compared against different images in which the blades are in different positions, the properties extracted for it through Algorithm 1 are expected to remain similar. An example is given in Fig. 2: Image 1 is compared against images 9 and 12 extracted from the same video; after the image subtraction step, the morphological properties obtained for Image 1 remain the same in both cases.
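The residual-based translation search and the orientation measurement can be illustrated with a deliberately simplified sketch. It is illustrative only: it replaces the RANSAC-estimated homography with a purely horizontal translation, and the full set of region properties with a single moment-based orientation angle (the angle of the major axis of the ellipse with the same second-order moments as the region); all function names are ours, not part of the framework.

```python
import numpy as np

def refine_translation(prev, curr, approx, offsets=(-2, -1, 0, 1, 2)):
    """Sample a small number of candidate translations around the
    approximate background motion and keep the one that minimizes the
    subtraction residual (simplified to horizontal shifts only)."""
    best_t, best_res = approx, np.inf
    for d in offsets:
        t = approx + d
        shifted = np.roll(prev, t, axis=1)
        res = np.abs(curr.astype(float) - shifted.astype(float)).mean()
        if res < best_res:
            best_t, best_res = t, res
    return best_t

def region_orientation(mask):
    """Angle (degrees, from the x-axis) of the major axis of the
    ellipse having the same second-order moments as the region."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    return np.degrees(0.5 * np.arctan2(2.0 * mu11, mu20 - mu02))
```

In the actual framework, the residual test plays the role of selecting optimalH among the perturbed motion matrices before subtraction and morphological extraction.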
Once the region properties have been found as a result of the all-against-all comparison, a histogram analysis is carried out in order to find representative region properties for each image. During this procedure, images in which the blades of the wind turbine have the same configuration can also be identified within the same video file. The same process is repeated for each video file obtained at a different time. The representative region properties of each image are then compared across video files, so that corresponding images are identified among different video files. Since the approach relies mainly on angle information, it is invariant to background and image resolution.
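The histogram analysis and the cross-frame comparison can be sketched as follows. This is a minimal illustration: the bin width, number of peaks, and matching tolerance are hypothetical parameters, and the 0/180-degree wraparound of angles (one of the error sources noted in the experiments) is deliberately not handled.

```python
import numpy as np

def representative_angles(angles, bin_width=5.0, n_peaks=3):
    """Histogram the angle measurements collected for one frame and
    return the centers of the n_peaks most populated bins."""
    edges = np.arange(0.0, 180.0 + bin_width, bin_width)
    hist, edges = np.histogram(angles, bins=edges)
    top = np.argsort(hist)[::-1][:n_peaks]
    return sorted((edges[i] + edges[i + 1]) / 2.0 for i in top)

def frames_match(rep_a, rep_b, tol=2.5):
    """Declare two frames synchronized when every representative angle
    of one has a counterpart within tol degrees in the other.
    Note: the 0/180-degree angle wraparound is not handled here."""
    return all(min(abs(a - b) for b in rep_b) <= tol for a in rep_a)
```

The same comparison applies both within one video and between videos, once representative angles have been computed per frame.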

Experimental Results
To test the proposed framework, we created synthetic images using the algorithm outlined in Algorithm 2. We generated three different video files (the merged video is available at https://streamable.com/sc04y). Example frames from each generated video are given in Fig. 1. The videos differ in blade rotation speed, background, and total number of frames (201, 144, and 101, respectively). The image resolution of the first video is 742x875, while that of the second and third videos is 1484x1750. The datasets were created taking into account that objects closer to the camera move more; therefore, two different random translations were used, one for the wind turbine and one for the background.
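The two-layer compositing with independent random translations could look like the hypothetical sketch below, which is not the authors' Algorithm 2: the nearby turbine layer is shifted within a larger range than the distant background to mimic parallax, and the shift ranges are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)

def jitter(max_px):
    """Random per-frame (row, col) translation in pixels."""
    return tuple(rng.integers(-max_px, max_px + 1, size=2))

def compose_frame(background, turbine, turbine_mask):
    """Composite one synthetic frame: the distant background and the
    nearby turbine are shifted by independent random translations,
    the turbine moving more to mimic parallax."""
    bg = np.roll(background, jitter(1), axis=(0, 1))
    fg_shift = jitter(3)
    fg = np.roll(turbine, fg_shift, axis=(0, 1))
    m = np.roll(turbine_mask, fg_shift, axis=(0, 1))
    return np.where(m, fg, bg)
```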
We first test our framework by finding frames with nearly identical blade angles within the same video file. Finding these pairs provides valuable information, e.g., the rotation speed. Using the angle information extracted in Algorithm 1, we compute a histogram and select the three peak angles in order to choose the representative blade angles associated with each frame. We used these angles to search for matching frames both within the same video file and across video files. Table 1 presents the number of correctly and incorrectly identified image pairs and the ratio of correct frames within the same video file, while Table 2 reports the same across video files, for different bin widths during histogram analysis. From the tables, it can be seen that increasing the bin width within the same video file reduced overall performance, as the rotation speed of the blades is constant within a video. The total number of identified image pairs across different video files is relatively low. This is mainly because the same angular configuration of the blades is less likely to occur when the rotation speed differs between videos. Also, while increasing the bin width improved cross-video identification for the first and second video files, the ratio for the third video decreased. From our experiments, we observed that the primary error sources were singularities in the angle representation, imperfect background elimination, and cases in which one of the blades fully or partially overlaps the pole. Since the third video file has a more complex background than the other two, this background caused imperfections during background elimination. The results presented in the tables were obtained using solely the aforementioned extracted information, without any filtering, thresholding, non-maximal suppression among frames, shape prior (e.g., the angle between blades being the same), or frame rate information.
Some of the resulting sample frames are given in Fig. 3.
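For the synthetic videos, where the per-frame rotation speed is known, ground-truth matching pairs can be enumerated under the assumption of three identical blades spaced 120 degrees apart, a shape prior that the reported results deliberately do not use; the helper below is hypothetical and serves only to make the notion of a "correct pair" concrete.

```python
def ground_truth_pairs(n_frames, deg_per_frame, tol=1.0):
    """Enumerate frame pairs whose blade angles agree modulo 120
    degrees (three identical blades), given the known constant
    rotation speed of a synthetic video."""
    pairs = []
    for i in range(n_frames):
        for j in range(i + 1, n_frames):
            diff = ((j - i) * deg_per_frame) % 120.0
            if min(diff, 120.0 - diff) <= tol:
                pairs.append((i, j))
    return pairs
```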

Conclusions and Future Work
Wind turbines play an essential role in generating electricity from wind, and their importance as a source of renewable and clean energy continues to grow. They can be regarded as a mechanical system composed of different parts, and their periodic inspection is indispensable for continuous and sustainable service. UAVs carrying different sensors have been deployed to create visual records of wind turbines in order to facilitate the inspection process. Generally, existing approaches provide tools for data gathering, and the data are further studied by human experts. In this paper, we present an approach to synchronize videos of a wind turbine obtained at different times, while the turbine is operating, using image information only. Our approach can be seen as the first step toward fully automatic inspection. The proposed approach uses the shape/morphological properties of the blades to find similar configurations in another video. It takes as input video files in which the wind turbine is nearly fronto-parallel to the camera and does not use any other information. We present our preliminary findings through experimental results on synthetically generated realistic data (available at https://streamable.com/sc04y), which take into account a certain level of vibration that might be caused by environmental conditions (e.g., wind) and/or the stabilization of the UAV. Our approach is built upon well-established computer vision and image processing algorithms, which have relatively low computational costs and can be run on off-the-shelf computers without requiring extensive expertise.
The presented framework could be enhanced by incorporating additional information that can be derived from the video file, e.g., the rotation speed. As future work, we will evaluate our framework on real data. We will also develop methods for detecting temporal changes on the blades once images in which the blades have similar orientations have been identified in videos acquired at different times.