Implementation of deep-learning-based edge computing for maritime vehicle classification

In recent years, Artificial Intelligence (AI) has transformed almost every field, from the automotive industry to the electrical industry, continuously affecting the dynamics of society and the lives of consumers. Among the many applications of AI, AI at the edge is well suited to accelerating specific workloads. Despite the development of various techniques for surveillance in maritime scenarios, automatic maritime vehicle classification in visual surveillance remains a challenge owing to small datasets and to the complex, unconstrained, and diverse nature of such scenarios. To date, only a few studies have investigated maritime vehicle classification on edge devices. In this paper, data augmentation is applied to address the small-dataset problem caused by the lack of images of military ships. Furthermore, an implementation of deep-learning-based edge computing for maritime vehicle classification on the NVIDIA Jetson Nano is proposed.


Introduction
The "edge" in edge computing refers to terminal equipment such as mobile communication base stations and environmental monitoring sensors. Its counterpart is the cloud server, which is located in the "center". Shifting computation from the center to the edge eases the load on cloud servers and reduces data processing latency. According to Gartner's August 2019 report on emerging technologies (Hype Cycle for Emerging Technologies, 2019) (1), edge AI would take about two to five years to "reach a plateau of productivity". The outlook for edge AI is therefore promising, because it can be applied in almost every type of electronic device, from driverless cars "seeing" pedestrians on the road to coffee machines that receive and respond to voice commands. Applications that require any combination of low latency, data privacy, low power consumption, or low cost are likely to move to the edge for AI inference.
For example, consider smart retail analysis that uses a surveillance camera to study consumer purchasing habits. Under the traditional computing structure, the high-quality video shot by the camera must be uploaded to a cloud server for image recognition and analysis, so large amounts of data need to be transmitted. If image recognition is instead completed next to the camera and only the analyzed data is passed on (the amount of data transmitted per hour may decrease from several GB to several KB), the overall calculation and transmission load can be greatly reduced. The Jetson Nano, launched by NVIDIA in March 2019, is an AI computing platform created specifically for such situations. It has enough data throughput and computing power to cope with 8-channel Full HD image input and real-time analysis, and it supports many deep learning, image recognition, and sensor Software Development Kits (SDKs), which lets developers port applications to the platform quickly. These features make the Jetson Nano one of the world's leading edge devices at present.
Seawater covers more than 70% of the earth's surface, and human beings have relied heavily on the marine environment and its resources from ancient times to the present. International trade also relies heavily on marine transportation. Marine activities are not only extremely frequent but also an important economic lifeline for many countries. Navy warships often encounter various types of vessels at sea, and if an unknown military vessel appears within the patrol range, Target Acquisition must be performed on it. For instance, radar or personnel observation is used to obtain the position of the unknown vessel and related information, after which an optical sensor captures an image of the target. The intelligence personnel on the military vessel can then determine the type of the unknown target. However, intelligence officers are not easy to train: they have to undergo professional training and gather a great deal of experience to be good at their job. Moreover, very few intelligence personnel are usually assigned to a warship, and if they are injured in a conflict, intelligence analysis is directly affected. In addition, identifying military vessels manually incurs considerable time and labor costs, and in a modern battlefield where every minute counts, this can directly affect the outcome of a battle. Hence, identifying the exact type and model of a vessel is crucial in military or rescue scenarios and in surveillance (2,3).
Electro-optical (EO) sensors are low cost, small, and power efficient, and they are suitable for most vehicles. Besides being one of the important sensors in maritime security surveillance, they also play an important role in collision avoidance and automatic visual guidance for UAVs, USVs, and autonomous ships under development (2,4). The EO sensor is therefore heavily used as an important source of data to complement traditional radar and ranging equipment (2,4-8), from traditional monitoring systems to the unmanned ships that are currently being actively developed.
In addition to providing intuitive visual information to personnel, the EO sensor collects abundant video data to which image processing and computer vision can be applied, and these can be further combined with AI technology to achieve automatic calculation and active detection. This improves autonomous Situational Awareness at sea (9) and provides early warning and threat assessment of military vessels. Thus, if classification and identification of military vessels could be performed automatically on the edge device, the perception of autonomous vessels would be improved.
In this study, a maritime vehicle classification device was developed using convolutional neural network (CNN) technology and edge computing. Military photographs, however, are often difficult to obtain (7). Consequently, the initial data were collected by searching for relevant images on the Internet and were then expanded with data augmentation, producing a maritime vehicle classification dataset for training deep learning models. Finally, the edge device was implemented to classify different maritime vessels. It is hoped that this device can reduce time and labor costs and increase combat effectiveness.

Method
The flowchart of the proposed method is shown in Fig. 1. First, the maritime classification dataset was collected from an online image search engine (Google Images) for training and testing the deep learning model for ship classification, especially of military ships. Then, the graphical user interface (GUI) for operation and the deep learning model were coded using the PyTorch library. In addition, data augmentation was applied to increase the diversity and quantity of images. Finally, the ship classification edge device was implemented on the NVIDIA Jetson Nano, a small yet powerful GPU-based edge computing device, to validate the proposed method. The implementation flowchart is illustrated in Fig. 2. First, the maritime vehicle classification dataset is collected from Google Images and augmented for model training, as illustrated in Fig. 2(a). Second, the code is written in Jupyter Notebook, as illustrated in Fig. 2(b). Finally, as illustrated in Fig. 2(c), the model is deployed to the Jetson Nano, which can be combined with a camera.

Images Collection and Data Augmentation
Further, data augmentation was applied to the originally collected image dataset using geometric transforms such as resizing, rotation, shearing, zooming, and horizontal flipping.
Augmented image examples are shown in Fig. 4. The training dataset, which consists of three categories (Military Ship, Cargo, and Fishing Boat), was doubled to 600 images per category, as shown in Table 1, giving a training dataset of 1,800 images. The testing dataset remained unchanged at 100 images, for a total of 1,900 images in the maritime vehicle classification dataset for this study. In the pre-processing step, each image was resized to a resolution of 224×224.

Training and testing deep learning model
Convolutional Neural Networks (CNNs) are considered state-of-the-art models for image recognition. The well-known AlexNet (10) CNN architecture was used to train the deep learning model for image classification. AlexNet contains eight layers with weights: the first five are convolutional and the remaining three are fully connected, as depicted in Fig. 5. AlexNet previously achieved strong results in the ImageNet competition (10).

Fig. 5. AlexNet architecture
The training phase is discussed in two parts. The first part covers the two categories of Japanese and South Korean capital battleships within the military ship classification. The second part covers the three categories of Military Ship, Cargo, and Fishing Boat within the maritime vessel classification.
First, 200 original training images were used for the two categories of capital battleships (Japan and South Korea), and the data were increased tenfold to 2,000 training images for training the two-category classification model, as shown in Table 2. With the 200 original images, as shown in Fig. 6(a), the training accuracy increases over time, whereas the training loss keeps decreasing until it reaches approximately 0. The training loss reaches its minimum after 10 epochs and then stalls. Training is therefore stopped at that point, and the resulting model is selected as the classification model. Notably, the training accuracy and training loss curves are not smooth because the dataset is small.
In contrast, with the 2,000 augmented training images in Fig. 6(b), the training accuracy increases linearly over time until it reaches approximately 100%, whereas the training loss keeps decreasing linearly until it reaches approximately 0. The training loss reaches its minimum after 10 epochs and then stalls. Training is therefore stopped at that point, and the resulting model is selected as the classification model.
Next, 900 training images were used for the three categories (Military Ship, Cargo, and Fishing Boat) in the maritime vessel classification, and the data were doubled to 1,800 training images for training the three-category classification model, as shown in Table 1. Training accuracy and training loss are shown in Fig. 7.
Fig. 7. Training accuracy and training loss of maritime vehicles with (a) 900 original images for model training; (b) a total of 1,800 augmented images for model training
In the training process, as shown in Fig. 7, the training accuracy increases linearly over time until it reaches approximately 100%, whereas the training loss keeps decreasing linearly until it reaches approximately 0. The training loss reaches its minimum after 10 epochs and then stalls. Training is therefore stopped at that point, and the resulting model is selected as the classification model.
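The stopping rule described above (halt once the training loss has stalled) can be expressed as a simple plateau check. The sketch below is one possible formulation under assumed settings: the `patience` and `min_delta` values and the example loss curve are hypothetical, since the paper does not state how the stall was detected.

```python
def stopping_epoch(losses, patience=3, min_delta=1e-3):
    """Return the 1-based epoch at which training would stop.

    Training stops once the loss has failed to improve on its best value
    by more than `min_delta` for `patience` consecutive epochs. If that
    never happens, the last epoch is returned.
    """
    best = float("inf")
    stalled = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss      # new best loss: reset the stall counter
            stalled = 0
        else:
            stalled += 1     # no meaningful improvement this epoch
            if stalled >= patience:
                return epoch
    return len(losses)


# Hypothetical loss curve: falls for 10 epochs, then plateaus near zero.
example_losses = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18, 0.12, 0.08, 0.05, 0.03,
                  0.03, 0.03, 0.03, 0.03]
stop_at = stopping_epoch(example_losses)
```

With these assumed settings the curve reaches its minimum at epoch 10 and the check fires three stalled epochs later; a smaller `patience` would stop sooner at the cost of being more sensitive to noisy curves such as the one in Fig. 6(a).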

Implementing ship classification device
The NVIDIA Jetson Nano kit is a small artificial intelligence computer with high performance and energy efficiency. It can execute modern AI workloads, run multiple neural networks in parallel, and simultaneously process data from multiple high-resolution sensors. In this study, the NVIDIA Jetson Nano was chosen because of its cost effectiveness and its ability to complete basic deep learning calculations. As shown in Fig. 8, the hardware used in the proposed ship classification device is the NVIDIA Jetson Nano, which provides 4x USB 3.0, USB 2.0 Micro-B, MIPI CSI-2 D-PHY lanes, and HDMI 2.0 and eDP 1.4 display interfaces. With a 128-core CUDA GPU capable of 472 GFLOPS, it is one of the more powerful edge devices on the market, and it enables the development of small, low-power AI systems.
In the maritime vehicle classification edge device, a Raspberry Pi camera (model Pi NoIR Camera V2) was mounted on the NVIDIA Jetson Nano. The trained deep learning model was then deployed on the Jetson Nano and evaluated in a simulation environment. The GUI for operation and the deep learning model were coded using the PyTorch library.
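On-device inference for a single camera frame might look like the sketch below. It is illustrative only: the function name `classify_frame`, the label order in `CLASS_NAMES`, and the stand-in linear model are assumptions, and camera capture and the GUI are omitted so that the example runs without a Jetson Nano, a camera, or trained weights.

```python
import torch
from torch import nn

CLASS_NAMES = ["Military Ship", "Cargo", "Fishing Boat"]  # assumed label order


def classify_frame(model, frame):
    """Classify one pre-processed 3x224x224 frame; return (label, confidence)."""
    # Use the GPU when one is available (as on the Jetson Nano), else the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    with torch.no_grad():
        logits = model(frame.unsqueeze(0).to(device))  # add a batch dimension
        probs = torch.softmax(logits, dim=1)[0]        # class probabilities
    index = int(probs.argmax())
    return CLASS_NAMES[index], float(probs[index])


# Stand-in model so the sketch runs without trained weights or a camera.
dummy_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, len(CLASS_NAMES)),
)
label, confidence = classify_frame(dummy_model, torch.rand(3, 224, 224))
```

In a deployed device, the dummy model would be replaced by the trained network loaded from disk, and `frame` would come from the camera after the same 224×224 pre-processing used in training.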

Experiment Analysis
The two-category military ship classification model and the three-category maritime vehicle classification model trained above were then tested. The data analysis results are shown in Table 3. The target images were read by the device in an actual environment for classification, and the experimental results are shown in Fig. 10. The results show no large difference in accuracy between models trained with and without data augmentation, and classification performance under variations in light, shadow, and scale remained comparable after augmentation with the proposed method and device. Hence, maritime vehicles could still be effectively classified in a real-world simulation of the dimly lit combat intelligence center of a military ship, and the proposed method could improve the usability of the ship classification device.
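Test accuracy of the kind reported in Table 3 can be computed from the true and predicted labels of the test images. The sketch below shows one way to obtain overall and per-class accuracy; the helper name `accuracy_report` is hypothetical and the labels are made up for illustration, so the numbers it produces are not the paper's results.

```python
from collections import Counter


def accuracy_report(true_labels, predicted_labels):
    """Return overall accuracy and per-class accuracy as floats."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total_per_class = Counter(true_labels)
    correct_per_class = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    overall = sum(correct_per_class.values()) / len(true_labels)
    per_class = {c: correct_per_class[c] / n for c, n in total_per_class.items()}
    return overall, per_class


# Made-up labels for illustration only; these are not the paper's results.
truth = ["Military Ship", "Military Ship", "Cargo", "Cargo",
         "Fishing Boat", "Fishing Boat"]
preds = ["Military Ship", "Cargo", "Cargo", "Cargo",
         "Fishing Boat", "Cargo"]
overall, per_class = accuracy_report(truth, preds)
```

Reporting per-class accuracy alongside the overall figure helps reveal whether one category, such as Military Ship, is being confused with another more often than the aggregate number suggests.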

Conclusions
In this study, a deep-learning-based edge computing device was implemented for classifying maritime vehicles, showing that edge computing with AI technology can be realistically applied in practice. A maritime vehicle classification dataset was created from publicly available Internet images, and data augmentation was used in training the deep learning classification models. Finally, the edge device was implemented to classify different maritime vessels.
Experimental results show that the maritime vehicle classification device implemented in this study could not only classify Japanese and South Korean capital warships but also distinguish among the common vessel types at sea: Military Ship, Cargo, and Fishing Boat. In future research, it is hoped that increasing the number of original images and further optimizing the model architecture and parameters will yield higher accuracy and strengthen the practical application of the proposed method and device at sea.