Orientation Detection Using a CNN Designed by Transfer Learning of AlexNet

a Department of Mechanical Engineering, Faculty of Engineering, Sanyo-Onoda City University, Sanyo-Onoda 756-0884, Japan
b Graduate School of Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
c Mechanical Engineering Department, School of Sciences and Engineering, The American University in Cairo, AUC Avenue, P.O. Box 74, New Cairo 11835, Egypt
*Corresponding author: nagata@rs.socu.ac.jp


Introduction
An artificial neural network (ANN) with four or more layers is called a deep neural network (DNN) and is recognized as a promising machine learning technique. The convolutional neural network (CNN) is the most widely used and powerful structure for image recognition. It is also known that the support vector machine (SVM) has a superior ability for binary classification despite having only two layers. Nagi et al. designed max-pooling convolutional neural networks (MPCNN) for vision-based hand gesture recognition (1). The MPCNN could classify six kinds of gestures with 96% accuracy and allowed mobile robots to perform real-time gesture recognition. Weimer et al. designed deep CNN architectures for automated feature extraction in industrial inspection processes (2). Their CNN automatically generates features from a massive amount of training image data and demonstrates excellent defect detection results with low false alarm rates. Faghih-Roohi et al. presented a different type of deep CNN for automatic detection of rail surface defects (3). They concluded that the large CNN model achieved better classification results than the small and medium CNNs, although its training required a longer time. Zhou et al. used a CNN to classify the surface defects of steel sheets (4). The CNN could directly learn better representative features from labeled images of surface defects. Further, Ferguson et al. presented a system to identify casting defects in X-ray images based on the Mask Region-based CNN architecture (5,6). The proposed system is reported to perform defect detection and segmentation simultaneously on input images, making it suitable for a range of defect detection tasks.
We have developed a CNN&SVM design and training tool for defect detection of resin molded articles, and its effectiveness and validity have been demonstrated through the design, training and evaluation of several CNNs (7,8,9). The tool also makes it easy to design a CNN model based on the concept of transfer learning. When industrial robots are applied to pick and place tasks of resin molded articles, information on each object's position and orientation is essential. Recognizing and extracting the position of an object in an image is not so difficult with image processing techniques; recognizing its orientation, however, is not easy because of the variety of shapes. In this paper, a CNN acquired by transfer learning of AlexNet, the winner of ImageNet LSVRC2012, is introduced to recognize the orientation of objects in images. The effectiveness of the CNN is evaluated using a test image data set of thin resin molded articles. Figure 1 shows the main dialogue of the developed CNN&SVM design tool. For training a CNN, either pre-training from randomly initialized weights or additional (successive) training from previously trained weights can be selected. For an SVM, either one-class unsupervised learning or two-class supervised learning can be executed. The CNN used as a feature extractor and the kernel function can also be selected.

Design Tool for CNN and SVM
The tool has another promising function for designing original CNNs based on transfer learning. For example, the following main items can be set through the dialogue for transfer learning.
(1) Folders for training and test images.
(3) Learning parameters such as max epochs, mini-batch size, desired accuracy and loss, and learning rates for the convolution layers and fully connected layers. The software shown in Fig. 1
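The settings listed above can be pictured as a single configuration record. The following is only a minimal sketch in Python, not the tool's actual implementation: the class name, field names, example values and the stopping rule are all assumptions made for illustration. In particular, the text does not say how the desired accuracy and loss are used, so treating them as early-stopping targets is a guess.

```python
from dataclasses import dataclass


@dataclass
class TransferLearningSettings:
    """Hypothetical container for the dialogue items; all names are assumed."""
    train_folder: str           # item (1): folder of training images
    test_folder: str            # item (1): folder of test images
    max_epochs: int             # item (3): upper bound on training epochs
    mini_batch_size: int        # item (3): images per mini-batch
    desired_accuracy: float     # item (3): target training accuracy (0..1)
    desired_loss: float         # item (3): target training loss
    conv_learning_rate: float   # item (3): rate for convolution layers
    fc_learning_rate: float     # item (3): rate for fully connected layers


def training_should_stop(settings, epoch, accuracy, loss):
    """Assumed rule: stop at max epochs, or once both targets are reached."""
    if epoch >= settings.max_epochs:
        return True
    return accuracy >= settings.desired_accuracy and loss <= settings.desired_loss


# Illustrative values only; none of these numbers come from the paper.
example = TransferLearningSettings(
    train_folder="train_images",
    test_folder="test_images",
    max_epochs=100,
    mini_batch_size=32,
    desired_accuracy=0.99,
    desired_loss=0.05,
    conv_learning_rate=1e-4,
    fc_learning_rate=1e-3,
)
```

Keeping separate learning rates for the convolution layers and the fully connected layers reflects a common transfer learning choice: the pretrained convolution layers are adjusted gently while the replaced fully connected layers are trained more aggressively.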

Images for Training and Test
A training image generator was previously proposed to efficiently augment a limited number of training images (7) . By     orientation of objects.
After the training, the generalization ability of the transfer-learning-based CNN is checked using 15 test images imitating resin molded articles, none of which were included in the training data set. Figure 7 shows the photos and their classification results, i.e., the angle shown in each JPEG image is the output of the CNN. The results show that the obtained CNN has a promising ability to recognize the orientations of objects in images. However, some undesirable results are also observed. As can be clearly seen, some images in Fig. 7 are not square, so the main cause of these results seems to be the conversion of resolution before classification. The resolution of images given to the input layer has to be converted to a fixed 227×227×3 to match the input layer of AlexNet, which introduces undesirable deformation of the images and resultant ambiguities in classification.
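The effect of this resolution conversion on orientation can be illustrated with a small geometric calculation. The sketch below assumes the conversion is a plain stretch of the whole image to 227×227 without preserving the aspect ratio; the function name and the example image size are hypothetical and are not taken from the paper.

```python
import math

ALEXNET_SIDE = 227  # input side length stated in the text (227x227x3)


def apparent_angle_after_resize(angle_deg, width, height, side=ALEXNET_SIDE):
    """Angle (degrees) of a line after a width x height image is stretched
    to side x side without preserving the aspect ratio."""
    sx = side / width    # horizontal scale factor
    sy = side / height   # vertical scale factor
    theta = math.radians(angle_deg)
    # The direction vector (cos, sin) is scaled independently along each axis.
    return math.degrees(math.atan2(sy * math.sin(theta), sx * math.cos(theta)))


# A square image keeps the angle; a 400x200 image does not.
square = apparent_angle_after_resize(45.0, 227, 227)   # 45 degrees
wide = apparent_angle_after_resize(45.0, 400, 200)     # about 63.4 degrees
```

Under this assumption, an object lying at 45° in a 400×200 image appears at roughly 63.4° once the image is stretched to a square, which is consistent with the ambiguities reported for the non-square test images.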

Conclusions
In this paper, a CNN acquired by transfer learning of AlexNet, the winner of ImageNet LSVRC2012, is introduced to recognize the orientations of objects. The effectiveness and promise of the CNN are evaluated using test images imitating thin resin molded articles. In future work, we plan to apply a small robot incorporating this CNN to an actual production line of resin molded articles with various shapes, because orientation information is essential for the robot to successfully perform pick and place tasks.