Rice Disease Identification System Using Lightweight MobileNetV2

Rice is one of the main food crops in China, and rice diseases have become an important cause of food production losses. Traditional manual identification of rice diseases is time-consuming and labor-intensive. Machine learning algorithms have alleviated this problem and have been applied in smart agriculture. Convolutional neural networks (CNNs) in deep learning perform well on rice disease recognition owing to their ability to extract features automatically. Targeting five major rice diseases, namely sheath blight, rice blast, bacterial leaf blight, rice smut, and brown spot, this paper proposes a rice disease identification system using a lightweight MobileNetV2, with identification results uploaded and saved to a cloud database. Starting from the lightweight MobileNetV2 model, the system applies channel pruning to compress the model further. Compared with the original model, memory usage is reduced by 74%, the number of floating-point operations (FLOPs) by 49%, and the number of parameters by 50%, while the accuracy of rice disease identification increases by 0.16% to 90.84%.


Introduction
Smart agriculture is an advanced stage of agricultural development with smart production at its core: refined, intelligent, intensive, and scientific. It is a new model for agricultural modernization that improves the quality and efficiency of agricultural production. Rice is one of the main food crops in China and is planted over a wide area. Affected by environmental conditions and cultivation techniques, rice diseases and insect pests have become increasingly serious, the damage they cause has grown, and rice yield losses have risen (1).
Experts estimate that more than 1,600 kinds of diseases, pests, weeds, and rodents affect crops in China throughout the year. Among them, more than 100 can cause serious damage, and 14 million tons of food are lost to diseases and insect pests each year. In view of these circumstances, the state has given increasing policy support to the field of smart agriculture. Smart agriculture integrates advanced methods such as Internet of Things (IoT) technology and image processing technology to intelligently monitor the production environment and growth status of crops (2).
Remote monitoring and automatic disease identification can help prevent rice diseases from occurring on a large scale, laying a foundation for reducing the food losses they cause. This is of great significance to China's goals of stabilizing and increasing grain output and ensuring food security. Compared with traditional visual recognition of crop diseases and continuous manual monitoring of crop growth, the application of advanced technologies such as machine learning has obvious advantages: it saves a great deal of labor and greatly improves agricultural production efficiency. In recent years, convolutional neural networks (CNNs) have been applied to many tasks thanks to their powerful feature extraction capabilities (3), achieving significant improvements in accuracy. However, conventional CNNs often require a huge number of floating-point operations and a large amount of storage to achieve satisfactory accuracy, placing ever-higher hardware requirements on terminal equipment (4) and relying on the powerful computing capabilities of GPUs for model training and inference. Therefore, exploiting the redundancy in CNN structure and parameters to compress the model, obtaining fewer parameters and a more streamlined structure without affecting automatic identification of rice diseases, has become the key to reducing food loss (6). Traditional image-based recognition methods also suffer from background interference and low detection accuracy (10). CNNs can automatically extract features, but their application in rice disease recognition remains relatively scarce. The current mainstream model compression methods are as follows: parameter pruning, parameter quantization, low-rank decomposition, parameter sharing, compact networks, and knowledge distillation (11). Lebedev et al.
used the OBD algorithm, treating the convolution operation as a matrix multiplication, made the convolution kernels sparse in a group-wise manner, and turned the operation into a sparse matrix multiplication; this method improved calculation speed (12).
Courbariaux et al. first proposed the binarized neural network (BNN), which quantizes weights and activation values to ±1 to compress the network through binarization (13). Wen et al. proposed Force regularization, which coordinates more weight information into a low-rank space to compress the network (14). Howard et al.
proposed MobileNet, which splits ordinary convolution into depth-wise convolution (DWC) and point-wise convolution (PWC), reducing the number of multiplications and thus the amount of model computation (15). Buciluǎ et al.
proposed knowledge distillation, training a compressed model to copy the output of a strong classifier using pseudo-labeled data (16). Polino et al.
proposed a quantized training method that adds a knowledge distillation loss. It maintains both a floating-point model and a quantized model: the quantized model computes the forward loss, the gradient is used to update the floating-point model, and before each forward pass the quantized model is refreshed from the updated floating-point model (17). Quantizing a model to a special bit width requires designing a dedicated system architecture and is not very flexible. Low-rank decomposition operations are costly. Knowledge distillation, compact networks, and similar approaches require building new small networks, which is cumbersome. Li et al. proposed calculating the L1 norm of each convolution kernel, filtering out the feature maps corresponding to kernels with smaller L1 norms, and then retraining the pruned network (18). Current pruning methods are mostly applied to heavyweight models such as VGG and ResNet, and rarely to MobileNet, ShuffleNet, and other lightweight networks. Addressing the deficiencies of the above methods, this paper proposes a rice disease recognition system using the lightweight CNN MobileNetV2 to help prevent the large-scale occurrence of rice diseases. The system adopts channel pruning to further compress the model, reducing computation and facilitating lightweight deployment.

Proposed Work
The rice disease identification system proposed in this paper is completed in three main steps: dataset production, MobileNetV2 model training and compression, and model deployment. The specific process is shown in Figure 1.

Rice disease dataset
After reading a large number of related agricultural publications, this paper selects five common rice diseases as target categories, including three major diseases of rice (sheath blight, rice blast, and bacterial leaf blight) as well as rice smut and brown spot. As is well known, CNNs can automatically extract features: training a CNN adjusts its parameters to find a point of low model loss so that the network maps inputs (such as images) to outputs (such as labels). The number of network parameters is proportional to the complexity of the task, so training a CNN requires a large amount of sample data; the more samples, the better the trained model performs and the stronger its generalization ability. Through field photography and data collection, we found that rice disease image data is scarce. When the dataset is small, an over-parameterized model will fit every peculiarity of the dataset rather than the commonality between samples. To prevent the model from overfitting, accelerate its convergence, and enhance its robustness, this paper first preprocesses the rice disease data with augmentation operations such as random cropping, random rotation, and horizontal flipping, and then normalizes the entire dataset.
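The preprocessing described above can be sketched as follows. The crop margin, mean, and standard deviation are illustrative placeholders rather than the paper's actual settings, and rotation is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rng, margin=4):
    """Random horizontal flip plus random crop (rotation omitted for brevity)."""
    if rng.random() < 0.5:
        img = img[:, ::-1, :]               # flip left-right
    h, w, _ = img.shape
    ch, cw = h - margin, w - margin         # illustrative crop size
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return img[top:top + ch, left:left + cw, :]

def normalize(img, mean=0.5, std=0.25):
    """Normalization applied uniformly to the whole dataset."""
    return (img - mean) / std

# Stand-in for one rice-leaf image (values in [0, 1)).
img = rng.random((64, 64, 3)).astype(np.float32)
out = normalize(augment(img, rng))
print(out.shape)  # (60, 60, 3)
```

In practice these operations would typically be composed into a framework transform pipeline so that a fresh random augmentation is drawn every epoch.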

MobileNet model
MobileNet replaces standard convolution with depthwise separable convolution (DSC), which factorizes it into a DWC followed by a PWC. Compared with traditional convolution with a 3x3 kernel, DSC reduces the amount of computation by roughly 8 times. MobileNetV2 is an improvement on MobileNetV1; the difference between the two is shown in Figure 4. In MobileNetV2, PWC is specifically used to increase the dimension: a new PWC is added before the DWC with an expansion coefficient t = 6, after which convolution and dimensionality reduction are performed (19) to address this limitation.
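The roughly 8x saving can be checked with a quick multiplication count using the standard formulas (kernel size Dk, input channels M, output channels N, output feature-map side DF); the layer sizes below are arbitrary examples, not a specific MobileNet layer.

```python
def conv_mults(dk, m, n, df):
    """Multiplications in a standard dk x dk convolution: dk^2 * M * N * DF^2."""
    return dk * dk * m * n * df * df

def dsc_mults(dk, m, n, df):
    """Depthwise (dk^2 * M * DF^2) plus pointwise (M * N * DF^2) multiplications."""
    return dk * dk * m * df * df + m * n * df * df

# Arbitrary example layer: 3x3 kernel, 64 -> 128 channels, 56x56 output map.
std = conv_mults(3, 64, 128, 56)
dsc = dsc_mults(3, 64, 128, 56)
print(round(std / dsc, 2))  # the ratio approaches 9x as N grows
```

The reduction factor is 1 / (1/N + 1/Dk^2), so with a 3x3 kernel it tends toward 9x for wide layers, matching the roughly 8x figure cited above for typical channel counts.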

Model training
Nowadays, CNNs such as AlexNet, VGG, and ResNet are increasingly popular. Although these networks are effective for recognition tasks, they have large numbers of parameters, heavy computation, and a large memory footprint, and are not suitable for running on mobile terminals and embedded devices. This paper trains MobileNetV2 on the rice disease dataset and saves the best model. Fine-grained (e.g., weight-level) sparsity provides the highest flexibility and generality and leads to higher compression ratios, but it usually requires special software or hardware accelerators for fast inference on sparse models. Conversely, coarse-grained (e.g., layer-level) sparsity needs no special packages to obtain inference acceleration, but it is less flexible, since entire layers must be trimmed. Channel-level sparsity provides a good compromise between flexibility and ease of implementation, so this paper uses channel pruning for model compression. This paper uses L1 regularization to achieve sparse connections. L1 regularization adds a norm penalty to the original loss function, where the L1 norm is the sum of the absolute values of the parameters. Sparsity is equivalent to feature selection for the model, leaving only the more important features. The L1-regularized loss is

L(X, y; θ) = E(X, y; θ) + λ‖θ‖₁

where X is the input data, y is the label, L is the total loss function, E is the empirical loss, θ denotes the parameters of the model, and λ is the hyperparameter used to adjust the relative contribution of the norm penalty and the empirical loss. With many candidate features, selection is obviously difficult; but if the fitted model is sparse, only a few features contribute to it and most contribute little, so we can focus only on the features with non-zero coefficients.
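As a minimal illustration of the penalized objective above (the loss values and weights are toy numbers, not the paper's):

```python
def l1_penalized_loss(empirical_loss, params, lam):
    """Total loss L = E + lambda * ||theta||_1, following the formula above."""
    return empirical_loss + lam * sum(abs(p) for p in params)

# The penalty grows linearly with weight magnitude, so gradient descent
# pushes unimportant weights toward exactly zero, producing sparsity.
weights = [0.8, -0.05, 0.0, 1.2]
total = l1_penalized_loss(0.3, weights, lam=0.01)
print(total)
```

In the actual sparsity training, the penalized parameters are not all weights but the per-channel BN scaling factors, as described next.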
L1 regularization helps generate a sparse weight matrix and can be used for feature selection. Channel pruning requires an evaluation criterion for parameter importance, and the key lies in the choice of scaling factor, whose role is to select channels. If no batch normalization (BN) layer is used and a scaling layer is added after the convolutional layer, the value of the scaling factor is meaningless for evaluating channel importance, because the convolutional layer and the scaling layer together form a single linear transformation. If a scaling layer is added before the BN layer, its effect is completely masked by the BN layer. If a scaling layer is added after the BN layer, each channel ends up with two consecutive scaling factors. Therefore, this paper directly uses the scaling factor γ in the BN layer to evaluate the contribution of the previous layer's output (the next layer's input): the smaller γ, the less important the corresponding channel, and the more safely it can be cut out (20).
The BN layer transform is

y = γ · (x − μ) / √(σ² + ε) + β

where x is the input of the layer, y is the output, μ and σ² are the mini-batch mean and variance, ε is a small constant for numerical stability, and γ and β are the learnable parameters. According to the set channel pruning ratio, the number of channels to be clipped is calculated; the scaling factors are then sorted by magnitude to determine which channel indices in each layer should be pruned.
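The global ranking-and-threshold step can be sketched in plain Python; the layer names and γ values below are made up for illustration. Note that any γ tied with the threshold value is also included in the prune set.

```python
def channels_to_prune(gammas_per_layer, ratio):
    """Rank all BN scaling factors |gamma| across layers, take the smallest
    `ratio` fraction as the prune set, and return per-layer index lists."""
    all_g = sorted(abs(g) for gs in gammas_per_layer.values() for g in gs)
    k = int(len(all_g) * ratio)          # number of channels to clip
    if k == 0:
        return {layer: [] for layer in gammas_per_layer}
    threshold = all_g[k - 1]             # k-th smallest |gamma|
    return {
        layer: [i for i, g in enumerate(gs) if abs(g) <= threshold]
        for layer, gs in gammas_per_layer.items()
    }

# Hypothetical gammas for two BN layers, pruning ratio 0.5.
gammas = {"bn1": [0.9, 0.01, 0.5, 0.02], "bn2": [0.03, 0.7]}
pruned = channels_to_prune(gammas, ratio=0.5)
print(pruned)  # {'bn1': [1, 3], 'bn2': [0]}
```

In the real pipeline, each returned index list is then used to slice the corresponding convolution weights, BN parameters, and the following layer's input channels.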
After cutting the channels with small scaling factors, a compact CNN model with fewer parameters and a small runtime memory footprint is obtained. The essence of channel pruning is to cut all the input and output connections related to a channel, so some feature values are inevitably lost, which causes the model's accuracy to drop to some extent. Therefore, this paper fine-tunes the pruned model on the rice disease dataset to recover its accuracy.

Model deployment
The fine-tuned lightweight model is deployed on a high-performance computer to realize real-time capture and identification of the five major rice diseases. The system interface displays the real-time images taken by the camera and the rice disease identification results, and outputs parameter information about the system's operation. At the same time, the rice disease identification results are uploaded and saved to a cloud database so that researchers can view historical data.

Experiment And Discussion
To verify the feasibility of the system, extensive experiments and evaluations were carried out. The experiments used a computer with an Intel Core i7-9750H processor, an NVIDIA GTX 1660 Ti graphics card, 16 GB of memory, and a 512 GB hard disk, running Windows 10; the framework is PyTorch and the implementation language is Python.

Model training
After data augmentation, normalization, and other preprocessing operations, 12,612 labeled sample images were obtained. Figure 5 shows the established rice disease dataset and some samples. In this paper, 30% of the dataset is held out as the test set and the remaining 70% is the training set. The original MobileNetV2 model is trained on this data and the best model is saved; its test accuracy reaches 90.68%.
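A simple random 70/30 split over the 12,612 samples might look like this (the seed is arbitrary; the paper does not specify its splitting procedure):

```python
import random

def split_dataset(samples, test_frac=0.3, seed=0):
    """Shuffle and split into (train, test) as described above."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_dataset(list(range(12612)))
print(len(train), len(test))  # 8829 3783
```

Splitting after shuffling keeps the class mixture roughly balanced between the two sets; a stratified split per disease class would make this exact.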

Channel pruning
In this paper, the channel pruning ratio is increased from 0.1 to 0.9 in steps of 0.1. At each ratio, the accuracy of the model after pruning only, and after pruning plus fine-tuning, is shown in Figure 6.
As the channel pruning ratio increases, the accuracy of the pruned MobileNetV2 model keeps decreasing, and from 0.5 onward the rate of decrease grows. The figure shows that for pruning ratios of 0.1 to 0.5, fine-tuning raises the model's accuracy significantly, and the best fine-tuned model even exceeds the accuracy of the original model. When the pruning ratio is too large, fine-tuning does not improve accuracy much, because too many features are lost and the damage to the model cannot be recovered. Based on these experiments, this paper chooses a pruning ratio of 0.5, pruning the channels with small scaling factors. The number of channels in each layer of MobileNetV2 before and after pruning is shown in Figure 7.
After pruning, the model is fine-tuned and the best model is saved. Table 1 compares the memory footprint, number of floating-point operations (FLOPs), and parameter count of the MobileNetV2 model before and after channel pruning. Channel pruning reduces the model's memory usage by 74%, its FLOPs by 49%, and its parameter count by 50%. A model with fewer parameters and a more streamlined structure is obtained, which promotes mobile-terminal deployment of the MobileNetV2 model.

System Architecture
In this paper, the lightweight MobileNetV2 model is deployed on a computer with an NVIDIA GTX 1660 Ti graphics card, and a rice disease identification system interface is designed. As shown in Figure 8, the image captured by the camera and the rice disease identification result are displayed in real time, and parameter information about the system's operation, including the current frame rate, image size, running time, and file save path, is also output on the interface.
As shown in Figure 9, this paper uses SQL Server to design a database to save the rice disease identification results. The database stores the system's running time, the names of the rice disease images, and the recognition results, making it convenient for researchers to view and analyze historical data.
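A minimal sketch of such a records table, using Python's built-in sqlite3 as a stand-in for SQL Server (the table and column names are illustrative, not the paper's actual schema; a production system would use a SQL Server driver such as pyodbc):

```python
import sqlite3

# In-memory database standing in for the cloud-hosted SQL Server instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE disease_records (
           run_time   TEXT,   -- when the system produced the result
           image_name TEXT,   -- captured rice disease image file
           result     TEXT    -- predicted disease class
       )"""
)
conn.execute(
    "INSERT INTO disease_records VALUES (?, ?, ?)",
    ("2021-05-01 10:32:11", "frame_0001.jpg", "rice blast"),
)
rows = conn.execute(
    "SELECT image_name, result FROM disease_records"
).fetchall()
print(rows)  # [('frame_0001.jpg', 'rice blast')]
```

Storing one row per recognition event is what lets researchers later query historical data by time range or by disease class.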

Conclusions
This paper proposes a rice disease recognition system using a lightweight MobileNetV2. A dataset containing five main rice diseases is produced, and the MobileNetV2 model trained on it is compressed by channel pruning. Compared with the original model, memory usage is reduced by 74%, FLOPs by 49%, and the parameter count by 50%, while identification accuracy increases by 0.16% to 90.84%, and the identification results are uploaded and saved to a cloud database.

Future Work
This paper realizes a lightweight MobileNetV2 and uses the model to identify rice diseases, which benefits mobile-terminal deployment of MobileNetV2 and the prevention of large-scale rice disease outbreaks. The next step is to expand and improve the rice disease dataset and continue to raise the model's accuracy. In addition, other model compression methods will be combined to further streamline the model without affecting task completion, speed up model inference, and optimize deployment on mobile terminals.