A Fruit Sensing and Classification System by Fractional Fourier Entropy and Improved Hybrid Genetic Algorithm

It remains challenging to classify different categories of fruit because of the similarities in shape, color, and texture among them. We present a novel computer-vision approach to classify fruits accurately and efficiently. Coefficients were obtained by the fractional Fourier transform, and the entropies extracted from these coefficients were fed into the classifier as features. A multilayer perceptron optimized by an improved hybrid genetic algorithm served as the classifier. Experimental results on 1653 fruit images demonstrated that the proposed method achieved an overall accuracy of 89.59%, which was superior to state-of-the-art approaches. Our method is effective in identifying fruit categories.


Introduction
Automatic fruit classification can help in factory production, supermarket selling, fruit-picking robots, etc. Nevertheless, there is no practical method for automatic fruit classification. Pennington (2009) [1] employed a clustering algorithm to classify fruits and vegetables. Pholpho (2011) [2] utilized visible spectroscopy to classify non-bruised and bruised longan fruits; the classification models combined principal component analysis (PCA), partial least squares discriminant analysis, and soft independent modeling of class analogy. Yang (2012) [3] applied multispectral imaging analysis to a blueberry yield estimation system. Wu (2012) [4] suggested the max-wins-voting SVM with Gaussian RBF kernel to classify different categories of fruits, with an overall accuracy of 88.2%. Feng (2013) [5] utilized Raman spectroscopy, a rapid and non-destructive tool, and chose polynomial fitting for baseline correction; afterwards, PCA and hierarchical cluster analysis (HCA) were employed to classify eight different citrus fruits. Cano Marchal (2013) [6] established an expert system based on computer vision to estimate the content of impurities in olive oil samples. Breijo (2013) [7] used an odor sampling system (electronic nose) to classify the aroma of Diospyros kaki. Fan (2013) [8] used an artificial neural network to predict texture characteristics from extrusion food surface images. Omid (2013) [9] developed an intelligent system based on combined fuzzy logic and machine vision techniques for grading eggs using parameters such as defects and size. Ji (2014) [10] used a fitness-scaling chaotic ABC (FSCABC) algorithm to develop an automatic fruit classification system that can identify 18 kinds of fruits. Ji (2015) [11] presented a novel fruit classification system based on wavelet entropy.
The aim of this study was to present a new approach for fruit classification. We extracted the fractional Fourier entropy (FRFE) from the fruit images to form the feature vector, which greatly reduced the feature space. An improved hybrid genetic algorithm was employed to optimize the multilayer perceptron (MLP), which was used as the classifier. With this proposed method, we achieved better results than existing methods.
The structure of the remainder is organized as follows: Section 2 describes the materials used in the experiment. Section 3 describes the fractional Fourier entropy. Section 4 presents the multilayer perceptron classifier. Section 5 shows the improved genetic algorithm, which is used to obtain the optimal parameters of the MLP. Section 6 discusses the results and the contribution of this study. Finally, Section 7 is devoted to conclusions.

Materials
The "fruit" dataset was acquired over 6 months, both by on-site collection with a digital camera (see Figure 1) and from a search engine (Google). The 18 types of fruits and their numbers can be found in the literature [12].

Fractional Fourier Entropy
The fractional Fourier entropy (FRFE) is a global feature of an image, proposed by Cattani (2016) [13]. It is obtained by performing the 2-D fractional Fourier transform (FRFT) over an image and extracting the Shannon entropy of the coefficients [14][15][16].
For a given function y(t), the FRFT with angle α, denoted γ_α, is defined by

\[
\gamma_\alpha(v) = \int_{-\infty}^{\infty} y(t)\, P_\alpha(t, v)\, \mathrm{d}t
\]

where v represents frequency and t denotes time. P_α is the transform kernel function [17]:

\[
P_\alpha(t, v) = \sqrt{1 - i \cot\alpha}\; \exp\!\left( i\pi \left( t^2 \cot\alpha - 2tv \csc\alpha + v^2 \cot\alpha \right) \right)
\]

where i denotes the imaginary unit. The 2D-FRFT can be easily implemented by applying the 1D-FRFT along each dimension in turn. Two angles, α and β, are needed for the 2D-FRFT, denoted γ_{α,β}.
Next, we calculate the entropy of the 2D-FRFT decomposition result. Suppose R is a discrete random variable taking values in the set (r_1, r_2, …, r_n) with probability mass function G(R). We have

\[
Z(R) = F\!\left[ -\ln G(R) \right]
\]

where F represents the expected value and Z is the entropy. Generally, we have

\[
Z(R) = -\sum_{j=1}^{n} G(r_j) \ln G(r_j)
\]

Therefore, the FRFE of a fruit image, denoted by E, is defined as

\[
E_{\alpha,\beta} = Z\!\left( \gamma_{\alpha,\beta}(\text{image}) \right)
\]
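The FRFE computation above can be illustrated with a short NumPy sketch. A general discrete FRFT implementation is beyond the scope of this sketch, so the ordinary 2-D Fourier transform (the special case α = β = 1 of γ_{α,β}) stands in for the general transform; base-2 entropy is used, which only rescales the feature by a constant.

```python
import numpy as np

def shannon_entropy(coeffs):
    """Shannon entropy of a coefficient matrix, treating the normalized
    squared magnitudes as a probability mass function."""
    p = np.abs(coeffs) ** 2
    p = p / p.sum()
    p = p[p > 0]                      # 0 * log 0 is taken as 0
    return -np.sum(p * np.log2(p))

def frfe(image, alpha=1.0, beta=1.0):
    """Fractional Fourier entropy of a 2-D image. Only the special case
    alpha = beta = 1 (the ordinary 2-D Fourier transform) is implemented;
    a general discrete FRFT would replace np.fft.fft2 here."""
    if alpha == 1.0 and beta == 1.0:
        coeffs = np.fft.fft2(image)
    else:
        raise NotImplementedError("general-angle 2D-FRFT not sketched here")
    return shannon_entropy(coeffs)

img = np.random.default_rng(0).random((64, 64))  # stand-in fruit image
e = frfe(img)
```

In the full method, one such entropy value is computed for each (α, β) pair, and the collected entropies form the feature vector.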

Classifier-Multilayer Perceptron
Multilayer perceptron (MLP) is a kind of feedforward neural network with one input layer, one output layer, and several hidden layers [18]. As the structure of an MLP is flexible, we need to define its architecture before training it for optimal weights/biases [19]. MLP can be used for classification and function approximation, and its performance can be superior to support vector machines if the architecture is defined properly.

Fig. 1 Architecture of an MLP (input layer, hidden layer, output layer)

Fig. 1 shows the structure of an MLP: it consists of one input layer, one hidden layer, and one output layer. The structure of an MLP depends on the given problem [20]. Generally, the number of input neurons equals the dimension of the feature vector, and the number of output nodes is set to the number of classes [21]. The weights/biases of the MLP are obtained by training [22]. Many optimization algorithms have been applied, such as back propagation (BP) and particle swarm optimization (PSO) [23].
However, these methods are prone to getting trapped in local extrema, so it is still difficult to achieve the optimal solution. Moreover, determining the number of hidden neurons is another challenge: so far, there is no general answer to the problem of defining the structure of the hidden layer.
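A forward pass of such a network can be sketched in NumPy. The layer sizes follow this paper (25 FRFE features in, 18 fruit classes out), while the sigmoid hidden activation, softmax output, and random weights are illustrative assumptions rather than the trained model.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer MLP:
    input -> sigmoid hidden layer -> softmax output."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # hidden activations
    z = W2 @ h + b2                            # output pre-activations
    z = z - z.max()                            # numerical stability
    return np.exp(z) / np.exp(z).sum()         # softmax class probabilities

rng = np.random.default_rng(1)
n_in, n_hidden, n_classes = 25, 10, 18   # 25 FRFE features, 18 fruit classes
W1, b1 = rng.standard_normal((n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.standard_normal((n_classes, n_hidden)), np.zeros(n_classes)
probs = mlp_forward(rng.random(n_in), W1, b1, W2, b2)
```

Training then amounts to searching for the W1, b1, W2, b2 (and, in this paper, the hidden-node count and feature subset) that maximize classification accuracy, which is the role of the genetic algorithm described next.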

Improved Genetic Algorithm
We employed an improved hybrid genetic algorithm to train the MLP and obtain the optimal weights/biases.

Standard Genetic Algorithm
Standard genetic algorithm (GA) is widely used for optimization. GA follows the criterion of survival of the fittest, mimicking the process of natural selection [24]. It iteratively searches through a population of candidate solutions to the problem, called individuals. Each candidate solution has its own chromosome, which is randomly initialized and can be mutated and altered during the iterations [25].
The population of individuals in each iteration is called a generation. The value of the objective function of each individual is regarded as its fitness [26][27][28]. During each generation, the fitness of each individual is calculated. The fitter individuals are selected to breed the next generation, while the remaining individuals are eliminated [29]. The new generation is formed by crossover and random mutation of the genomes of the fitter individuals. The algorithm terminates when a candidate solution satisfying minimum criteria is found or the fixed number of generations has been reached [30].
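The generation loop described above can be sketched as follows. The one-dimensional toy objective, truncation selection, averaging crossover, and Gaussian mutation are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(x):
    """Toy objective: maximize -(x - 3)^2, optimum at x = 3."""
    return -(x - 3.0) ** 2

def evolve(pop, n_gen=100, mut_rate=0.1):
    for _ in range(n_gen):
        order = np.argsort(fitness(pop))[::-1]     # fittest first
        parents = pop[order[: len(pop) // 2]]      # selection: keep top half
        # crossover: each child averages a random pair of parents
        pairs = rng.choice(parents, size=(len(pop) - len(parents), 2))
        children = pairs.mean(axis=1)
        # mutation: Gaussian perturbation applied with probability mut_rate
        mask = rng.random(children.shape) < mut_rate
        children = children + mask * rng.normal(0.0, 0.5, children.shape)
        pop = np.concatenate([parents, children])
    return pop[np.argmax(fitness(pop))]            # best individual found

best = evolve(rng.uniform(-10.0, 10.0, size=50))
```

Because the top half of each generation survives unchanged, the best solution found so far is never lost, and the population converges toward the optimum.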
However, GA has poor local-search ability, which means it cannot finely adjust candidate solutions near a promising region. Meanwhile, it is also often observed that the quality of the offspring worsens over the generations.

Improved Hybrid Genetic Algorithm
The improved hybrid genetic algorithm was proposed by Ahmad (2013) [31] to optimize the number of hidden nodes, the weights, and the feature subset of an MLP [32]. First, it encodes the candidate solution into a chromosome containing 3 gene segments in binary format, as shown in Fig. 2.
Here, nt denotes the total number of testing samples, nc the number of samples correctly classified, o the number of chosen features, p the number of hidden nodes, and q the number of output neurons. Segmented multi-chromosome crossover (SMCC) is used, which can produce a chromosome that contains genes from more than one pair of parent chromosomes. The basic steps of the improved hybrid genetic algorithm are illustrated in Table 1.
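Decoding such a three-segment chromosome can be sketched as follows. The segment lengths (25 feature bits, 5 bits for the hidden-node count, a 16-bit quantized weight segment) are hypothetical; the actual lengths depend on the architecture bounds and weight encoding, which are not specified here.

```python
import numpy as np

# Hypothetical segment sizes; the real ones depend on the MLP bounds chosen.
N_FEATURES = 25          # feature-subset segment: one bit per FRFE feature
HIDDEN_BITS = 5          # bits encoding the hidden-node count (1..32)
N_WEIGHT_BITS = 16       # bits of the (quantized) weight segment

def decode(chrom):
    """Split a binary chromosome into its three gene segments:
    feature-subset mask, hidden-node count, and quantized weight bits."""
    feats = chrom[:N_FEATURES].astype(bool)                 # feature mask
    hid_bits = chrom[N_FEATURES:N_FEATURES + HIDDEN_BITS]
    n_hidden = 1 + int("".join(map(str, hid_bits)), 2)      # 1..32 nodes
    weights = chrom[N_FEATURES + HIDDEN_BITS:]              # weight bits
    return feats, n_hidden, weights

rng = np.random.default_rng(7)
chrom = rng.integers(0, 2, N_FEATURES + HIDDEN_BITS + N_WEIGHT_BITS)
feats, n_hidden, weights = decode(chrom)
```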
The whole diagram of IHGA is shown in Fig. 3.
Table 1 Pseudocode of the improved hybrid genetic algorithm Improved Hybrid Genetic Algorithm Step 1: Randomly initialize the population chromosomes.
Step 2: Decode each chromosome into the hidden-node number, weights/biases, and feature subset.
Step 3: Build the MLP with the parameters obtained, and start training.
Step 4: Calculate the fitness of the MLP.
Step 5: Create the elite chromosome and SMCC chromosomes.
Step 6: Directly copy the best individual obtained to the next generation.
Step 7: Crossover and mutate the members of elite and normal pool to form the next generation.
Step 8: Repeat Steps 2 to 7 until the maximum generation is reached, then output the optimal solution.
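The SMCC operation used in Step 5 can be sketched as follows: each gene segment of the child chromosome is copied whole from a randomly chosen chromosome in the parent pool, so the child can inherit genes from more than one pair of chromosomes. The segment boundaries below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed segment boundaries (feature subset, hidden-node bits, weight bits).
SEGMENTS = [(0, 25), (25, 30), (30, 46)]

def smcc(pool):
    """Segmented multi-chromosome crossover (sketch): each segment of the
    child is copied whole from a randomly chosen parent chromosome."""
    child = np.empty_like(pool[0])
    for lo, hi in SEGMENTS:
        donor = pool[rng.integers(len(pool))]  # pick a donor per segment
        child[lo:hi] = donor[lo:hi]
    return child

pool = [rng.integers(0, 2, 46) for _ in range(4)]  # 4 parent chromosomes
child = smcc(pool)
```

Because each segment is kept intact, good partial solutions (for example, a useful feature subset) are not broken up by the crossover.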

Fig. 4 An apple image
We take an apple image as an example, as shown in Fig. 4. We take 25 different combinations of (α, β). The results are shown in Fig. 5. The results for other types of fruits are not displayed due to page limitations.
Those 25 FRFEs were submitted to the proposed classifier, IHGA-MLP. The IHGA iterates to update the hidden-neuron number, classifier weights, and selected features at every step. The parameters were set by experience, as listed in Table 2. Population represents the number of individuals, and Max-generation stands for the maximum number of generations. In every generation, the best 5 chromosomes are selected as elite chromosomes, while the remaining 15 are used as SMCC chromosomes according to their fitness values. The elite and normal parent pools both contain 9 chromosomes. We set the elitism to 2, which means 2 individuals are copied directly to the next generation. The mutation rate and crossover rate are 0.1 and 0.6, respectively. The parameter r in Equation (6) weighs the contributions of the testing accuracy and the inverse complexity of the MLP.

Comparison between GA and IHGA
In the third experiment, we employed FRFE to extract features, then used the standard GA to train the MLP and compared the results with IHGA. The results are listed in Table 3.
Here the GA obtains an accuracy of only 87.42%, while IHGA obtains 89.59%; IHGA outperforms GA. The reason lies in the chromosome, which consists of three segments: weights, hidden-neuron number, and feature subset. Meanwhile, the fitness takes both the testing accuracy and the complexity of the MLP structure into consideration. Moreover, the SMCC used in the experiment ensures that a produced chromosome can contain genes from more than one pair of parent chromosomes, which benefits the evolution process.

Classification Performance
The final classification performance is listed in Table 4, along with a comparison with state-of-the-art approaches. We see that (CH + MP + US) + PCA + kSVM [4] used 14 features and obtained an accuracy of 88.20%, and (CH + MP + US) + PCA + FSCABC-FNN [10] used 14 features and obtained an accuracy of 89.11%; WE + PCA + FSCABC-FNN [11] reached 89.47%. The results in Table 4 suggest that the proposed method was the only one that did not use PCA, an effective feature-reduction tool. Although 25 was the largest feature count among the listed approaches, it is not too many for modern computers, and the time for PCA was saved.
Meanwhile, the proposed method achieved the highest accuracy of 89.59%, with FSCABC-FNN and BBO-FNN coming second and third at the same accuracy of 89.47%, which indicates the superiority of FRFE + IHGA-MLP over the other approaches. The worst classifier was kSVM, with 88.20% accuracy; this is because the performance of FNN and MLP exceeds that of kSVM when large datasets are available [33,34].
The limitations of the proposed approach are that the weights/biases of the MLP are not interpretable for food-engineering experts, and that the classification performance could be further improved with a better fitness function.

Conclusion
In this study, we proposed a new approach for fruit classification, combining FRFE with an MLP optimized by an improved hybrid GA. The proposed method yielded an overall accuracy of 89.59% in identifying 1653 fruits of 18 different categories, which was higher than state-of-the-art methods.

Table 2
Parameter Setting

Table 3
Comparison between GA and IHGA