Evaluation of Learning Samples for Fish Recognition with Machine Learning

In recent years, invasive exotic fish have been destroying the ecosystems of native fish all over the world. Many approaches have been considered to prevent this, but no effective measure for eradicating exotic fish has been found so far. We therefore propose a new method based on image processing. The method aims to distinguish a specified exotic fish from the local native fish automatically by using an underwater camera. In this study, we perform fish recognition, that is, we determine whether or not a fish is present in an image. This process is an important preprocessing step for determining fish species automatically. Fish recognition is more difficult than human face recognition because a fish is not symmetric and has no obvious unique feature that can be used. In addition, images taken underwater can suffer from severe quality degradation. To evaluate how effective an existing algorithm is in this situation, we statistically examine which learning methods, applied to which kinds of learning samples made from fish body features, give good results.


Introduction
As described in the abstract, invasive exotic fish cause a variety of problems all over the world [1]. Examples include predation on native fish, genetic pollution through interbreeding, and the introduction of parasites and diseases that never existed in the local region. These problems adversely affect biodiversity, agriculture and forestry, fisheries, and human health. Many countries are in fact affected by such problems. In Japan, the black bass is the best-known exotic fish. It is very aggressive and eats a large amount of fish. The ecosystem of native fish therefore suffers serious damage if someone illegally releases black bass into a river or lake. This fish has had a variety of effects in Japan. Figure 1 shows the total catch of Lake Biwa. As can be seen, the catch is decreasing year by year, and this is said to be caused by black bass.
At present, the countermeasures against exotic fish are catching black bass by hand or draining the water from ponds and lakes. However, these methods require a lot of cost and personnel and can hardly be called efficient. A more efficient method is therefore needed. In the method we propose, an underwater camera is set up in a pond or lake, and the proposed algorithm discriminates whether a fish is an exotic fish or a native fish. If a fish passing in front of the camera is an exotic fish, it can be removed automatically by an underwater robot.
Object recognition technology is progressing day by day. Human face detection is one very successful example, and cameras equipped with face detection have been sold all over the world. However, recognizing a fish is more difficult than recognizing a human face, because a human face is characterized by symmetry and its geometric structure does not change much. For objects without symmetry, it is difficult to extract feature points. In this study, we improve the accuracy of detecting non-symmetric objects by attaching conditions. Through this study, we investigate a new method for discriminating exotic fish.

Method
In this research, our goal is to select a suitable area of the fish body image as the learning pattern. To perform this evaluation, we use an existing machine learning algorithm called AdaBoost, which is described in the following.

Flow of object recognition
Object recognition consists of two phases, learning and recognition. The flow of AdaBoost is shown in Fig. 2.
In the learning phase, a large number of positive and negative images must be collected in order to construct the learning result data. These images are then fed into the learning program to extract features; Haar-Like features are used here. Next, learning is performed on these features with an iterative procedure called AdaBoost. Finally, the learning result data is obtained.
In the recognition phase, features are first extracted from the input image. Second, these features are matched against the learning result data. Finally, the result is output. This study focuses on the learning phase and on improving its accuracy. The details are explained below.

Learning image
There are two types of learning images. One is the positive image, which contains the target object. The other is the negative image, which must not contain the target object. Negative images can be background images or images of any non-target objects.

Haar-Like features
Haar-Like features are scalar quantities obtained as the difference between the average brightness values of rectangular areas. This value represents the intensity of the brightness gradient. It does not depend on the absolute brightness value and captures features that correspond to texture. Haar-Like features are shown in Fig. 3. The difference between the luminance values of the white and black parts is defined as the feature, as shown in Eq. (1).
feature value = (average brightness of $r_1$) - (average brightness of $r_2$)   (1)
where $r_1$ denotes the white areas and $r_2$ denotes the black areas.
Haar-Like features have three feature patterns: edge features, line features, and center-surround features. Fine features can be extracted by making full use of them. The Haar-Like feature patterns are shown in Fig. 4.
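As a minimal sketch of how such a feature value can be computed, the following Python code evaluates a two-rectangle edge feature as the difference between the mean brightness of a white rectangle and that of a black rectangle, using an integral image for the rectangle sums. The function names (`integral_image`, `rect_mean`, `edge_feature`) and the layout with the white area on the left and the black area on the right are illustrative assumptions, not the implementation used in this study.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return ii

def rect_mean(ii, top, left, height, width):
    """Mean brightness of a rectangle, read from the integral image."""
    total = (ii[top + height, left + width] - ii[top, left + width]
             - ii[top + height, left] + ii[top, left])
    return total / (height * width)

def edge_feature(img, top, left, height, width):
    """Edge feature: mean of the white (left) half minus mean of the
    black (right) half. `width` is the total width and should be even.
    The left/right layout is an assumption made for this sketch."""
    ii = integral_image(img)
    half = width // 2
    white = rect_mean(ii, top, left, height, half)
    black = rect_mean(ii, top, left + half, height, half)
    return white - black

# Hypothetical 4x4 patch: bright left half, dark right half.
patch = np.zeros((4, 4))
patch[:, :2] = 200.0
patch[:, 2:] = 50.0
print(edge_feature(patch, 0, 0, 4, 4))
```

Adding a constant to every pixel of `patch` leaves the result unchanged, which illustrates why the feature does not depend on the absolute brightness value.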

AdaBoost [5]
To distinguish fish from other objects using these Haar-Like features, the appropriate Haar-Like features must be selected from all of the candidates. AdaBoost is used for this purpose. It is an algorithm that builds a strong classifier by combining weak classifiers, each of which has low individual capability. Applying all of the Haar-Like patterns at random to make weak classifiers would take a lot of time. With this algorithm, however, the meaningful Haar-Like pattern candidates can be determined, so building weak classifiers from these candidates saves a lot of time.
(a) Pretreatment: First, prepare N learning samples $(x_1, y_1), \ldots, (x_N, y_N)$, where $x_i$ is an image and $y_i$ is its class label. The class label indicates whether the image is a correct (positive) example or not. For example, in human detection, a human image is given the label +1 and a non-human image is given the label -1.
(b) Initialization of the weights of the learning samples: The initial weights $D_1(i)$ of the samples are set equal, as shown in Eq. (2).
$D_1(i) = \frac{1}{N}$   (2)
(f) Update of the weights of the learning samples: Eq. (6) shows the update of the weights $D_t(i)$ of the learning samples.
$D_{t+1}(i) = D_t(i) \exp(-\alpha_t y_i h_t(x_i))$   (6)
(g) Normalization of the weights of the learning samples: The weights of the learning samples are normalized as shown in Eq. (7).
$D_{t+1}(i) = \frac{D_{t+1}(i)}{\sum_{j=1}^{N} D_{t+1}(j)}$   (7)
(h) Construction of the final classifier: Give a weight to every weak classifier and take a weighted majority vote, as shown in Eq. (8). For example, in human detection, the input is judged to be human if H is higher than the threshold $\lambda$.
$H(x) = \sum_{t} \alpha_t h_t(x)$   (8)
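Steps (f) to (h) can be sketched in Python as follows. This is a minimal illustration rather than the code used in this study: the function names `update_weights` and `strong_classify`, the default threshold `lam=0.0`, and the toy labels are assumptions, and the classifier weight is computed with the standard discrete AdaBoost formula, which is also an assumption here.

```python
import numpy as np

def update_weights(D, alpha, y, h):
    """Step (f), Eq. (6), followed by step (g), Eq. (7).
    D: sample weights, y: labels (+1/-1), h: weak classifier outputs (+1/-1)."""
    D_new = D * np.exp(-alpha * y * h)   # misclassified samples gain weight
    return D_new / D_new.sum()           # normalize so the weights sum to 1

def strong_classify(alphas, weak_outputs, lam=0.0):
    """Step (h): weighted vote of the weak classifiers.
    weak_outputs has shape (T, N): one row per weak classifier.
    lam is the decision threshold; 0.0 is a hypothetical default."""
    score = np.dot(alphas, weak_outputs)
    return np.where(score > lam, 1, -1)

# Toy example: four samples, the second one is misclassified.
y = np.array([1, 1, -1, -1])
h = np.array([1, -1, -1, -1])
D = np.full(4, 0.25)
err = D[h != y].sum()                     # weighted error rate, 0.25 here
alpha = 0.5 * np.log((1.0 - err) / err)   # standard AdaBoost weight (assumption)
D2 = update_weights(D, alpha, y, h)
print(D2)
```

In this toy case the misclassified sample ends up with half of the total weight after normalization, so the next weak classifier is pushed to get that sample right.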

Experiment
We created several object classifiers of our own from different learning patterns, using the algorithm described above. We then used these classifiers to recognize fish and examined how the detection rate changes with the learning pattern. The detector is set to surround the object with a red frame when the detection result is positive. If the framed location actually contains a positive object, we treat it as a correct result; if not, we treat it as a false recognition.
First, we prepared 1,000 positive images and 1,000 negative images. Figures 6 and 7 show examples of the positive and negative images, respectively.
Fig. 6: Examples of the correct image
Fig. 7: Examples of the non-correct image
As mentioned above, detecting a deformable object is difficult because its features vary. We therefore selected five different kinds of body features as positive images to build the classifiers.
(1) Entire body pattern

Learning result of the fish's eye pattern
In this experiment, the eye was accurately recognized in only 1 of the 100 images. Figure 11 shows examples of correct and false recognition results.

Learning result of the fish's body surface pattern
In this experiment, the body surface was accurately recognized in 9 of the 100 images. Figure 12 shows examples of correct and false recognition results.

Conclusions
In this study, the recognition rate obtained with the entire-body learning data was the best, with an accuracy of 64%. Although this value is not high enough, the fish can be detected with high probability if it passes in front of the camera more than once.
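The effect of repeated passes can be illustrated with a short calculation. The sketch below assumes that each pass is an independent detection attempt with the same success probability of 0.64, which is a simplifying assumption and not something measured in this study; the function name `detection_probability` is hypothetical.

```python
def detection_probability(p_single, passes):
    """Probability of at least one detection in `passes` attempts,
    assuming the attempts are independent (a simplifying assumption)."""
    return 1.0 - (1.0 - p_single) ** passes

for k in (1, 2, 3):
    print(k, round(detection_probability(0.64, k), 4))
```

Under this assumption, the probability of detecting the fish at least once rises to about 0.87 after two passes and about 0.95 after three.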
Furthermore, we plan to make a hybrid judgment by combining this result with the other body feature detectors to improve reliability.
In this experiment, pattern learning was performed with 1,000 positive and 1,000 negative images. This is not considered a sufficient number for machine learning, so in future work we will collect more samples to obtain more accurate recognition results.

Fig. 2. Flow of image recognition
(c) Candidate weak classifiers: Candidate weak classifiers are basically constructed at random. However, we must choose classifiers whose error rate is 0.5 or less. Because this study uses Haar-Like features, we design the weak classifiers based on them, as shown in Eq. (3).
$h_t(x) = \begin{cases} 1 & \text{if } p \cdot f(x) > p \cdot \theta \\ -1 & \text{otherwise} \end{cases}$   (3)
$h_t$: weak classifier of round t; $f(x)$: feature; $\theta$: threshold; $p$: variable that sets the direction of the inequality (1 or -1)
(d) Calculation of the error rate: Calculate the error rate $\varepsilon_t$ of a weak classifier. If the response of the weak classifier $h_t$ differs from the class label, the weight of that sample is added, as shown in Eq. (4).
$\varepsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i)$   (4)
If the error rate exceeds 0.5, the parity p of the weak classifier is inverted, as shown in Fig. 5.
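Steps (c) and (d) can be sketched as follows, with a threshold-type weak classifier, its weighted error rate, and the parity inversion. The function names (`weak_classify`, `weighted_error`, `fit_parity`), the threshold value 0.5 used in the example, and the feature values are hypothetical and serve only to illustrate the procedure.

```python
import numpy as np

def weak_classify(f, theta, p):
    """Eq. (3): +1 where p*f > p*theta, otherwise -1."""
    return np.where(p * f > p * theta, 1, -1)

def weighted_error(f, y, D, theta, p):
    """Eq. (4): sum of the weights of the misclassified samples."""
    h = weak_classify(f, theta, p)
    return float(np.sum(D[h != y]))

def fit_parity(f, y, D, theta):
    """Start with p = 1 and invert the parity if the error exceeds 0.5.
    The error is recomputed after the flip instead of using 1 - error,
    because samples with f == theta are labeled -1 under both parities."""
    p = 1
    err = weighted_error(f, y, D, theta, p)
    if err > 0.5:
        p = -1
        err = weighted_error(f, y, D, theta, p)
    return p, err

# Hypothetical feature values: positives have small f, negatives large f.
f = np.array([0.1, 0.2, 0.8, 0.9])
y = np.array([1, 1, -1, -1])
D = np.full(4, 0.25)
p, err = fit_parity(f, y, D, theta=0.5)
print(p, err)
```

With p = 1 every sample in this toy set is misclassified, so the parity is inverted and the same threshold then separates the two classes.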

Fig. 5. Reversal of the parity p of weak classifiers
(e) Calculation of the weight $\alpha_t$ of the adopted weak classifier: The weight $\alpha_t$ is calculated from the error rate, as shown in Eq. (5).
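For reference, in the standard discrete AdaBoost formulation, which matches an exponential weight update of the form used in step (f), the classifier weight is computed from the weighted error rate as follows. Here $\alpha_t$ is the weight of the weak classifier adopted in round t and $\varepsilon_t$ is its error rate; that Eq. (5) of this study has exactly this form is an assumption.

```latex
\alpha_t = \frac{1}{2} \ln\!\left( \frac{1 - \varepsilon_t}{\varepsilon_t} \right)
```

With $\varepsilon_t = 0.25$, for instance, this gives $\alpha_t = \frac{1}{2}\ln 3 \approx 0.549$, and the weight approaches 0 as the error rate approaches 0.5.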

(2) Face pattern
(3) Eye pattern
(4) Body surface pattern
(5) Dorsal fin pattern
Fig. 8: Examples of the learning part
After learning, we verified the classifiers one by one using 100 fish images collected from the Internet. Eq. (9) shows the definition of the numerical accuracy.

Learning result of the entire body pattern
In this experiment, 64 of 100 images were accurately recognized. Figure 9 shows examples of correct and false recognition results.
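As a sketch, and assuming that the accuracy is defined as the fraction of test images that are recognized correctly (an assumption that agrees with 64 of 100 images corresponding to 64%), the definition can be written as follows. The symbols $N_{\text{correct}}$ and $N_{\text{test}}$ are hypothetical names introduced here for the number of correctly recognized images and the number of test images.

```latex
\text{Recognition rate} = \frac{N_{\text{correct}}}{N_{\text{test}}} \times 100 \ [\%]
```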

Fig. 10: Result of the recognition of the face
The recognition rate and the false recognition rate are shown in the following.

Fig. 11: Result of the recognition of the eye
The recognition rate and the false recognition rate are shown in the following.

The recognition rate and the false recognition rate are shown in the following.
Table 1. Experimental results.