Classifying Human and Animal Hair Using Probabilistic Neural Networks for Texture Classification

Hair analysis has been widely used in forensics, especially in determining whether a piece of hair evidence from a crime scene is valid. This study proposes a classification method for human head hair and animal hair using a Probabilistic Neural Network. Hair samples were mounted on a compound microscope, digitized using a camera, and stitched using the method described by Rosebrock. Images were magnified 100 times and 400 times. Sobel edge detection was used for texture analysis. A gray-level co-occurrence matrix was applied to all samples to extract features. The results were then fed to a Probabilistic Neural Network for classification. The datasets were validated using 2-fold cross-validation, wherein 2 training sets and 2 test sets were made. An accuracy of 84.615 percent was achieved for normalized data and 100 percent for unnormalized data.


Introduction
Hair can be vital evidence at a crime scene. Hair normally falls from an individual's scalp at any time of the day, and it easily sticks to materials such as fabric and clothing. Hair is not easily destroyed, even when exposed to moisture or to the decomposition of accompanying tissue. To determine whether hair taken as evidence from a crime scene is valid, investigators must perform hair analysis (1). Hair analysis may refer to several types of examination: chemical analysis of a hair sample, or microscopic analysis or comparison. Microscopic analysis has been widely used in forensics, usually to compare hair samples from crime scenes with samples from suspects. It is still acknowledged today as an alternative technique in forensic science when DNA analysis fails in a crime scene investigation (2).
+ Both authors contributed equally.
Zafarina and Panneerchelvam (3), in their study of an unidentified animal species called the Jenglot, used DNA analysis to ascertain its species. The Jenglot is considered a rare living thing.
In a study conducted by Kshirsagar et al. (4), the most quantitative parameter was used to differentiate human hair from animal hair. Hair samples were mounted on a microscope using xylene, and the mounted slides were examined for micrometry and morphological characteristics.
In Kiranmayee and Subbarao (5), Probabilistic Neural Networks were used to classify MR images. The network served as the classifier and achieved accuracies ranging from 75 to 100 percent.
The hair examination process involves many different steps in microscopic analysis. The first step is to determine whether the hair in question originated from an animal or a human being. The examination then branches depending on whether the result is human or animal hair (6).
Machine learning is growing and is widely used in forensics. The basic aim of this study is an intelligent system capable of identifying whether a hair sample belongs to a human or to an animal; its purpose is to classify digital microscopic images of animal and human head hair. Automating manual procedures by developing intelligent systems can help speed up the production of evidence.
This study is significant because it contributes to the development of fully automated hair analysis. It detects human and animal hair through digital forensic hair analysis. The advantage of making hair analysis digital is that it hastens the analysis process, which can otherwise take considerable time because humans have to scrutinize the samples carefully.
The output of this study can be used in advanced studies in hair analysis, forensics, texture classification, and deep learning. Identifying the class of a hair sample can also help strengthen evidence and identify the source and the suspect of a crime.

Pipeline of Work
This study focused on the classification and analysis of microscopic digital images of human and animal hair using computer vision and machine learning techniques. The implementation used Python 2.7 (64-bit) (7), OpenCV-Python (8), SciPy (9), and NeuPy (10), running on an Intel Core i7.
The overall method, shown in Fig 1, follows a straightforward structure for classifying human and animal hair through texture classification with Probabilistic Neural Networks. In data sampling, hair from domesticated animals and humans was first collected. Digital images were captured with a 12 MP EOS camera through a microscope at 100x and 400x magnifications. In image preprocessing, photos of the parts of each sample were stitched together to form a whole image of the sample. Texture segmentation was then applied to the stitched image in preparation for analysis. Feature extraction was performed using the gray-level co-occurrence matrix. The extracted features, along with their labels, were placed in a comma-separated values (csv) file to form a dataset. The dataset was fed into the feature normalization function and then to the PNN classifier for sample classification.

Data Sampling
The data were collected from human hair and animal hair. For human hair, 5 male and 5 female hair samples were collected; for animal hair, samples were taken from 3 kinds of domesticated animals, specifically rabbits, cats, and dogs. Each strand of each hair sample was cut into five pieces, and each piece was placed on a microscope slide. The slides were then placed on a compound microscope to capture images of the samples at 100x and 400x magnifications. Each sample had to be photographed numerous times in order to capture all of its parts, and the images were saved in JPG format.

Image Preprocessing
The images taken from the microscope needed to be stitched so that the learning algorithm could better capture the features of the image. Samples that were not stitched resulted in poor feature output. All samples must be taken using the same platform to minimize the effect of noise during sampling.
Two images taken from the same hair specimen were selected to be stitched into a larger image of the hair sample. The panorama image stitching procedure designed by Rosebrock (11) was used to perform the stitching. Here, the two images must be arranged from left to right to form a panoramic image. Rosebrock's stitching process uses the following four steps: (1) Detect key points using the Scale-Invariant Feature Transform (SIFT) algorithm (12). (2) Match the features between the two images. (3) Use the Random Sample Consensus (RANSAC) algorithm (13) to estimate a homography matrix from the matched feature vectors. (4) Align and stitch the images smoothly, hiding probable seams, by applying a warping transformation with the homography matrix estimated in step 3.
In Fig 2, Image A and Image B are the input images IA(x,y) and IB(x,y), respectively, that will be stitched together. The first step in the image stitching process is the SIFT algorithm. To perform SIFT, one must first construct a scale space, which is done using the Difference of Gaussian (DoG) function. The DoG of a specific image IX(x,y) is obtained by evaluating the function on each input image (Image A and Image B).
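For reference, the Difference of Gaussian is commonly written in the SIFT literature (12) in the form below. The symbols G (Gaussian kernel), σ (scale), k (constant scale factor), and L (Gaussian-smoothed image) follow the usual SIFT notation and are assumptions here, since the text above does not define them.

```latex
D(x, y, \sigma) = \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I_X(x, y)
                = L(x, y, k\sigma) - L(x, y, \sigma),
\qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-(x^{2} + y^{2})/(2\sigma^{2})}
```

Here * denotes convolution, so each DoG image is simply the difference of two blurred copies of the same input at neighbouring scales.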
The next step is to find the key points of the DoG images. Maxima and minima of the images were identified to refine the detected key points, using Equation 2.
Low-contrast key points detected earlier were evaluated using the Hessian matrix in Equation 3.
The next step was to obtain key point orientations, since orientations provide rotation invariance. This was done by collecting gradient directions and magnitudes around each key point. For each image sample L(x,y), the gradient magnitude m(x,y) and the orientation O(x,y) were computed. The final step of the SIFT algorithm was to generate the features. A fingerprint was assigned to each key point so that it could be identified easily; for this, a series of matrices was computed and normalized to obtain the features.
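A minimal sketch of the gradient magnitude and orientation computation is given below, assuming the central pixel differences commonly used in SIFT (12) and Python 3 with NumPy. The function name and the choice to leave border pixels at zero are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def gradient_magnitude_orientation(L):
    """Return (m, theta): gradient magnitude and orientation in radians.

    Assumes central differences; border pixels are left at zero.
    """
    L = np.asarray(L, dtype=float)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    # L(x+1, y) - L(x-1, y): difference along columns (x axis)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]
    # L(x, y+1) - L(x, y-1): difference along rows (y axis)
    dy[1:-1, :] = L[2:, :] - L[:-2, :]
    m = np.sqrt(dx ** 2 + dy ** 2)
    theta = np.arctan2(dy, dx)
    return m, theta
```

In SIFT these values are then accumulated into an orientation histogram around each key point; that step is omitted here.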
Rosebrock's algorithm was slightly altered by adding a cropping step that removes the black edges of the resulting stitched photo. This is necessary to eliminate noise in the processed image. It was done by finding the contours in the image and the corners that bound the resulting image within the black regions.
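The cropping idea can be sketched as follows. This is a simplified stand-in rather than the contour-based procedure described above: it assumes the black padding is exactly zero-valued and crops to the bounding box of all non-black pixels, using NumPy only. The function name and the `threshold` parameter are hypothetical.

```python
import numpy as np

def crop_black_border(image, threshold=0):
    """Crop to the bounding box of pixels brighter than `threshold`."""
    image = np.asarray(image)
    # Collapse colour channels (if any) so the mask is 2-D.
    gray = image if image.ndim == 2 else image.max(axis=2)
    mask = gray > threshold
    if not mask.any():
        return image  # nothing but black: leave the image untouched
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

A bounding box removes only fully black rows and columns at the margins; a contour-based crop is needed when the stitched region is not axis-aligned.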
After the SIFT algorithm, feature matching was applied to the two input images to detect similarities between Images A and B. This was done by calculating the Euclidean distance between the features of the two images using Equation 6.
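As an illustration of Euclidean-distance matching, the sketch below pairs each descriptor from Image A with its nearest descriptor from Image B. It is a brute-force example in NumPy; the function name and the optional `max_distance` cut-off are assumptions, and the matcher used in the study may apply additional filtering.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_distance=np.inf):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    Returns a list of (index_a, index_b, distance) tuples.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)
    return [(i, int(j), float(dist[i, j]))
            for i, j in enumerate(nearest) if dist[i, j] <= max_distance]
```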
The third step in image stitching was estimating the homography matrix M through the formula x' = Mx (7). The RANSAC algorithm was then performed with the generated homography matrix.
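The relation x' = Mx and its robust estimation can be sketched as below. This is a hedged, NumPy-only illustration: it fits M with a basic direct linear transform on four matched points and wraps the fit in a plain RANSAC loop. The function names, the iteration count, and the pixel tolerance are assumptions; the study relied on the stitching procedure of Rosebrock (11) rather than this code.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: 3x3 M with dst ~ M @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraints.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    M = vt[-1].reshape(3, 3)
    with np.errstate(divide="ignore", invalid="ignore"):
        return M / M[2, 2]

def project(M, pts):
    """Apply x' = M x in homogeneous coordinates and return Cartesian points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ M.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, n_iter=200, tol=2.0, seed=0):
    """Keep the 4-point model with the most inliers, then refit on those inliers."""
    src, dst = np.asarray(src, dtype=float), np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        sample = rng.choice(len(src), size=4, replace=False)
        M = fit_homography(src[sample], dst[sample])
        # Degenerate samples can produce inf/nan; those count as outliers.
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(M, src) - dst, axis=1)
            inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

A production implementation would normally normalise the point coordinates before the fit and stop early once enough inliers are found; both are left out to keep the sketch short.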
The last step was to perform a warping transformation using the estimated homography matrix. The image shown in Fig 3 is the result of stitching Images A and B. No black portions remain, since they were cropped by the additional step added to the image stitching process, as described earlier in this section.

The Medulla
Examining the medulla in a hair shaft can also help distinguish human from animal hair. The medulla is a cellular column running through the center of the cortex in a hair shaft. In humans, if the medulla is filled with air, it appears dark under transmitted light; if filled with fluid, it may take on a yellowish color. If the medulla is not easily seen under normal microscopic examination, it may be observed between the crossed polars of a polarizing microscope (1).
Animal and human hair differ in their medulla patterns. The medulla in human hair is amorphous in appearance, and its width is usually less than one-third of the overall diameter of the hair shaft. The medulla in animal hair, however, is continuous and structured, and its width is greater than one-third of the overall diameter of the hair shaft (14).
Since examining the medulla primarily deals with patterns or textures and has been used as one of the bases for classifying human and animal hair, texture classification is a natural way to classify them using machine learning techniques. Texture classification has two phases: feature extraction and classification. In the feature extraction phase, features that are invariant to irrelevant transformations of the image are identified and selected. Ideally, the quantifiable measures of the selected features should yield similar values for similar textures. In the classification phase, classifiers are trained to determine the class of each input texture based on the quantifiable measures obtained from the selected features (15).
Probabilistic Neural Networks were used for classification since the target is categorical. They also offer good accuracy rates with very short training times (16).

Texture Segmentation
Segmentation plays an important part in image analysis and in high-level image interpretation and understanding. Texture-based segmentation is used to analyze images because of their panchromatic nature (17).
To perform image segmentation, Sobel edge detection was implemented. The gradient of the image was calculated at each pixel position.
The magnitude of the gradient vector ∇f of an image is denoted as $|\nabla f| = \sqrt{(\partial f/\partial x)^2 + (\partial f/\partial y)^2}$. Figures 4 to 7 show the resulting images after Sobel edge detection was performed.
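A minimal sketch of Sobel edge detection is shown below, assuming the standard 3x3 Sobel kernels and NumPy only. The helper names are illustrative, and the output covers only the valid interior region (two pixels smaller in each dimension), which is a simplification compared with library implementations that pad the border.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel (valid region only)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * image[i:i + h - 2, j:j + w - 2]
    return out

def sobel_magnitude(image):
    """Gradient magnitude from the horizontal and vertical Sobel responses."""
    image = np.asarray(image, dtype=float)
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2)
```

On a vertical step edge the response is non-zero only in the columns adjacent to the step, which is what makes the hair outline and medulla boundaries stand out.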

Features of Figures 4 to 7 were extracted using the gray-level co-occurrence matrix (GLCM), which is widely used in texture analysis (18). Before the features were calculated, the gray-level co-occurrence matrix P[i,j] of the segmented image was built by counting all pairs of pixels separated by a displacement vector d and having gray levels i and j. Features were extracted from the co-occurrence matrix by calculating contrast, energy, homogeneity, correlation, and dissimilarity, following the equations in (18,19). This study used these five features because they produce significant results in classifying the given hair samples.
Tables 1 to 4 show the GLCM features extracted from animal and human hair at 100x and 400x magnifications.
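The sketch below illustrates how P[i,j] and the five features could be computed with NumPy. It assumes a non-negative displacement d = (dx, dy), a non-symmetric normalised matrix, and the feature definitions in common use (energy as the square root of the sum of squared entries, homogeneity with a 1 + (i - j)^2 denominator). These conventions are assumptions and may differ in detail from the equations in (18,19).

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=256):
    """Normalised co-occurrence matrix P[i, j] for displacement d = (dx, dy)."""
    image = np.asarray(image, dtype=int)
    h, w = image.shape
    P = np.zeros((levels, levels), dtype=float)
    # Reference pixels and their neighbours displaced by (dx, dy); dx, dy >= 0.
    ref = image[:h - dy, :w - dx]
    nbr = image[dy:, dx:]
    # Count every (reference level, neighbour level) pair.
    np.add.at(P, (ref.ravel(), nbr.ravel()), 1)
    return P / P.sum()

def glcm_features(P):
    """Contrast, dissimilarity, homogeneity, energy, and correlation of P."""
    i, j = np.indices(P.shape)
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * P).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * P).sum())
    return {
        "contrast": ((i - j) ** 2 * P).sum(),
        "dissimilarity": (np.abs(i - j) * P).sum(),
        "homogeneity": (P / (1.0 + (i - j) ** 2)).sum(),
        "energy": np.sqrt((P ** 2).sum()),
        # Undefined (division by zero) for a perfectly uniform image.
        "correlation": ((i - mu_i) * (j - mu_j) * P).sum() / (sd_i * sd_j),
    }
```

For an 8-bit edge image, `glcm_features(glcm(edges))` would yield one five-element feature vector per sample, where `edges` is a hypothetical name for the segmented image.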
In creating the dataset, the extracted features of each segmented image sample, along with their labels (human or animal), were recorded in a csv file.
Animal and human hair samples were equally represented by placing equal numbers of each in the dataset. Each dataset must also be consistent in image magnification: within one dataset, all hair image samples were taken at a single magnification, either 100x or 400x.
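One possible layout for such a csv dataset is sketched below using Python's standard `csv` module. The column order, the header row, and the label strings are assumptions made for illustration, and the numbers in the round-trip example are placeholders, not values measured in this study.

```python
import csv
import io

FEATURES = ["contrast", "energy", "homogeneity", "correlation", "dissimilarity"]

def write_dataset(rows, handle):
    """Write (feature_dict, label) pairs as CSV rows with a header line."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(FEATURES + ["label"])
    for features, label in rows:
        writer.writerow([features[name] for name in FEATURES] + [label])

def read_dataset(handle):
    """Return (X, y): lists of feature vectors and their labels."""
    X, y = [], []
    for row in csv.DictReader(handle):
        X.append([float(row[name]) for name in FEATURES])
        y.append(row["label"])
    return X, y

# Round trip through an in-memory buffer; the numbers are placeholders.
buffer = io.StringIO()
write_dataset([
    (dict(zip(FEATURES, [0.5, 0.25, 0.75, 0.9, 0.4])), "human"),
    (dict(zip(FEATURES, [1.5, 0.125, 0.5, 0.6, 0.8])), "animal"),
], buffer)
buffer.seek(0)
X, y = read_dataset(buffer)
```

Keeping one file per magnification matches the requirement that a dataset contain samples from a single magnification only.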

Probabilistic Neural Networks
Tables 1 to 4 were fed to the Probabilistic Neural Network (PNN) classifier to classify the samples. PNN was used in this study because it generated good accuracy. Training time was 0.0278189182281 microseconds.

PNN Architecture
The PNN accepted five input units and used lazy learning. A lazy learning algorithm does not involve any iterative training procedure; it stores parameters and uses them to make predictions (8).
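To illustrate what lazy learning means for a PNN, the sketch below implements a minimal Gaussian-kernel PNN in NumPy. It is not the NeuPy implementation used in the study; the class name and the smoothing parameter `std` (with its default) are assumptions chosen for the example.

```python
import numpy as np

class SimplePNN:
    """Minimal probabilistic neural network with a Gaussian kernel of width `std`."""

    def __init__(self, std=1.0):
        self.std = std

    def fit(self, X, y):
        # Lazy learning: "training" only stores the patterns and their labels.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def class_scores(self, X):
        X = np.asarray(X, dtype=float)
        # Pattern layer: Gaussian activation of every stored sample.
        sq_dist = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        act = np.exp(-sq_dist / (2.0 * self.std ** 2))
        # Summation layer: average activation per class.
        return np.stack([act[:, self.y == c].mean(axis=1)
                         for c in self.classes], axis=1)

    def predict(self, X):
        # Decision layer: pick the class with the largest summed activation.
        return self.classes[self.class_scores(X).argmax(axis=1)]
```

Because `fit` only stores data, training time is negligible, while prediction cost grows with the number of stored samples. The choice of `std` interacts strongly with the scale of the features, which is one reason normalization can change the accuracy.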

Performing hair classification using 100x and 400x edge detected datasets
Each hair image sample went through image stitching and texture segmentation. Features were then extracted from the segmented image and fed to the PNN classifier. The datasets at 100x or 400x magnification were validated using stratified 2-fold cross-validation, wherein 2 test sets and 2 training sets were made to determine the classification of hair image samples at a specific magnification. The highest accuracy rate obtained from the two test sets of each magnification dataset was used as the basis for discussing the accuracy results.
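A stratified 2-fold split can be sketched as follows, assuming NumPy and a fixed random seed. The function name and the alternating assignment within each class are illustrative choices; library implementations may shuffle and assign differently.

```python
import numpy as np

def stratified_two_fold(labels, seed=0):
    """Yield (train_idx, test_idx) for two folds with balanced class proportions."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        # Alternate shuffled members of each class between fold 0 and fold 1.
        fold_of[idx] = np.arange(len(idx)) % 2
    for k in (0, 1):
        yield np.flatnonzero(fold_of != k), np.flatnonzero(fold_of == k)
```

Each sample appears in exactly one test set, and each test set keeps the human-to-animal ratio of the full dataset. Reporting the better of the two fold accuracies, as done here, gives an optimistic figure compared with reporting their mean.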
Table 5 shows that normalizing the datasets decreases the accuracy. It is therefore better to leave the data unnormalized.

400x Magnification Image Datasets
Table 6 shows that normalizing the datasets by feature decreases the accuracy rate of the classifier, whereas the datasets normalized by sample showed neither a decrease nor an increase in accuracy. It is therefore better to leave the data unnormalized than to normalize it.
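The difference between normalizing by feature and by sample can be illustrated as below. The text does not state which normalization was applied, so this sketch assumes L2 scaling: by feature, each column of the feature matrix is scaled to unit norm; by sample, each row is. The function name and the `by` parameter are hypothetical.

```python
import numpy as np

def l2_normalize(X, by="feature"):
    """Scale columns (by='feature') or rows (by='sample') of X to unit L2 norm."""
    X = np.asarray(X, dtype=float)
    axis = 0 if by == "feature" else 1
    norms = np.linalg.norm(X, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero columns or rows unchanged
    return X / norms
```

Per-feature scaling changes the relative weight of the five GLCM features in the distance computed by the classifier, whereas per-sample scaling preserves the direction of each feature vector and discards only its length.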

Overall Accuracy Performance
As shown in Fig 8, the datasets at both 400x and 100x magnifications are better left unnormalized, since both reached an accuracy rate of 100% without normalization. The normalized datasets did not give consistent results, and most of the accuracy rates decreased after normalization.

Conclusions
Based on the results of the implementation of the algorithm, animal hair and human head hair can be classified with the use of texture segmentation and neural network techniques. The edge detection used as the texture segmentation technique is feasible in helping classify and analyze hair, since it yielded an accuracy rate of 100% in the neural network at both 100x and 400x magnifications. It is also better to leave the data unnormalized, since the accuracy rates decrease when normalization is performed on the data.

Fig. 2. Two cross-section animal hair samples to be stitched. From left to right: Image A and Image B.

Fig. 3. The resulting image after performing image stitching on Images A and B.

Table 1. Sample GLCM Features Extracted from an Animal Hair with Magnification at 400x.

Table 2. Sample GLCM Features Extracted from a Human Hair with Magnification at 400x.

Table 3. Sample GLCM Features Extracted from an Animal Hair with Magnification at 100x.

Table 4. Sample GLCM Features Extracted from a Human Hair with Magnification at 100x.

Table 5. PNN Classifier Accuracy Measure Using 100x Magnification Edge Detection Datasets.

Table 6. PNN Classifier Accuracy Measure Using 400x Magnification Edge Detection Datasets.
Fig. 8. Overall Accuracy Performance of the PNN