Spline Curve Modeling Based Gait Recognition

Gait recognition has become an active research area with the increasing demand for effective video surveillance systems. This paper presents a method of modelling human gait with spline curves. The proposed method locates five human joints, namely the coxal joint, a pair of knee joints and a pair of ankle joints, and uses them as control points to construct a spline curve. Instead of comparing the constructed gait models directly, for which the time complexity is high, we take the area under the constructed spline curve, which is a linear metric, as our gait feature and construct a feature vector containing the area signals of the image sequence considered. The DCT (Discrete Cosine Transform) is applied to the feature vector to obtain the feature matrix. Dimensional reduction of the feature matrix is achieved using MSPCA (Multi-scale Principal Component Analysis). Classification is done using K-NN and Neuro-Fuzzy classifiers for the subjects in CASIA datasets A and B, and using DTW (dynamic time warping) for the subjects in CASIA dataset C.


Introduction
Human identification at a distance has gained a lot of attention recently due to the increasing need for video surveillance systems. Gait is an attractive feature for human identification at a distance and has attracted considerable interest from computer-vision researchers in the recent past. The genesis of the idea of human tracking can be traced back to Cutting and Kozlowski's perception experiments based on light point displays [1] [2]. In contrast with conventional biometric features such as face, iris, palm print and finger print, gait has unique characteristics such as being non-contact, non-invasive and perceivable at a distance.
The practical implementation of gait recognition faces several challenges. For instance, gait analysis is very sensitive to deficient or incomplete segmentation of the subject silhouette. Variations in clothing and footwear, and distortions in the gait pattern produced by carrying objects or by changes in walking speed, can make analysis an arduous task. These complexities lead to low recognition rates in the algorithms proposed so far. Existing methods of gait recognition can be classified into model based ones and holistic ones. Model based methods model the human body with appropriate geometric curves. Holistic methods extract spatio-temporal and statistical features.
A. Model based methods: Model based approaches describe the topology of human body parts using geometrical curves. One of the first attempts at modelling can be seen in [3] and Cunado et al [4], in which the legs were considered as interlinked pendulums. A phase weighted Fourier magnitude spectrum was then used to recognize the gait signatures, which were derived from the frequency components of the variations in human thigh inclination. Lee et al [5] fit ellipses to seven regions of the human body and derived the magnitude and phase of these moment based region features. Furthermore, statistical methods such as Principal Component Analysis (PCA) and Multiple Discriminant Analysis (MDA) were used to analyse effective features.
B. Holistic / Model free methods: The holistic methods characterize spatial variation in dynamic variables like stride length, width vector, etc. They analyse the variations in shape and distance vectors in the sequence of images to characterize the gait features. Early efforts at gait recognition adopting the holistic approach can be traced back to Niyogi and Adelson [2], who distinguished different subjects from their spatiotemporal gait patterns obtained from the curve fitted snake. Little and Boyd [6] used frequency and phase features from optical flow information of walking figures to differentiate individuals. Chai et al [7] introduced perpetual shape descriptor to analyze human gait. R. Tanawongsuwan and A. Bobick [8] used time-normalized joint angle trajectories to create gait signatures.
Though a lot of progress has been achieved using the above stated approaches, no foolproof method has been established, which is why the scope of research in this area remains wide. Well established gait recognition methods are sensitive either to variations in silhouette shape or to covariate features like the walking speed and clothing of the subject. This was the prime motivation for our proposed method.
In this paper we propose a novel feature that adopts the model based approach to model the lower limbs of the silhouette. The motive behind this approach is to mitigate the sensitivity of recognition to variations in silhouette shape due to variations in clothing or carrying conditions. The method is also applied to cases where the walking speed of the subject varies.
The rest of the paper is organized into 5 sections. Sections 2 and 3 deal with the approach overview and preprocessing. Section 4 deals with gait feature extraction and MSPCA dimensional reduction. Section 5 deals with experimental results and comparison with recent methods. Section 6 concludes the paper.

Approach Overview
The proposed approach can be implemented in the following steps:
1) A background subtraction technique is used to extract the silhouette from the background, and the silhouette is preprocessed to remove the noise components introduced.
2) The silhouette is resized by cropping, to create an image template [2]. A gait cycle is then extracted by exploiting the variation in the width vector as a feature.
3) The proposed feature, namely the area under the limbs of the subject, is computed after modeling the lower limbs with spline curves. DCT is applied on the feature matrix created, and MSPCA is adopted for dimensional reduction of the area signals extracted.
4) After dimensional reduction, the feature matrix is fed to Neuro-fuzzy and K-NN classifiers for evaluation.

Silhouette Extraction
Silhouette extraction holds great importance in effective gait analysis. This is essential so as to analyze the value of each pixel in every frame in the video sequence. The method of background subtraction is adopted to acquire the subject of interest. Here, the subject should be the only object in motion in the sequence of frames.
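As a minimal sketch of the background subtraction step, assuming grayscale frames stored as nested lists and a fixed intensity threshold (the function name `extract_silhouette` and the threshold value are illustrative assumptions, not taken from the paper):

```python
def extract_silhouette(frame, background, threshold=30):
    """Return a binary mask that is 1 where the frame differs from the background.

    `frame` and `background` are equally sized lists of rows of grayscale values.
    The threshold of 30 is an arbitrary illustrative choice.
    """
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy 3x4 example: the moving subject is brighter than the static background.
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 200, 10, 10],
         [10, 210, 205, 10],
         [12, 10, 10, 10]]
mask = extract_silhouette(frame, background)
```

The sketch relies on the condition stated above, that the subject is the only moving object; any noise removal would follow as a separate step.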

Image Template
After background subtraction it is apparent that the subject occupies a small area of the image. To eliminate the redundant boundary around the object that occupies a larger portion of image, we resize the image by cropping the extra portion and fit the subject into a smaller image template choosing appropriate width and height so that the image is not corrupted. Firstly, height of the human silhouette is chosen as the height of the image and secondly a fixed width is chosen in such a way so as to avoid most of the computational ambiguities. This type of scaling not only reduces computational complexity but also corrects the scale changes due to the variation of object distance from the camera. Similar work can be seen in [9].
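The cropping step might be sketched as follows, assuming a binary mask stored as nested lists. The function name `crop_to_template` and the choice to centre the fixed-width window on the silhouette's bounding box are assumptions for illustration; rescaling to a common height is omitted.

```python
def crop_to_template(mask, width):
    """Crop a binary mask to the silhouette's height and a fixed width.

    Rows are limited to the vertical extent of the silhouette. Columns form a
    window of `width` pixels centred on the silhouette's bounding box; columns
    falling outside the original image are padded with zeros.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    top, bottom = rows[0], rows[-1]
    centre = (min(cols) + max(cols)) // 2
    left = centre - width // 2
    n_cols = len(mask[0])
    return [
        [mask[r][c] if 0 <= c < n_cols else 0 for c in range(left, left + width)]
        for r in range(top, bottom + 1)
    ]


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
template = crop_to_template(mask, width=5)
```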

Gait Feature Extraction
From the gait silhouette sequence obtained, the only cue to identify the gait signature depends on the temporal changes in the silhouette. We propose a novel silhouette modeling method which uses spline curves to model the limbs. The procedure involves finding the coordinates of coxal joint, two knee joints and two ankle joints of each silhouette. The five joints thus found, are used as interpolating points to construct a cubic spline curve. The procedure for finding the joints and constructing the spline curve is enumerated in the following sections.

Joint Positioning
The novel feature extracted in this paper, the area under the limbs, requires the silhouette's joints as interpolating points. The control points on the curve are the coxal, knee and ankle joints, which are obtained by the process below (H denotes the height of the silhouette):
a) Coxal point: The y co-ordinate is at 0.72H from the top of the image. Horizontal scanning at this height leads to the following cases:
   One region: The center of the region is taken as the coxal point.
   Two regions: This happens if the scanning position is below the actual coxal point, so we need to regulate the scanning width (0.165H) to find the coxal point.
b) Knee point: A circle with radius 0.245H is drawn with the coxal point as the center. Two cases arise here:
   Two regions: This is the condition in which the left and right legs are apart. The center of each region is the corresponding knee joint.
   One region: This is the condition in which the subject is standing on the left or right leg and the knees overlap. The human knee is about 0.1H wide, so we choose the point 0.05H left/right from the rightmost/leftmost point as the right/left knee joint.
c) Ankle point: This is similar to the knee joint. The left and right knees are chosen as the centers of the circles, with the shank length 0.246H as the radius.
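The one-region case of the coxal point search can be sketched as below. This is only an illustration under assumptions: the template is a binary mask whose height equals H, the helper names `foreground_runs` and `coxal_point` are hypothetical, and the two-region case simply returns `None` rather than performing the adjustment described above.

```python
def foreground_runs(row):
    """Return (start, end) column pairs of consecutive foreground pixels in a row."""
    runs, start = [], None
    for c, v in enumerate(row):
        if v and start is None:
            start = c
        elif not v and start is not None:
            runs.append((start, c - 1))
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))
    return runs


def coxal_point(template, ratio=0.72):
    """Scan the row at `ratio` * H from the top and return (x, y) of the coxal point.

    Returns None unless exactly one foreground region is found on the scan line;
    the two-region adjustment is not implemented in this sketch.
    """
    height = len(template)
    y = min(int(ratio * height), height - 1)
    runs = foreground_runs(template[y])
    if len(runs) == 1:
        start, end = runs[0]
        return ((start + end) / 2, y)
    return None
```

The same run-finding helper could in principle be reused along the circles used for the knee and ankle points, but that is not shown here.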

Area under the spline curve
As observed from a sequence of frames, the area under the limbs has a periodic temporal variation, just like the width vector of the silhouette. This area is found by constructing a spline curve and finding the area under the limbs enclosed by the curve. For constructing an interpolating curve from a given set of points, there are three different possibilities, namely polynomial interpolation, Bézier curves and spline curves. All three methods produce polynomial curves as a linear combination of a set of basis polynomials. Our choice of spline curves is based on their properties, which allow us to design complex shapes with lower degree polynomials than the other two methods. In Fig. 3, a B-spline curve of degree 3 and a Bézier curve of degree 10 are constructed for the same set of control points, and it is evident that the Bézier curve still cannot follow the polyline.
Since the degree n of the constructed interpolating curve is lower with splines, the computational time, which is O(n^2), is reduced considerably.
The interpolating spline curve has the human body joints as its control points, namely coxal joint, a pair of knee and ankle coordinates. A polygon was first constructed with the body joints as its vertices, and then a cubic spline curve was constructed with the joints as control points.
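One way to make the area feature concrete is the sketch below, which evaluates a cubic B-spline with the Cox-de Boor recursion and measures the region enclosed by the sampled curve and the straight chord joining its end points. Several things here are assumptions rather than the paper's stated procedure: the clamped uniform knot vector, the chord-closed definition of the area, the joint coordinates, and all function names. Note also that a B-spline built this way passes through only the first and last control points.

```python
def bspline_basis(i, d, t, knots):
    """Cox-de Boor recursion for the B-spline basis function B_{i,d}(t)."""
    if d == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    denom_left = knots[i + d] - knots[i]
    if denom_left > 0:
        left = (t - knots[i]) / denom_left * bspline_basis(i, d - 1, t, knots)
    denom_right = knots[i + d + 1] - knots[i + 1]
    if denom_right > 0:
        right = ((knots[i + d + 1] - t) / denom_right
                 * bspline_basis(i + 1, d - 1, t, knots))
    return left + right


def clamped_knots(n_ctrl, degree):
    """Clamped uniform knot vector on [0, 1] (an assumed choice)."""
    inner = n_ctrl - degree - 1
    middle = [j / (inner + 1) for j in range(1, inner + 1)]
    return [0.0] * (degree + 1) + middle + [1.0] * (degree + 1)


def sample_curve(ctrl, degree=3, samples=200):
    """Sample the B-spline defined by control points `ctrl` as a list of (x, y)."""
    knots = clamped_knots(len(ctrl), degree)
    points = []
    for s in range(samples):
        t = s / samples  # stays inside the half-open interval [0, 1)
        weights = [bspline_basis(i, degree, t, knots) for i in range(len(ctrl))]
        points.append((sum(w * p[0] for w, p in zip(weights, ctrl)),
                       sum(w * p[1] for w, p in zip(weights, ctrl))))
    points.append(tuple(ctrl[-1]))  # a clamped spline ends at the last control point
    return points


def enclosed_area(points):
    """Shoelace formula; the polygon is closed by joining the last point to the first."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Hypothetical joints: ankle, knee, coxal joint, knee, ankle.
joints = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 2.0), (4.0, 0.0)]
area = enclosed_area(sample_curve(joints))
```

Computing one such area per frame yields the periodic area signal for a sequence.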
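A spline is a piecewise-polynomial real function $S : [a, b] \to \mathbb{R}$ on an interval $[a, b]$ composed of $k$ ordered subintervals $[t_{i-1}, t_i]$ with $a = t_0 < t_1 < \cdots < t_k = b$. The restriction of $S$ to an interval $i$ is a polynomial $P_i : [t_{i-1}, t_i] \to \mathbb{R}$, so that

$$S(t) = P_i(t), \qquad t_{i-1} \le t \le t_i, \quad i = 1, 2, \ldots, k.$$

The highest order of $P_i$ is known as the order of the spline curve, which in our case is 3. For a spline of order $n$, $S$ is required to be continuously differentiable to order $n - 1$ at the points $t_i$:

$$P_i^{(j)}(t_i) = P_{i+1}^{(j)}(t_i)$$

for all $i = 1, 2, \ldots, k - 1$ and all $j \in [0, n - 1]$.

In our method of spline curve interpolation of the joints, we use the B-form of spline curves, which is a weighted sum with B-spline functions as the weights. The spline $f(t)$ is given by

$$f(t) = \sum_{i} c_i \, B_{i,d}(t),$$

where the $c_i$ are the control points. The function $B_{i,d}$ is called a B-spline of degree $d$ and is given by the recursive formula

$$B_{i,0}(t) = \begin{cases} 1, & t_i \le t < t_{i+1} \\ 0, & \text{otherwise} \end{cases}$$

$$B_{i,d}(t) = \frac{t - t_i}{t_{i+d} - t_i} \, B_{i,d-1}(t) + \frac{t_{i+d+1} - t}{t_{i+d+1} - t_{i+1}} \, B_{i+1,d-1}(t).$$

Thus, for each silhouette image, we obtain the area under the constructed spline curve, and for N training samples with M images in each, we create an $N \times M$ feature matrix of area values. This matrix is considered for further processing, using the Discrete Cosine Transform (DCT) to describe the area feature better, followed by dimensional reduction using MSPCA.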
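The DCT step can be illustrated with a direct implementation of the orthonormal DCT-II applied to each row of the feature matrix. The paper does not state which DCT variant or normalization is used, so the type-II orthonormal form and the names `dct2` and `dct_rows` are assumptions.

```python
import math


def dct2(signal):
    """Orthonormal DCT-II of a one-dimensional area signal."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (m + 0.5) * k / n)
                    for m, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def dct_rows(feature_matrix):
    """Apply the DCT to every row (one area signal per sample)."""
    return [dct2(row) for row in feature_matrix]


# Hypothetical area signals for two samples with four frames each.
features = [[2.0, 2.0, 2.0, 2.0],
            [1.0, 4.0, 2.0, 3.0]]
coefficients = dct_rows(features)
```

With this normalization the transform preserves signal energy, so a constant area signal maps to a single non-zero coefficient.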

Multi-scale principal component analysis
The dimensionality of the feature matrix containing the area signals is very large and the matrix contains redundant information, so we adopt the method of multi-scale principal component analysis (MSPCA) to find a transformation for dimensionality reduction. MSPCA was first proposed by Bakshi [10] for statistical process monitoring. MSPCA combines the ability of PCA to decorrelate the variables by extracting a linear relationship with the ability of wavelet analysis to extract deterministic features. MSPCA applies PCA to the wavelet coefficients at each scale to filter the unwanted components. The essence of MSPCA is enumerated in Fig. 5 and Fig. 6.
W is the discrete wavelet transform (DWT) operator. Implementing the inverse DWT (IDWT), Ŷ can be reconstructed via (14), where τ_j is defined as in [10]. Traditional PCA is then applied on Ŷ, the wavelet coefficients matrix, to acquire the final feature matrix, which is fed to the classifiers for recognition in the following subsection.
Due to its multi-scale nature, MSPCA is appropriate for modeling of data containing contributions from events whose behavior changes over time and frequency. Process monitoring by MSPCA involves combining only those scales where significant events are detected, and is equivalent to adaptively filtering the scores and residuals, and adjusting the detection limits for easiest detection of deterministic changes in measurements.
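The wavelet-then-PCA idea can be sketched in a heavily simplified form: a single-level Haar transform of each row, followed by PCA that keeps only the first principal component at each scale, followed by the inverse transform. This is not the procedure of [10] or of Fig. 5 and Fig. 6; the wavelet, the number of levels, the number of retained components and all function names are assumptions, and rows must have even length.

```python
import math


def haar_forward(row):
    """One-level Haar DWT of an even-length row: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(row[i] + row[i + 1]) / s for i in range(0, len(row), 2)]
    detail = [(row[i] - row[i + 1]) / s for i in range(0, len(row), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    row = []
    for a, d in zip(approx, detail):
        row += [(a + d) / s, (a - d) / s]
    return row


def rank_one_pca(matrix, iterations=100):
    """Project the rows of `matrix` onto their first principal component."""
    n, m = len(matrix), len(matrix[0])
    mean = [sum(r[j] for r in matrix) / n for j in range(m)]
    centred = [[r[j] - mean[j] for j in range(m)] for r in matrix]
    v = [float(j + 1) for j in range(m)]  # arbitrary non-zero start vector
    for _ in range(iterations):  # power iteration on the covariance matrix
        scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [mean[:] for _ in matrix]  # no variance found at this scale
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    return [[mean[j] + scores[i] * v[j] for j in range(m)] for i in range(n)]


def mspca_one_level(data):
    """Haar DWT per row, rank-one PCA per scale, then inverse DWT."""
    pairs = [haar_forward(row) for row in data]
    approx = rank_one_pca([p[0] for p in pairs])
    detail = rank_one_pca([p[1] for p in pairs])
    return [haar_inverse(a, d) for a, d in zip(approx, detail)]
```

A fuller treatment would use several decomposition levels, select the scales that carry significant events, and choose the number of retained components per scale.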

Recognition
After the extraction of gait features and dimensional reduction, classification is done using two different classifiers, namely K-NN and Neuro-fuzzy. First we evaluate the proposed method using the Neuro-fuzzy classifier, as it is the main method of classification adopted. Then we compare the achieved results with the results of the K-NN classifier. The gait feature matrix extracted using the proposed method is used to train the classifiers.

Gait database
In our experiments, we used the CASIA Gait Database, which is currently one of the largest gait databases in the gait-research community. We tested the algorithm on this database because of its completeness and wide availability.
• CASIA Dataset-B: The database consists of 124 subjects (93 males and 31 females) captured from 11 view angles (ranging from 0 to 180 degrees, with a view angle interval of 18 degrees). The frame size is 320 × 240 pixels, and the frame rate is 25 fps. There are 10 walking sequences for each subject per view. We use gait sequences numbered from 001 to 124 (subject ID, i.e., 124 subjects) at a view angle of 90 degrees in Dataset B to carry out our experiments. Out of the 10 samples chosen from each subject, 2 samples have images with the subject carrying a bag, and 2 have the subject wearing a coat.
• CASIA Dataset-C: The infrared CASIA C dataset was chosen to evaluate the performance of the proposed algorithm. It contains 153 subjects and takes into account four walking conditions, namely normal walking, slow walking, fast walking and normal walking with a bag. Each subject has 10 sequences: 4 normal walking (fn), 2 slow walking (fs), 2 fast walking (fq) and 2 normal walking carrying a bag (fb). The length of each sequence varies with the pace of walking.

Experimental Results on CASIA datasets A,B
The proposed feature matrix is transformed by applying the Discrete Cosine Transform (DCT) and then reduced in dimensionality using the proposed MSPCA method. The first implementation of this method is on CASIA dataset-A, considering 4 samples for each of the 20 subjects. Of the 4 samples we chose, 3 samples are fed to the classifier for training and 1 sample is put aside for testing. Cumulative match scores (CMS) are used to assess the performance quantitatively. The CMS value δ corresponding to rank r indicates that a fraction 100δ% of the probes have the real identity among their top r matches.
Unlike the Neuro-fuzzy classifier that uses membership functions extracted from the data set describing the system, K-NN applies Euclidean distances as the measurement parameter in classifying the data. The test results of K-NN classifier are enumerated in the Table 1.
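The Euclidean-distance matching and the cumulative match score can be sketched together as follows. This is a nearest-neighbour ranking (the k = 1 view of K-NN) over a labelled gallery; the function names and the toy feature vectors are hypothetical, and the paper's exact K-NN configuration is not specified.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranked_identities(probe, gallery):
    """Return the distinct gallery labels ordered from nearest to farthest.

    `gallery` is a list of (label, feature_vector) pairs.
    """
    ordered = sorted(gallery, key=lambda item: euclidean(probe, item[1]))
    labels = []
    for label, _ in ordered:
        if label not in labels:
            labels.append(label)
    return labels


def cumulative_match_score(probes, gallery, rank):
    """Fraction of probes whose true label appears among the top `rank` matches."""
    hits = sum(1 for label, vec in probes
               if label in ranked_identities(vec, gallery)[:rank])
    return hits / len(probes)


gallery = [("a", [0.0, 0.0]), ("b", [5.0, 5.0]), ("c", [10.0, 0.0])]
probes = [("a", [1.0, 0.0]), ("b", [9.0, 1.0]), ("c", [9.0, 0.0])]
cms_rank1 = cumulative_match_score(probes, gallery, rank=1)
cms_rank2 = cumulative_match_score(probes, gallery, rank=2)
```

In this toy example the second probe is matched correctly only at rank 2, so the score rises from rank 1 to rank 2, which is the behaviour a CMS curve plots.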
Four different methods of testing are adopted to compute the accuracies: directly using Neuro-fuzzy classifier on the feature matrix and the cosine transform coefficient matrix; using Neuro-fuzzy classifier on the feature matrix and the cosine transform coefficient matrix after dimensional reduction using the proposed MSPCA method. The results are as shown in Table 2.
In order to test the robustness of our proposed feature extraction method, we also test our algorithm's performance on K-NN classifier. We adopt similar strategies of testing as in the case of Neuro-fuzzy classifier.
The above results demonstrate the robustness of our method to changes in the direction of motion of the subject in dataset-A. Our method of modeling the lower limbs using spline curves was found to be very effective, even though the direction of the subject's motion changed in two samples. The best accuracy of 95% obtained is promising, and the method itself is quite feasible for recognition.
Two different strategies are used to test the proposed algorithm on CASIA dataset-B. In the first, the Discrete Cosine Transform is applied on the feature matrix, and the coefficient matrix acquired undergoes training and testing using the Neuro-fuzzy classifier. In the second, dimensional reduction of the cosine transform coefficient matrix is done by MSPCA, and the result is then fed to the Neuro-fuzzy classifier.
The results on dataset-B, presented in Table 3, remain convincing even when covariate features are considered, in which the subject is either carrying a bag or wearing a bulky coat.
This shows that our method is robust to these covariate features, and the best accuracy of 91.2% acquired is in itself quite feasible for recognition, considering that covariate features are taken into account.
The consistent CMS for all the 6 sets of a subject shows that our method is robust to covariate features of CASIA dataset-B. The best accuracy of 97.1% CMS was obtained for set 7 in which the subject carries a bag, strengthening the claim of our algorithm's robustness to covariate features. The feature, area under the limbs, chosen is clearly insensitive to subject wearing a bulky coat or carrying a bag which makes it much more effective and reliable for recognition.
From the CMS curves plotted, it has been observed that high accuracy is achieved for modest values of rank of the K-NN classifier used. This ensures higher confidence in the classification of the subjects and moderates the error percentage.

Table 4
Recognition method    Best CCR (%)
Su-li [11]            89.7
Chen [12]             95.2
Proposed method       97.1

Comparison
In this section we compare the performance of the proposed method with two recently proposed methods. Table 4 presents the gait based recognition rates of various algorithms proposed recently. Su-li et al [11] proposed a feature extraction method based on fuzzy principal component analysis. They use the CASIA database-A with 20 subjects under consideration. Chen et al [12] proposed a method based on the frame difference energy image. They performed experiments on the CMU Mobo gait database and the CASIA dataset B with 100 subjects under consideration. Note that the numerical accuracies for these two techniques are obtained from CMS curves.
It is evident from the table that the proposed algorithm outperforms the other algorithms in terms of cumulative match scores (CMS). The best CMS of 97.1% was obtained over CASIA dataset B, with 124 subjects under consideration. Our experimental results show that MSPCA performs fairly well even with complications such as the carrying of covariate objects, which is our key interest.

Dynamic time Warping
Dynamic time warping is an algorithm to find an optimal match between two sequences that vary in time or speed. Similarities in walking patterns would be detected, even if in one video the person was walking slowly and if in another he or she were walking more quickly, which makes it ideal for varying walking speed gait recognition. Detailed analysis of Dynamic time warping algorithm can be seen in [13].
Dynamic time warping is very effective even when the sampling rates of two video sequences differ. This classification method, coupled with multi-scale principal component analysis, was found to have a considerable impact in enhancing the accuracy of recognition.

Experimental Results on CASIA-C dataset
Another covariate feature that often leads to misleading results in gait recognition is the walking speed of the subject. To counter this problem we have adopted DTW (Dynamic Time Warping), as mentioned earlier.

[Figure: DTW alignment of an input sequence against a template, showing the optimal path. The DTW distance of each cell is calculated from the DTW distance of its optimal predecessor, which is the predecessor with the smallest DTW distance.]
We perform the experiments with probe sequences from CASIA C dataset, where subject's motion is parallel to image plane. We use Dynamic time warping method to find the optimal path distance between the probe sequence and the reference sequence.
Using DTW, the distances between each frame of the probe sequence and the reference sequence are computed, and the total distance is then defined as the accumulated distance along the optimal distance path, which is also termed the optimal warping path. This distance is used as a metric to compare the similarity between the probe sequence and the gallery sequence.
Having computed the distances between the probe and the reference sequences, the best match decision is taken on the basis of

$$j^{*} = \arg\min_{j} D_{ij},$$

where $D_{ij}$ is the accumulated distance for the $i$-th probe sequence and the $j$-th reference sequence. This means the best match to the test sequence is assumed to be the reference sequence with the least distance (Fig. 9 and Fig. 10).
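The accumulated distance and the least-distance decision can be sketched as below for one-dimensional feature sequences such as the area signal. The absolute difference as the local cost, the three-predecessor step pattern and the function names are assumptions; the paper does not specify them.

```python
def dtw_distance(seq_a, seq_b):
    """Accumulated DTW distance between two 1-D sequences.

    Each cell adds its local cost to the smallest accumulated distance among
    its three predecessors (the optimal predecessor).
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(seq_a[i - 1] - seq_b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def best_match(probe, references):
    """Return the index of the reference sequence with the least DTW distance."""
    distances = [dtw_distance(probe, ref) for ref in references]
    return min(range(len(references)), key=lambda j: distances[j])


# A slow version of the same pattern matches perfectly despite its length.
probe = [1.0, 2.0, 3.0]
references = [[2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]]
match_index = best_match(probe, references)
```

The second reference is the probe played at half speed, which is why its accumulated distance is zero even though the sequence lengths differ.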
After the distance between the two sequences is calculated and the similarity measure is established, a threshold value has to be selected so that sequences with a distance lower than the threshold are ACCEPTED and those with a distance higher than the threshold are REJECTED.
For training, 3 sequences from 50 subjects were considered to determine the rejection threshold. 3 sequences from the 50 subjects are used as the enrollment, and 3 other sequences together with all the sequences of the remaining 103 subjects are taken as probes to establish the FAR (false acceptance rate) and FRR (false rejection rate). The mean FAR and FRR are determined, and the rejection threshold is selected after acquiring the EER (equal error rate). In the test phase, the threshold defined previously is used to decide whether the probe sequence is a match to the reference. In addition to this, cumulative match scores (CMS) are used to assess the performance quantitatively, as in [14]. The CMS value δ corresponding to rank r indicates that a fraction 100δ% of the probes have the real identity among their top r matches. The performance percentages presented in Table 5 and Table 6 are rank 1 CMS values acquired in each test case, which means the closest sequence to the probe sequence is selected from the gallery sequences. As seen from Fig. 9 and Fig. 10, the optimal path in a spectrogram for two similar sequences is almost linear, as opposed to the irregular path in the case of different sequences.
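Threshold selection at the equal error rate can be sketched as follows, given lists of genuine (same subject) and impostor (different subject) distances. The candidate thresholds, the strict-inequality acceptance rule and the function names are assumptions for illustration, and the distances are made up.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a probe is accepted if its distance < threshold."""
    far = sum(1 for d in impostor if d < threshold) / len(impostor)
    frr = sum(1 for d in genuine if d >= threshold) / len(genuine)
    return far, frr


def equal_error_threshold(genuine, impostor):
    """Pick the observed distance at which FAR and FRR are closest to each other."""
    def gap(threshold):
        far, frr = far_frr(genuine, impostor, threshold)
        return abs(far - frr)

    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates, key=gap)


# Hypothetical DTW distances.
genuine = [1.0, 2.0, 3.0, 6.0]
impostor = [4.0, 7.0, 8.0, 9.0]
threshold = equal_error_threshold(genuine, impostor)
far, frr = far_frr(genuine, impostor, threshold)
```

In practice the threshold would be fixed on the training probes and then applied unchanged in the test phase, as described above.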
Experiments were performed with different pairs of sequences under consideration, one of each type from fs, fq and fn as gallery and probe sequences, to evaluate the cross-speed gait recognition performance of the proposed algorithm. The results are shown in Table 5. The proposed gait recognition algorithm achieves high accuracy on within-walking-condition tests. For cross-speed walking conditions, only tests D, F and G achieve good accuracies, because the normal walking sequences are still a close match to the fast walking and slow walking sequences. Moderate accuracies are achieved in test cases E, H and I, as walking speed, an important factor that can significantly vary the walking patterns, comes into the picture. The proposed algorithm could still achieve fairly good results in tests with varying walking speed. The CMS curves for the above scenarios are shown in Fig. 11.
The proposed gait recognition algorithm is also tested on the remaining two sequences with the subject carrying a bag. The results of the evaluation involving sequences fb are shown in Table 6. High accuracies are achieved in all 4 tests J, K, L and M, in which one of the sequences has the subject carrying a bag. The CMS curves for the above scenarios are shown in Fig. 12.
The fairly good results show that the proposed method of spline curve modeling is insensitive to the carrying of covariate objects, as it involves modeling of only the lower limbs. The promising results of the cross-speed comparison between fq and fb indicate that the method is invariant to speed variations as well. Table 7 shows the comparison of the proposed method with the Tan [14] and WBP [15] approaches on the CASIA C database. Note that the numerical accuracies for these two techniques are obtained from CMS curves. For completeness, the values of FAR and FRR are evaluated as 2.37% and 3.07% respectively. Although the proposed algorithm is on par with the other two methods in the first two cases, it significantly outperforms them in the last two cases. The last case, with the subject carrying a bag, is highly accurate compared with the other two methods, showing that the proposed method is insensitive to the carrying of covariate objects.

Conclusion
In this paper, we propose a novel method for gait recognition based on modeling of the limbs using spline curves. The area signals obtained after feature matrix construction are compared. With the help of MSPCA, the components of the feature matrix are projected into a lower dimensional space. MSPCA retains the information of the original data better than traditional PCA, even when the data sequence changes over time or frequency. Neuro-Fuzzy and K-NN classifiers are used for classification of the feature vectors for subjects from CASIA datasets A and B. DTW is adopted to classify the subjects in CASIA dataset C, so as to reduce the sensitivity of recognition to variations in walking speed. The experimental results demonstrate the insensitivity of our method to covariate features like the subject's walking speed and the subject carrying a bag or wearing a thick coat.
Reducing the sensitivity of gait recognition to the above mentioned covariate features was the main concern of the proposed method. The results obtained on a large database like CASIA with covariate features indicate the feasibility of our method.