Hierarchical Hidden Markov Model for Finger Language Recognition

Finger language is the part of sign language that expresses vowels and consonants with hand gestures. Korean finger language has 31 gestures, and each of them requires many trained models for accurate recognition. When the number of trained models is large, searching them takes a long time, so a real-time recognition system must focus on reducing the search space. To address these problems, this paper proposes a hierarchical HMM structure that effectively reduces the search space without lowering the recognition rate. The Korean finger language gestures are divided into three categories according to the direction of the wrist, and models are searched only within the matching category. This pre-classification helps distinguish similar Korean finger language gestures and allows the search space to be managed effectively, so the proposed method can be applied to real-time recognition systems. Experimental results demonstrate that the proposed method reduces the matching time by a factor of about three compared with the general HMM recognition method.


Introduction
Research on HCI (Human-Computer Interaction) is being actively pursued internationally. To make interaction between humans and computers more natural and intelligent, visual languages such as sign language (SL) and finger language (FL) have become important topics, and there have been many attempts to build SL and FL recognition systems. Manjula (1) achieved an average recognition rate of 84% on 14 SL gestures with a recognition system based on an artificial neural network. Yamaguchi (2) studied a system that extracts 16 Japanese SL features using a data glove and images, recognizing 34 of the 46 FL gestures with an average recognition rate of 82%. Seun-Ki Min and Hee-Deok Yang (3) studied recognition using a data glove, as well as methods that recognize SL and FL from images (4~5). Two main problems remain to be solved in these SL and FL recognition systems. The first is reducing the mis-recognition rate on similar gestures. The second, for real-time recognition, is the need to manage the search over the DB models efficiently. To solve these problems, this paper proposes a Korean FL HMM recognition system with pre-classification, which acquires 3D hand data using Leap Motion (6) without preprocessing or wearable equipment.
Fig. 1. The proposed Korean finger language recognition system.

Hierarchical Korean finger language recognition system

Target gestures
Korean has 40 phonemes in total, consisting of basic consonants, complex consonants, basic vowels, and complex vowels. Korean FL has 31 gestures, but the Tenchijin Keyboard used on mobile devices shows that Korean can be entered with basic consonants and vowels alone. For accurate recognition and efficient management of system resources, this paper therefore applies the basic consonants and vowels of the Tenchijin Keyboard and recognizes 11 FL gestures (Fig. 2).

Feature extraction
According to a recent study (7), the features used in gesture recognition are categorized into the location, angle, and velocity of an object. When each feature was applied to a gesture recognition system, the recognition rates were 46% for location, 87% for angle, and 32% for velocity. The angle of the object is therefore the most effective feature for gesture recognition, and, following that study, this paper uses finger angles as the features. Leap Motion provides a variety of data about the hands; this paper uses the finger positions and the hand rotation data. The differentials of the finger positions are calculated and converted into a chain code of 27 directions. Adding a non-tracking state for fingers that are not detected gives the chain code a total of 28 values. The acquired chain codes form a five-dimensional vector as below.
where C_thumb denotes the chain code of the thumb, and the other components are defined likewise for the remaining fingers. Leap Motion describes the direction of the wrist with pitch, roll, and yaw. The proposed gesture recognition system uses the pitch value, which is the angle between the negative z-axis and the projection of the hand direction onto the y-z plane.
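The chain-code feature described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the numbering of the 27 directions (a base-3 encoding of the sign of each differential component), the dead-zone threshold `eps`, and the value used for the non-tracking code are all hypothetical, since the paper does not specify them.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]

NON_TRACKING = 27  # hypothetical code for a finger that is not tracked


def _sign(value: float, eps: float) -> int:
    """Quantize one differential component to -1, 0, or +1."""
    if value > eps:
        return 1
    if value < -eps:
        return -1
    return 0


def chain_code(prev: Optional[Vec3], curr: Optional[Vec3], eps: float = 1.0) -> int:
    """Map one fingertip's movement between two frames to a code in 0..27."""
    if prev is None or curr is None:
        return NON_TRACKING
    sx, sy, sz = (_sign(c - p, eps) for p, c in zip(prev, curr))
    # Base-3 encoding of the three signs yields 3 * 3 * 3 = 27 codes (0..26).
    return (sx + 1) * 9 + (sy + 1) * 3 + (sz + 1)


def feature_vector(
    prev_tips: Sequence[Optional[Vec3]], curr_tips: Sequence[Optional[Vec3]]
) -> Tuple[int, ...]:
    """Five-dimensional vector with one chain code per finger, thumb first."""
    return tuple(chain_code(p, c) for p, c in zip(prev_tips, curr_tips))
```

With this encoding, a finger that does not move maps to code 13 (all three signs zero), and each frame pair yields one five-component observation.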

Using HMM recognition
Simple Markov models cannot accurately model complex situations. Through many years of research on Markov models, the HMM (8) was developed. The HMM is a doubly stochastic model that is used to recognize patterns such as handwriting, voice, and gestures. An object pattern is expressed as an observation sequence, which is analyzed as a string of symbols and learned as a probabilistic model; the observation probability of an input signal is then calculated from that model. Training creates the probabilistic model that has the maximum probability for the observation sequences.
Fig. 3. The 27 directions of the three-dimensional chain code.
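The observation probability mentioned above is commonly computed with the forward algorithm. The sketch below shows that computation for a discrete HMM; it assumes that observations are integer symbol indices and that the model is given as plain lists `pi`, `A`, and `B`, which are illustrative names rather than anything defined in the paper.

```python
from typing import Sequence


def forward_probability(
    pi: Sequence[float],
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
    obs: Sequence[int],
) -> float:
    """Return P(obs | model) for a discrete HMM.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(pi)
    # Initialization: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: extend every partial path by one transition and one emission.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
            for j in range(n)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)
```

For long sequences the probabilities underflow, so practical systems usually scale `alpha` at each step or work with log-probabilities.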

Pre-classification
The proposed method divides the Korean FL gestures into three categories according to the direction of the wrist, as shown in Fig. 4. The category to be searched is selected from the pitch value of the wrist and the reference models. There are two reasons for this pre-classification. First, a general HMM searches all reference models for each input, whereas the proposed method searches only a subset of the models and thus reduces the search space. Second, some Korean FL characters are hard to recognize because their gestures are similar. For example, 'ㄷ-ㅌ' and 'ㅅ-ㅎ' can be regarded as the same gestures except for the direction of the wrist. By categorizing the DB models, the pre-classification reduces mis-recognition of similar gestures, and the recognition rate consequently increases.
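A minimal sketch of the pre-classification step is given below. The pitch boundaries, the category names, and the `score` callback are assumptions made for illustration, since the paper does not list its thresholds. The pitch formula follows the usual convention of measuring the angle from the negative z-axis within the y-z plane.

```python
import math
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical pitch boundaries in degrees; the paper does not list them.
PITCH_BOUNDS = (-30.0, 30.0)


def pitch_degrees(direction: Tuple[float, float, float]) -> float:
    """Angle between the negative z-axis and the y-z projection of `direction`."""
    _, y, z = direction
    return math.degrees(math.atan2(y, -z))


def wrist_category(pitch: float) -> str:
    """Assign one of three hypothetical wrist-direction categories."""
    low, high = PITCH_BOUNDS
    if pitch < low:
        return "down"
    if pitch > high:
        return "up"
    return "forward"


def recognize(
    pitch: float,
    obs: Sequence[int],
    models_by_category: Dict[str, List[Tuple[str, Any]]],
    score: Callable[[Any, Sequence[int]], float],
) -> Optional[str]:
    """Score only the models in the category selected by the wrist pitch."""
    candidates = models_by_category.get(wrist_category(pitch), [])
    if not candidates:
        return None
    best_label, _ = max(candidates, key=lambda item: score(item[1], obs))
    return best_label
```

Here `score` would typically be an HMM likelihood function; models outside the selected category are never evaluated, which is where the reduction in search space comes from.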

Gesture spotting
Because the start and end points of a gesture cannot be known in advance in real time, gesture spotting is applied under the following conditions (7). When the hand is moving (H-M), this is regarded as the start of a gesture and the input data are saved, as shown in Fig. 5. A grabbing hand (GH) is defined as the end of the gesture. If the GH state persists for several frames, the input data are compared with the DB models, and the state returns to idle (NH).
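The spotting rule above can be read as a small state machine. The sketch below is one possible interpretation: the class name, the per-frame `moving` and `grabbing` flags, and the default of three held frames are assumptions, not values given in the paper.

```python
from typing import Any, List, Optional


class GestureSpotter:
    """Segment a frame stream into gestures using the NH / HM / GH states."""

    def __init__(self, hold_frames: int = 3) -> None:
        self.hold_frames = hold_frames  # hypothetical number of GH frames required
        self.state = "NH"               # NH: idle, HM: hand moving, GH: grabbing hand
        self.buffer: List[Any] = []
        self.grab_count = 0

    def _reset(self) -> None:
        self.state = "NH"
        self.buffer = []
        self.grab_count = 0

    def update(self, moving: bool, grabbing: bool, feature: Any) -> Optional[List[Any]]:
        """Process one frame; return the recorded gesture when it ends, else None."""
        if grabbing and self.state in ("HM", "GH"):
            self.state = "GH"
            self.grab_count += 1
            if self.grab_count >= self.hold_frames:
                segment = self.buffer
                self._reset()  # back to idle after handing the segment to matching
                return segment
            return None
        if moving:
            # Movement starts or continues a gesture, so record the input.
            self.state = "HM"
            self.grab_count = 0
            self.buffer.append(feature)
            return None
        if self.state == "GH":
            # The grab was released too early: keep the gesture open.
            self.state = "HM"
            self.grab_count = 0
        return None
```

The returned segment is the observation sequence that would then be compared with the DB models.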

Experiment
Table 1 shows the recognition rates of the proposed method (H-HMM) and the general HMM for each number of states. According to Table 1, there is a large difference between the general HMM and H-HMM. The general HMM recognition method has no pre-classification, so the input data are compared with every model and the mis-recognition rate increases. The difference in recognition rate is large for gestures that are similar apart from the direction of the wrist ("ㄴ,ㄹ", "ㄱ,ㅋ", "ㅡ" and "ㅣ", characters that differ by an additional stroke, etc.) and small for distinct gestures such as "ㅇ,ㅁ" and "ㅂ,ㅍ". In addition, varying the number of states in the HMM-based Korean FL recognition system shows that the highest recognition rate is obtained at a particular number of states (Table 1). If the number of states in the HMM structure is too small, the recognition rate declines. Increasing the number of states further changes the recognition rate only slightly while increasing the matching time. Table 2 shows the search speed over the DB models for each method. Because it uses pre-classification, the proposed method takes two to three times less time than the general HMM recognition method. The matching time within the DB also increases with the number of states for each method.
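A rough cost model helps explain both trends. Scoring one model with the forward algorithm takes on the order of N² · T steps for N states and T frames, so matching time grows with the number of states, and the total search cost is proportional to the number of models scored. The sketch below uses hypothetical numbers (5 states, 30 frames, and a category holding 4 of the 11 models); none of these values are taken from the paper's tables.

```python
def forward_cost(n_states: int, seq_len: int) -> int:
    """Approximate number of multiply-add steps to score one model."""
    return n_states * n_states * seq_len


def search_cost(n_models: int, n_states: int, seq_len: int) -> int:
    """Approximate cost of scoring every model in the searched set."""
    return n_models * forward_cost(n_states, seq_len)


# Hypothetical setting: 11 gesture models, 5 states, 30-frame input.
full_search = search_cost(11, 5, 30)      # general HMM: all models are scored
category_search = search_cost(4, 5, 30)   # H-HMM: only one category of 4 models
speedup = full_search / category_search
```

Under these assumed sizes the ratio is 11 / 4 = 2.75, which is of the same order as the two-to-three-times reduction reported for the proposed method.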

Conclusions
In this paper, we propose an HMM recognition system that recognizes 11 kinds of finger gestures using two kinds of information: wrist direction and finger position. The main goals of this paper are to reduce the mis-recognition rate of the HMM gesture recognition system and to match Korean FL gestures efficiently by categorizing them according to wrist direction. To show the superiority of the proposed method, comparative experiments were conducted against a general HMM recognition system. With the optimal number of states, the recognition rate of the proposed method is 90% on average, which is 10% higher than that of the general HMM system. The proposed method is also two to three times faster than the general HMM. If gesture spotting can be performed more naturally in real time, we expect that a robust recognition system for practical applications can be built.

Fig. 4. Korean finger language classified by the direction of a wrist.

Table 2. DB model matching speed.