Audio-visual speech recognition (AVSR) has become a research trend in recent years, motivated by the limitations of automatic speech recognition (ASR). With the aid of the visual signal, AVSR outperforms ASR under certain adverse conditions such as noisy environments. The key element of a well-performing AVSR system is the capability of its front-end lip detection. Instead of going through the conventional face detection process before lip detection and localization, this paper presents a direct lip detection technique using colour feature clustering, without the need for prior face detection. A cubic spline interpolant of the lip colour boundary is used in the direct lip detection process. The detected lips are then passed to a Kalman filter-based tracking system to estimate the subsequent appearance of the lips. The feature coefficients extracted from the visual and audio signals are recognized separately using two independent hidden Markov models (HMMs), and the final AVSR result is produced after integrating both systems. Simulation results show good performance of the proposed method.
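The final integration step, in which two independently scored streams are combined into one decision, can be illustrated with a minimal sketch. The abstract does not state which integration rule the paper uses, so the weighted sum of log-likelihoods below is an assumption chosen for illustration, not the authors' method. The names `fuse_scores`, `recognize`, and `audio_weight`, and all score values, are hypothetical.

```python
def fuse_scores(audio_loglik, visual_loglik, audio_weight=0.7):
    """Combine per-word log-likelihoods from two recognizers by a weighted sum.

    audio_loglik / visual_loglik: dicts mapping a candidate word to the
    log-likelihood its HMM assigned to the observed features.
    audio_weight: assumed stream weight in [0, 1]; the visual stream
    receives the remainder.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    # Only words scored by both streams can be compared.
    words = audio_loglik.keys() & visual_loglik.keys()
    return {
        w: audio_weight * audio_loglik[w] + (1.0 - audio_weight) * visual_loglik[w]
        for w in words
    }


def recognize(audio_loglik, visual_loglik, audio_weight=0.7):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_loglik, visual_loglik, audio_weight)
    if not fused:
        raise ValueError("no word was scored by both streams")
    return max(fused, key=fused.get)


# Hypothetical scores: noisy audio slightly prefers "no",
# while the visual stream clearly prefers "yes".
audio = {"yes": -42.0, "no": -40.0}
visual = {"yes": -10.0, "no": -18.0}

audio_only = recognize(audio, visual, audio_weight=1.0)
fused_result = recognize(audio, visual, audio_weight=0.7)
```

With these made-up scores, the audio stream alone picks "no", while the fused decision picks "yes". In practice the stream weight would typically be tuned to the acoustic noise level; that choice is outside what the abstract specifies.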
|Title of host publication||2008 International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2008|
|Publication status||Published - 01 Dec 2008|
|Event||2008 International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2008 - Bangkok, Thailand|
Duration: 08 Feb 2009 → 11 Feb 2009