Lips detection for audio-visual speech recognition system

Siew Wen Chin, Li Minn Ang, Kah Phooi Seng

    Research output: Book chapter/Published conference paper; Conference paper; peer-review

    6 Citations (Scopus)

    Abstract

    Audio-visual speech recognition (AVSR) has become a research trend in recent years, motivated by the limitations of automatic speech recognition (ASR). With the aid of the visual signal, AVSR outperforms ASR under certain adverse conditions such as noisy environments. The key element of a well-performing AVSR system is the capability of its front-end lips detection. Instead of going through the conventional face detection process before lips detection and localization, this paper presents a direct lips detection technique that uses colour feature clustering and needs no prior face detection. A lips colour boundary described by a cubic spline interpolant is used for the direct lips detection process. The detected lips are then passed to a Kalman filter-based tracking system to estimate the subsequent appearance of the lips. The feature coefficients extracted from the visual and audio signals are recognized separately using two independent Hidden Markov Models (HMMs), and the final AVSR recognition result is produced by integrating the outputs of both systems. Simulation results show good performance of the proposed method.
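The tracking stage mentioned in the abstract can be illustrated with a minimal Kalman filter sketch. The abstract does not give the paper's state model or noise settings, so everything below is an assumption made for illustration: a constant-velocity model over the lips centre (x, y), the noise variances, and the hypothetical helper names make_cv_model, kalman_step and track_lips_centre. It uses only NumPy.

```python
import numpy as np


def make_cv_model(dt=1.0, process_var=1e-2, meas_var=1.0):
    """Build matrices for an assumed constant-velocity model.

    State is [x, y, vx, vy]; only the position (x, y) is measured.
    The variances are illustrative values, not taken from the paper.
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # state transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # measurement matrix
    Q = process_var * np.eye(4)                 # process noise covariance
    R = meas_var * np.eye(2)                    # measurement noise covariance
    return F, H, Q, R


def kalman_step(x, P, z, F, H, Q, R):
    """Run one predict/update cycle and also return the predicted state."""
    # Predict where the lips centre should be in the current frame.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the detected lips centre z.
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new, x_pred


def track_lips_centre(detections):
    """Filter a sequence of detected (x, y) lips centres, one per frame.

    Returns the per-frame position predictions (made before each
    detection is seen) and the final state estimate.
    """
    F, H, Q, R = make_cv_model()
    first = np.asarray(detections[0], dtype=float)
    x = np.array([first[0], first[1], 0.0, 0.0])  # start with zero velocity
    P = 10.0 * np.eye(4)                          # large initial uncertainty
    predictions = []
    for z in detections[1:]:
        x, P, x_pred = kalman_step(x, P, np.asarray(z, dtype=float), F, H, Q, R)
        predictions.append(x_pred[:2])
    return np.array(predictions), x


if __name__ == "__main__":
    # Synthetic detections: lips centre drifting 2 pixels per frame in x.
    detections = [(100.0 + 2.0 * k, 50.0) for k in range(30)]
    predictions, final_state = track_lips_centre(detections)
    print("last predicted centre:", predictions[-1])
    print("estimated velocity:", final_state[2:])
```

In a detector-plus-tracker arrangement of this kind, the predicted centre would typically be used to narrow the search region for lips detection in the next frame; whether the paper uses the prediction in exactly this way is not stated in the abstract.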

    Original language: English
    Title of host publication: 2008 International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2008
    Publication status: Published - 01 Dec 2008
    Event: 2008 International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2008 - Bangkok, Thailand
    Duration: 08 Feb 2009 to 11 Feb 2009

    Conference

    Conference: 2008 International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2008
    Country/Territory: Thailand
    City: Bangkok
    Period: 08/02/09 to 11/02/09
