Audio-visual recognition system in compression domain

Yee Wan Wong, Kah Phooi Seng, Li-Minn Ang

    Research output: Contribution to journal › Article › peer-review



    This paper presents a highly efficient audio-visual recognition system that operates in the compression domain. For face recognition, the multiband feature fusion method selects the wavelet subbands that are invariant to illumination and facial expression variations. These subbands are extracted directly from the inverse quantization stage of the compression system. Because the inverse-quantized wavelet coefficients of the video are taken as the input, the inverse wavelet transform, which corresponds to image reconstruction, is omitted. As a result, the computational complexity is lower than that of a conventional video-based face recognition system. We also present a set of new face localization methods that localize the facial wavelet coefficients in the wavelet subband image. The dual optimal multiband feature fusion method is then used to fuse the two sets of wavelet coefficients and generate the visual scores. Experimental results show that, with low computational complexity, the proposed system achieves high recognition accuracy on the UNMC-VIER, CUAVE, and XM2VTS audio-visual databases.
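
    The compression-domain idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-level Haar transform, the uniform quantizer, the choice of the LL and LH subbands, the cosine similarity, and the equal fusion weights are all assumptions made for illustration, and every function name is hypothetical. The only point it shows is that matching uses the inverse-quantized coefficients directly and never calls an inverse wavelet transform.

```python
import math


def haar_1d(values):
    """One-level 1-D Haar step: pairwise averages (low) and differences (high)."""
    pairs = [(values[2 * i], values[2 * i + 1]) for i in range(len(values) // 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def haar_2d(image):
    """One-level 2-D Haar transform of an even-sized image into four subbands.

    Naming used here (conventions vary): first letter = filter along rows,
    second letter = filter along columns.
    """
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)

    def along_columns(matrix):
        low_cols, high_cols = [], []
        for column in transpose(matrix):
            low, high = haar_1d(column)
            low_cols.append(low)
            high_cols.append(high)
        return transpose(low_cols), transpose(high_cols)

    ll, lh = along_columns(row_low)
    hl, hh = along_columns(row_high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


def encode(image, step):
    """Simulated encoder: wavelet transform followed by uniform quantization."""
    return {name: [[round(c / step) for c in row] for row in band]
            for name, band in haar_2d(image).items()}


def inverse_quantize(quantized, step):
    """Decoder stage where the recognizer taps in: coefficients, not pixels."""
    return {name: [[q * step for q in row] for row in band]
            for name, band in quantized.items()}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fused_score(probe, gallery, weights):
    """Weighted sum of per-subband similarities (an assumed fusion rule)."""
    score = 0.0
    for name, weight in weights.items():
        p = [c for row in probe[name] for c in row]
        g = [c for row in gallery[name] for c in row]
        score += weight * cosine(p, g)
    return score


STEP = 0.5                          # assumed quantizer step size
WEIGHTS = {"LL": 0.5, "LH": 0.5}    # placeholder subband selection and weights

probe_image = [[10, 12, 14, 16], [11, 13, 15, 17],
               [20, 22, 24, 26], [21, 23, 25, 27]]
other_image = [[27, 25, 23, 21], [17, 15, 13, 11],
               [26, 24, 22, 20], [16, 14, 12, 10]]

# No inverse wavelet transform appears anywhere on the matching path.
probe = inverse_quantize(encode(probe_image, STEP), STEP)
other = inverse_quantize(encode(other_image, STEP), STEP)

print(fused_score(probe, probe, WEIGHTS))
print(fused_score(probe, other, WEIGHTS))
```

    In a real decoder the `encode` step would be replaced by parsing the compressed bitstream; it appears here only so the sketch is self-contained.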
    Original language: English
    Pages (from-to): 637-646
    Number of pages: 10
    Journal: IEEE Transactions on Circuits and Systems for Video Technology
    Issue number: 5
    Publication status: Published - May 2011

