Abstract
Transmitting entire video and audio sequences over an internal or external network when implementing audio-visual recognition over internet protocol is inefficient, especially when only selected data from those sequences are actually used in the recognition process. Hence, in this paper, we propose an efficient method of implementing audio-visual recognition over internet protocol in which only the extracted audio-visual features are transmitted. To extract robust features from the video sequence, a multiband curvelet-based technique is employed at the client, whereas a late multi-modal fusion scheme using an RBF neural network is employed at the server to perform recognition across both modalities. The proposed audio-visual recognition system is applied to several standard audio-visual databases to showcase its efficiency.
Original language | English |
---|---|
Title of host publication | Signal Processing and Information Technology |
Subtitle of host publication | First International Joint Conference, SPIT 2011 |
Editors | Vinu V. Das, Ezendu Ariwa, Syarifah Bahiyah Rahayu |
Place of Publication | Berlin, Germany |
Publisher | Springer |
Pages | 132-138 |
Number of pages | 7 |
Volume | 62 |
ISBN (Electronic) | 9783642325731 |
ISBN (Print) | 9783642325724 |
Publication status | Published - 2012 |
Event | Signal Processing and Information Technology (SPIT), Amsterdam, Netherlands. Duration: 01 Dec 2011 → 02 Dec 2011. Conference number: 1 |
Conference
Conference | Signal Processing and Information Technology (SPIT) |
---|---|
Country/Territory | Netherlands |
City | Amsterdam |
Period | 01/12/11 → 02/12/11 |