In this chapter, we extend the research work in Wong and enhance the audio-visual recognition system over Internet Protocol. A multiband feature fusion method is presented to address the illumination problem. A radial basis function neural network with a new orthogonal least squares algorithm is then proposed to improve on the generalization of the radial basis function neural network with the conventional orthogonal least squares algorithm. Results show that the proposed neural network achieves higher recognition accuracy with fewer neurons than the neural network with the conventional orthogonal least squares algorithm. With this neural network, the recognition accuracy of the audio-visual recognition system improves on that of the system in Wong. Finally, the audio-visual recognition system over Internet Protocol is developed, implementing both the multiband feature fusion and the proposed neural network.