Content-based Image Retrieval (CBIR) has been an active area of research for retrieving similar images from large repositories without the prerequisite of manual labeling. Most current CBIR algorithms can faithfully return a list of images that matches the visual perspective of their inventors, who might decide to use a certain combination of image features, such as the edges, colors and textures of regions, as well as their spatial distribution, during processing. In practice, however, the retrieved images rarely correspond exactly to the results expected by the users, a problem that has come to be known as the semantic gap. In this paper, we propose a novel and extensible multidimensional approach, called the matrix of visual perspectives, to address this semantic gap. Our approach exploits the dynamic cross-interaction (in other words, mix-and-match) of image features and similarity metrics to produce results that attempt to mimic the mental visual picture of the user. Experimental results from a prototype system that retrieves similar Japanese cultural heritage symbols, called kamons, confirm that the interaction of visual perspectives in the user can be effectively captured and reflected. The benefits of this approach extend beyond kamons: it is equally applicable to the development of CBIR systems for other types of images, whether cultural or noncultural, by adapting to different sets of application-specific image features.
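The mix-and-match of image features and similarity metrics described above can be illustrated with a minimal sketch. The abstract does not specify which features, metrics, or weighting scheme the paper uses, so everything below is an assumption made for illustration: the two toy features (`intensity_histogram`, `row_profile`), the two metrics (`l1_distance`, `cosine_distance`), the helper names `perspective_matrix` and `rank`, and the idea of combining the matrix cells through user-chosen weights are all hypothetical and are not taken from the paper.

```python
from itertools import product
from math import sqrt

# Images are plain 2D lists of grayscale values in 0..255 (an assumption).

def intensity_histogram(img, bins=4):
    """Hypothetical feature: normalized intensity histogram."""
    counts = [0] * bins
    pixels = [p for row in img for p in row]
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def row_profile(img):
    """Hypothetical feature: mean brightness per row (a crude spatial cue)."""
    return [sum(row) / (255 * len(row)) for row in img]

def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as maximally distant.
    return 1.0 - dot / norm if norm else 1.0

FEATURES = {"histogram": intensity_histogram, "row_profile": row_profile}
METRICS = {"l1": l1_distance, "cosine": cosine_distance}

def perspective_matrix(query, candidate):
    """One distance per (feature, metric) pair: every feature crossed with every metric."""
    return {
        (f, m): METRICS[m](FEATURES[f](query), FEATURES[f](candidate))
        for f, m in product(FEATURES, METRICS)
    }

def rank(query, repository, weights):
    """Order repository names by the weighted sum of the selected matrix cells."""
    def score(img):
        cells = perspective_matrix(query, img)
        return sum(weights.get(cell, 0.0) * d for cell, d in cells.items())
    return sorted(repository, key=lambda name: score(repository[name]))

query = [[0, 0], [255, 255]]
repository = {
    "same": [[0, 0], [255, 255]],
    "flipped": [[255, 255], [0, 0]],
    "bright": [[255, 255], [255, 255]],
}
# A user who cares about layout emphasizes the spatial cell ...
by_layout = rank(query, repository, {("row_profile", "l1"): 1.0})
# ... while a user who cares about overall tone emphasizes the histogram cell.
by_tone = rank(query, repository, {("histogram", "l1"): 1.0})
```

In this sketch, changing which cells of the matrix carry weight changes the ranking of the same repository for the same query, which is the kind of user-dependent behavior the abstract attributes to interacting visual perspectives. A real system would substitute application-specific features and metrics for the toy ones used here.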
|Number of pages||31|
|Journal||International Journal of Pattern Recognition and Artificial Intelligence|
|Publication status||Published - Aug 2011|