Content-based image retrieval of cultural heritage symbols by interaction of visual perspectives

Paul W. Kwan, Keisuke Kameyama, Junbin Gao, Kazuo Toraichi

    Research output: Contribution to journal › Article › peer-review


    Content-based image retrieval (CBIR) has been an active area of research for retrieving similar images from large repositories without requiring manual labeling. Most current CBIR algorithms faithfully return a list of images that matches the visual perspective of their inventors, who choose a particular combination of image features, such as the edges, colors and textures of regions and their spatial distribution, to use during processing. In practice, however, the retrieved images rarely correspond exactly to the results that users expect, a problem that has come to be known as the semantic gap. In this paper, we propose a novel and extensible multidimensional approach, called the matrix of visual perspectives, for addressing this semantic gap. Our approach exploits the dynamic cross-interaction (in other words, mix-and-match) of image features and similarity metrics to produce results that attempt to mimic the user's mental visual picture. Experimental results from a prototype system that retrieves similar Japanese cultural heritage symbols, called kamons, confirm that the interaction of visual perspectives in the user can be effectively captured and reflected. The benefits of this approach extend beyond kamons: it is equally applicable to the development of CBIR systems for other types of images, whether cultural or noncultural, by adapting to different sets of application-specific image features.
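
    The mix-and-match of image features and similarity metrics described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy features (`intensity_histogram`, `edge_profile`), the two metrics, the tiny grayscale "images" and the function name `perspective_matrix` are all assumptions made for illustration. Each cell of the resulting matrix pairs one feature with one metric and holds its own ranking of the database.

```python
import math
from itertools import product

# Hypothetical toy features: each maps a grayscale image
# (a 2D list of values in [0, 1]) to a short feature vector.
def intensity_histogram(img, bins=4):
    flat = [p for row in img for p in row]
    hist = [0.0] * bins
    for p in flat:
        hist[min(int(p * bins), bins - 1)] += 1
    return [h / len(flat) for h in hist]

def edge_profile(img):
    # Mean absolute intensity change along rows and along columns.
    horiz = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    vert = [abs(img[r + 1][c] - img[r][c])
            for r in range(len(img) - 1) for c in range(len(img[0]))]
    return [sum(horiz) / len(horiz), sum(vert) / len(vert)]

# Two standard distance functions used as similarity metrics.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

FEATURES = {"histogram": intensity_histogram, "edges": edge_profile}
METRICS = {"euclidean": euclidean, "manhattan": manhattan}

def perspective_matrix(query, database):
    """Rank the database once per (feature, metric) pair.

    Returns a dict mapping (feature name, metric name) to a list of
    image names ordered from most to least similar to the query.
    """
    matrix = {}
    for (fname, feat), (mname, dist) in product(FEATURES.items(), METRICS.items()):
        q = feat(query)
        matrix[(fname, mname)] = sorted(
            database, key=lambda name: dist(q, feat(database[name])))
    return matrix

# Illustrative 4x4 images (assumed data, not from the paper).
DATABASE = {
    "stripes_v": [[0.0, 1.0, 0.0, 1.0] for _ in range(4)],
    "flat_gray": [[0.5] * 4 for _ in range(4)],
    "stripes_h": [[0.0] * 4, [1.0] * 4, [0.0] * 4, [1.0] * 4],
}
QUERY = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]

matrix = perspective_matrix(QUERY, DATABASE)
for cell, ranking in matrix.items():
    print(cell, ranking)
```

    With this toy data the histogram-based cells rank the flat gray image last, while the edge-based cells rank the horizontally striped image last, so different cells of the matrix reflect different visual perspectives on the same query. How the cells are then combined or selected to match a particular user is left open here, since the abstract does not specify it.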
    Original language: English
    Pages (from-to): 643-673
    Number of pages: 31
    Journal: International Journal of Pattern Recognition and Artificial Intelligence
    Issue number: 5
    Publication status: Published - Aug 2011

