Simultaneous object classification and viewpoint estimation using deep multi-task convolutional neural network

Ahmed J. Afifi, Olaf Hellwich, Toufique A. Soomro

    Research output: Book chapter/Published conference paper › Conference paper › peer-review

    4 Citations (Scopus)
    93 Downloads (Pure)

    Abstract

    Convolutional Neural Networks (CNNs) have shown impressive performance in many computer vision tasks. Most CNN architectures were proposed to solve a single task. This paper proposes a CNN model that tackles object classification and viewpoint estimation simultaneously, two problems with opposing requirements for feature representation: the object classification task aims to learn viewpoint-invariant features, whereas the viewpoint estimation task requires features that capture the variations in viewpoint for the same object. The study addresses this conflict with a multi-task CNN architecture in which the first part of the network is shared between the two tasks and the second part consists of two subnetworks, one for each task. Synthetic images are used to enlarge the training dataset. The model is evaluated on the PASCAL3D+ dataset, a challenging dataset for object detection and viewpoint estimation. According to the results, the proposed model performs as a multi-task model, in which the features of the shared layers can be fed to different tasks. Moreover, 3D models can be used to render images under different conditions to compensate for the lack of training data and to enhance the training of CNNs.
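The shared-trunk, two-head layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dense layer standing in for the shared convolutional part, the treatment of viewpoint as a classification over discrete bins, the summed cross-entropy loss, the `view_weight` parameter, and all names and layer sizes below are assumptions made purely for illustration.

```python
import math
import random


def dense(x, w, b):
    """Affine layer; w holds one weight row per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


class MultiTaskNet:
    """Hypothetical stand-in: one shared layer feeding two task heads."""

    def __init__(self, n_in, n_shared, n_classes, n_view_bins, seed=0):
        rng = random.Random(seed)
        # Shared part (a dense layer here; convolutional layers in a real CNN).
        self.shared = init_layer(n_shared, n_in, rng)
        # Task-specific parts: one head per task.
        self.cls_head = init_layer(n_classes, n_shared, rng)
        self.view_head = init_layer(n_view_bins, n_shared, rng)

    def forward(self, x):
        feat = relu(dense(x, *self.shared))  # features used by both heads
        p_cls = softmax(dense(feat, *self.cls_head))
        p_view = softmax(dense(feat, *self.view_head))
        return p_cls, p_view


def joint_loss(p_cls, p_view, y_cls, y_view, view_weight=1.0):
    """Assumed joint objective: sum of the two cross-entropy terms."""
    return -math.log(p_cls[y_cls]) - view_weight * math.log(p_view[y_view])


# Usage with arbitrary sizes: an 8-dimensional input, 12 classes, 24 viewpoint bins.
net = MultiTaskNet(n_in=8, n_shared=16, n_classes=12, n_view_bins=24)
p_cls, p_view = net.forward([0.5] * 8)
loss = joint_loss(p_cls, p_view, y_cls=3, y_view=10)
```

Because both heads read the same `feat` vector, a gradient step on the joint loss would update the shared layer with signals from both tasks, which is the trade-off between viewpoint-invariant and viewpoint-sensitive features that the abstract refers to.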
    Original language: English
    Title of host publication: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
    Subtitle of host publication: Volume 5: VISAPP
    Editors: Francisco Imai, Alain Tremeau, Jose Braz
    Place of Publication: Portugal
    Publisher: Scitepress
    Pages: 177-184
    Number of pages: 8
    Volume: 5
    ISBN (Electronic): 9789897582905
    DOIs
    Publication status: Published - 2018
    Event: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2018 - Madeira Island, Funchal, Portugal
    Duration: 27 Jan 2018 to 29 Jan 2018
    http://www.visapp.visigrapp.org/?y=2018 (Conference page)

    Conference

    Conference: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2018
    Country/Territory: Portugal
    City: Funchal
    Period: 27/01/18 to 29/01/18
    Other: The International Conference on Computer Vision Theory and Applications aims to become a major point of contact between researchers, engineers and practitioners in the area of computer vision application systems. Five simultaneous tracks will be held, covering all different aspects related to computer vision: Image Formation and Preprocessing; Image and Video Analysis and Understanding; Motion, Tracking and Stereo Vision; and Applications and Services.
