Simultaneous Object Classification and Viewpoint Estimation using Deep Multi-task Convolutional Neural Network

Ahmed J. Afifi; Olaf Hellwich; Toufique A. Soomro

doi:10.5220/0006544001770184

2018

Abstract

Convolutional Neural Networks (CNNs) have shown impressive performance in many computer vision tasks, but most CNN architectures were proposed to solve a single task. This paper proposes a multi-task CNN architecture that performs object classification and viewpoint estimation simultaneously, two problems that place opposing demands on the feature representation: object classification aims to learn viewpoint-invariant features, whereas viewpoint estimation requires features that capture how the appearance of the same object varies with viewpoint. The first part of the CNN is shared between the two tasks, and the second part consists of two subnetworks, one for each task. Synthetic images are used to enlarge the training dataset. The model is evaluated on the PASCAL3D+ dataset, a challenging benchmark for object detection and viewpoint estimation. The results indicate that the proposed model works as a multi-task model in which the features of the shared layers feed the different tasks. Moreover, 3D models can be used to render images under different conditions to compensate for the lack of training data and to improve the training of CNNs.
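The shared-then-split layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' network: a single dense layer stands in for the shared convolutional layers, the viewpoint is assumed to be discretized into bins, and the layer sizes, the names `MultiTaskNet` and `joint_loss`, and the `view_weight` parameter are all hypothetical choices made for illustration.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases for an n_in -> n_out layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Shared trunk followed by two task-specific heads (illustrative only)."""

    def __init__(self, n_features, n_hidden, n_classes, n_view_bins, seed=0):
        rng = random.Random(seed)
        # Assumption: one dense layer stands in for the shared conv layers.
        self.shared = init_layer(rng, n_features, n_hidden)
        # Assumption: each subnetwork is reduced to a single output layer.
        self.class_head = init_layer(rng, n_hidden, n_classes)
        self.view_head = init_layer(rng, n_hidden, n_view_bins)

    def forward(self, x):
        h = relu(dense(x, *self.shared))  # features shared by both tasks
        class_probs = softmax(dense(h, *self.class_head))
        view_probs = softmax(dense(h, *self.view_head))
        return class_probs, view_probs


def joint_loss(class_probs, view_probs, class_label, view_label,
               view_weight=1.0):
    """Sum of the two cross-entropy terms; view_weight is a hypothetical knob."""
    class_loss = -math.log(class_probs[class_label])
    view_loss = -math.log(view_probs[view_label])
    return class_loss + view_weight * view_loss


# Placeholder sizes: 12 object classes and 24 viewpoint bins are assumptions.
net = MultiTaskNet(n_features=8, n_hidden=16, n_classes=12, n_view_bins=24)
sample_rng = random.Random(1)
features = [sample_rng.random() for _ in range(8)]
class_probs, view_probs = net.forward(features)
loss = joint_loss(class_probs, view_probs, class_label=3, view_label=10)
```

Because both heads read the same hidden features, a gradient step on the joint loss would update the shared layer from both tasks at once, which is where the tension between viewpoint-invariant and viewpoint-sensitive features arises.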

Paper Citation

in Harvard Style

Afifi A., Hellwich O. and Soomro T. (2018). Simultaneous Object Classification and Viewpoint Estimation using Deep Multi-task Convolutional Neural Network. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 5: VISAPP; ISBN 978-989-758-290-5, SciTePress, pages 177-184. DOI: 10.5220/0006544001770184

in Bibtex Style

@conference{visapp18,
author={Ahmed J. Afifi and Olaf Hellwich and Toufique A. Soomro},
title={Simultaneous Object Classification and Viewpoint Estimation using Deep Multi-task Convolutional Neural Network},
booktitle={Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 5: VISAPP},
year={2018},
pages={177-184},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006544001770184},
isbn={978-989-758-290-5},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 5: VISAPP
TI - Simultaneous Object Classification and Viewpoint Estimation using Deep Multi-task Convolutional Neural Network
SN - 978-989-758-290-5
AU - Afifi A.
AU - Hellwich O.
AU - Soomro T.
PY - 2018
SP - 177
EP - 184
DO - 10.5220/0006544001770184
PB - SciTePress