Evaluating Deep Convolutional Neural Networks for Material Classification

Grigorios Kalliatakis, Georgios Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Anca Sticlaru, Klaus D. McDonald-Maier

Abstract

Determining the material category of a surface from an image is a demanding task in perception that is drawing increasing attention. Following the recent remarkable results achieved for image classification and object detection utilising Convolutional Neural Networks (CNNs), we empirically study material classification of everyday objects employing these techniques. More specifically, we conduct a rigorous evaluation of how state-of-the art CNN architectures compare on a common ground over widely used material databases. Experimental results on three challenging material databases show that the best performing CNN architectures can achieve up to 94.99% mean average precision when classifying materials.

References

  1. Bell, S., Upchurch, P., Snavely, N., and Bala, K. (2015). Material recognition in the wild with the materials in context database. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2015, Boston, MA, USA, June 7-12, 2015, pages 3479-3487.
  2. Chatfield, K., Simonyan, K., Vedaldi, A., and Zisserman, A. (2014). Return of the devil in the details: Delving deep into convolutional nets. In British Machine Vision Conference, BMVC 2014, Nottingham, UK, September 1-5, 2014.
  3. Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., and Vedaldi, A. (2014). Describing textures in the wild. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, June 23-28, 2014, pages 3606-3613.
  4. Csurka, G., Bray, C., Dance, C., and Fan, L. (2004). Visual categorization with bags of keypoints. Workshop on Statistical Learning in Computer Vision, ECCV, pages 1-22.
  5. Dana, K. J., van Ginneken, B., Nayar, S. K., and Koenderink, J. J. (1999). Reflectance and texture of realworld surfaces. ACM Trans. Graph., 18(1):1-34.
  6. Deng, J., Dong, W., Socher, R., Li, L., Li, K., and Li, F. (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20-25 June 2009, Miami, Florida, USA, pages 248-255.
  7. Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., and Darrell, T. (2014). Decaf: A deep convolutional activation feature for generic visual recognition. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, pages 647-655.
  8. Fritz, M., Hayman, E., Caputo, B., and olof Eklundh, J. (2004). THE KTH-TIPS database.
  9. Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  10. Girshick, R. B., Donahue, J., Darrell, T., and Malik, J. (2013). Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR, abs/1311.2524.
  11. Hu, D., Bo, L., and Ren, X. (2011). Toward robust material recognition for everyday objects. In British Machine Vision Conference, BMVC 2011, Dundee, UK, August 29 - September 2, 2011. Proceedings, pages 1-11.
  12. Huang, Y., Huang, K., Yu, Y., and Tan, T. (2011). Salient coding for image classification. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1753-1760. IEEE.
  13. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22Nd ACM International Conference on Multimedia, MM 7814, pages 675-678, New York, NY, USA. ACM.
  14. Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, Lake Tahoe, Nevada, United States., pages 1106-1114.
  15. LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., and Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural computation, 1(4):541-551.
  16. Liu, C., Sharan, L., Adelson, E. H., and Rosenholtz, R. (2010). Exploring features in a bayesian framework for material recognition. In The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13-18 June 2010, pages 239-246.
  17. Oquab, M., Bottou, L., Laptev, I., and Sivic, J. (2014). Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 7814, pages 1717-1724, Washington, DC, USA. IEEE Computer Society.
  18. Razavian, A. S., Azizpour, H., Sullivan, J., and Carlsson, S. (2014). CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 7814, pages 512-519, Washington, DC, USA. IEEE Computer Society.
  19. Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and LeCun, Y. (2013). Overfeat: Integrated recognition, localization and detection using convolutional networks. CoRR, abs/1312.6229.
  20. Sharan, L., Rosenholtz, R., and Adelson, E. (2010). Material perception: What can you see in a brief glance? Journal of Vision, 9(8):784-784a.
  21. Simonyan, K. and Zisserman, A. (2014). Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 568-576.
  22. Varma, M. and Zisserman, A. (2009). A statistical approach to material classification using image patch exemplars. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11):2032-2047.
  23. Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. (2010). Locality-constrained linear coding for image classification. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 3360-3367.
  24. Weinmann, M., Gall, J., and Klein, R. (2014). Material classification based on training data synthesized using a BTF database. In Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part III, pages 156-171.
  25. Zeiler, M. D. and Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I, pages 818-833.
  26. Zheng, S., Cheng, M., Warrell, J., Sturgess, P., Vineet, V., Rother, C., and Torr, P. H. S. (2014). Dense semantic image segmentation with objects and attributes. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, June 23-28, 2014, pages 3214-3221.
Download


Paper Citation


in Harvard Style

Kalliatakis G., Stamatiadis G., Ehsan S., Leonardis A., Gall J., Sticlaru A. and McDonald-Maier K. (2017). Evaluating Deep Convolutional Neural Networks for Material Classification . In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017) ISBN 978-989-758-226-4, pages 346-352. DOI: 10.5220/0006166603460352


in Bibtex Style

@conference{visapp17,
author={Grigorios Kalliatakis and Georgios Stamatiadis and Shoaib Ehsan and Ales Leonardis and Juergen Gall and Anca Sticlaru and Klaus D. McDonald-Maier},
title={Evaluating Deep Convolutional Neural Networks for Material Classification},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017)},
year={2017},
pages={346-352},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006166603460352},
isbn={978-989-758-226-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017)
TI - Evaluating Deep Convolutional Neural Networks for Material Classification
SN - 978-989-758-226-4
AU - Kalliatakis G.
AU - Stamatiadis G.
AU - Ehsan S.
AU - Leonardis A.
AU - Gall J.
AU - Sticlaru A.
AU - McDonald-Maier K.
PY - 2017
SP - 346
EP - 352
DO - 10.5220/0006166603460352