Feature-augmented Trained Models for 6DOF Object Recognition and Camera Calibration

Kripasindhu Sarkar, Alain Pagani, Didier Stricker

2016

Abstract

In this paper we address the offline stage of 3D modelling in feature-based object recognition. While the online stage of recognition (feature matching and pose estimation) has been refined repeatedly over the past decade, incorporating filters and heuristics for robust and scalable recognition, the offline stage of creating feature-based models has remained largely unchanged. In this work we take advantage of readily available 3D scanners and 3D model databases such as 3D Warehouse, and use them as our source of 3D CAD models of real objects. We process the CAD models to produce feature-augmented trained models that can be used by any online stage of object recognition. These trained models can also be used directly as a calibration rig for performing camera calibration from a single image. The evaluation shows that our fully automatically created feature-augmented trained models outperform the baseline, the tedious manual way of creating feature models, in terms of recognition recall. When used as a calibration rig, our feature-augmented models achieve accuracy comparable to popular camera-calibration techniques, making them an easy and quick way of performing camera calibration.
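The abstract's claim that a feature-augmented model can serve as a calibration rig from a single image rests on a classical fact: given enough 2D-3D correspondences (here, matches between image features and the model's 3D feature points), the full camera projection matrix can be recovered. The sketch below illustrates this with the Direct Linear Transform on synthetic data; it is not the paper's implementation, and the point values and the `dlt_projection_matrix` helper are purely illustrative.

```python
import numpy as np

def dlt_projection_matrix(X, x):
    """Estimate the 3x4 projection matrix P from n >= 6 non-coplanar
    3D points X (n, 3) and their 2D image projections x (n, 2)."""
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        # Each correspondence contributes two rows of the homogeneous system A p = 0.
        A.append([Xw, Yw, Zw, 1, 0, 0, 0, 0, -u * Xw, -u * Yw, -u * Zw, -u])
        A.append([0, 0, 0, 0, Xw, Yw, Zw, 1, -v * Xw, -v * Yw, -v * Zw, -v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)

# Synthetic example: a known camera observing random 3D points.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([[0.0], [0.0], [2.0]])
P_true = K @ np.hstack([R, t])

X = rng.uniform(-1.0, 1.0, size=(12, 3))
Xh = np.hstack([X, np.ones((12, 1))])          # homogeneous 3D points
xh = (P_true @ Xh.T).T
x = xh[:, :2] / xh[:, 2:3]                     # projected 2D points

P_est = dlt_projection_matrix(X, x)
P_est /= P_est[-1, -1]                         # fix the arbitrary DLT scale
print(np.allclose(P_est, P_true / P_true[-1, -1]))
```

In the paper's setting the 2D points would come from descriptor matching against the trained model rather than from synthetic projection, and the estimate would typically be refined with outlier rejection and decomposed into intrinsics K and pose [R|t].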



Paper Citation


in Harvard Style

Sarkar K., Pagani A. and Stricker D. (2016). Feature-augmented Trained Models for 6DOF Object Recognition and Camera Calibration. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 632-640. DOI: 10.5220/0005781106320640


in Bibtex Style

@conference{visapp16,
author={Kripasindhu Sarkar and Alain Pagani and Didier Stricker},
title={Feature-augmented Trained Models for 6DOF Object Recognition and Camera Calibration},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={632-640},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005781106320640},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - Feature-augmented Trained Models for 6DOF Object Recognition and Camera Calibration
SN - 978-989-758-175-5
AU - Sarkar K.
AU - Pagani A.
AU - Stricker D.
PY - 2016
SP - 632
EP - 640
DO - 10.5220/0005781106320640