A Turntable-based Approach for Ground Truth Tracking Data Generation

Zoltán Pusztai, Levente Hajder

2016

Abstract

Quantitative evaluation of feature trackers can lead to significant improvements in accuracy. Several ground truth databases are widely used in the field; one of the most popular is the Middlebury database for comparing optical flow algorithms. However, that database does not contain rotating 3D objects. This paper proposes a turntable-based approach that fills this gap. The key challenge is to calibrate the applied camera, projector, and turntable very accurately. We show that this is possible even if only a simple chessboard plane is used for the calibration. The proposed approach is validated on 3D reconstruction and on ground truth tracking data generation for real-world objects.
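
The calibration discussed above relies only on a chessboard plane. As an illustration, and not the authors' actual pipeline, the sketch below shows how chessboard-based intrinsic camera calibration is commonly done with OpenCV; the board size, square size, and image folder are assumptions introduced here.

import glob
import cv2
import numpy as np

pattern_size = (9, 6)   # inner chessboard corners per row/column (assumed)
square_size = 20.0      # square edge length in mm (assumed)

# 3D corner coordinates in the board plane (Z = 0), scaled by the square size.
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

obj_points, img_points, image_size = [], [], None
for path in glob.glob("calib_images/*.png"):   # hypothetical input folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if not found:
        continue
    # Refine the detected corner locations to sub-pixel accuracy.
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_points.append(objp)
    img_points.append(corners)
    image_size = gray.shape[::-1]

# Zhang-style planar calibration: recovers the camera matrix and lens distortion.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("RMS reprojection error:", rms)
print("Camera matrix:\n", K)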



Paper Citation


in Harvard Style

Pusztai Z. and Hajder L. (2016). A Turntable-based Approach for Ground Truth Tracking Data Generation. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 498-509. DOI: 10.5220/0005719404980509


in BibTeX Style

@conference{visapp16,
author={Zoltán Pusztai and Levente Hajder},
title={A Turntable-based Approach for Ground Truth Tracking Data Generation},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={498-509},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005719404980509},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP, (VISIGRAPP 2016)
TI - A Turntable-based Approach for Ground Truth Tracking Data Generation
SN - 978-989-758-175-5
AU - Pusztai Z.
AU - Hajder L.
PY - 2016
SP - 498
EP - 509
DO - 10.5220/0005719404980509