Selim Benhimane, Hesam Najafi, Matthias Grundmann, Yakup Genc, Nassir Navab, Ezio Malis



Real-time tracking of complex 3D objects is a challenging task in industrial applications, where robustness, accuracy and run-time performance are all critical. This paper presents a fully automated object tracking system that overcomes some of the problems encountered in industrial environments. It does so by combining a real-time tracking system with a fast object detection system that handles automatic initialization and re-initialization at run-time. The combination provides the robustness of object detection together with the accuracy and speed of recursive tracking. For initialization, a compact representation of the object of interest is built with statistical learning techniques during an off-line learning phase; at run-time, speed and reliability are achieved by imposing geometric and photometric consistency constraints. The proposed tracking system is based on a novel template management algorithm that is incorporated into the ESM (efficient second-order minimization) algorithm. Experimental results demonstrate robust, high-precision tracking of complex, poorly textured industrial machines under severe illumination conditions.
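
The interplay between detection and recursive tracking described in the abstract can be illustrated with a minimal control-loop sketch. This is not the authors' implementation: the function name `detect_and_track`, the `detect` and `track` callables, and the convention that `None` signals failure are all assumptions made for illustration. The sketch only shows the switching logic, in which the detector initializes the pose and is called again whenever the tracker reports a failure.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
Pose = TypeVar("Pose")


def detect_and_track(
    frames: Iterable[Frame],
    detect: Callable[[Frame], Optional[Pose]],
    track: Callable[[Frame, Pose], Optional[Pose]],
) -> Iterator[Tuple[str, Optional[Pose]]]:
    """Yield (status, pose) per frame; status is 'tracked', 'detected' or 'lost'.

    Hypothetical interface: `detect` and `track` return None on failure.
    """
    pose: Optional[Pose] = None
    for frame in frames:
        if pose is not None:
            # Recursive tracking: refine the pose from the previous frame.
            pose = track(frame, pose)
            if pose is not None:
                yield "tracked", pose
                continue
        # No valid pose (start-up or tracking failure): (re-)initialize
        # with the detector on the same frame.
        pose = detect(frame)
        yield ("detected" if pose is not None else "lost"), pose
```

In this sketch the detector runs only when no valid pose is available, so the faster recursive tracker handles the common case, and a tracking failure triggers re-initialization on the same frame.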


Paper Citation

in Harvard Style

Benhimane S., Najafi H., Grundmann M., Genc Y., Navab N. and Malis E. (2008). REAL-TIME OBJECT DETECTION AND TRACKING FOR INDUSTRIAL APPLICATIONS. In Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008), ISBN 978-989-8111-21-0, pages 337-345. DOI: 10.5220/0001074903370345

in Bibtex Style

author={Selim Benhimane and Hesam Najafi and Matthias Grundmann and Yakup Genc and Nassir Navab and Ezio Malis},
booktitle={Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)},

in EndNote Style

JO - Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)
SN - 978-989-8111-21-0
AU - Benhimane S.
AU - Najafi H.
AU - Grundmann M.
AU - Genc Y.
AU - Navab N.
AU - Malis E.
PY - 2008
SP - 337
EP - 345
DO - 10.5220/0001074903370345