Sion Hannuna, Xianghua Xie, Majid Mirmehdi, Neill Campbell



We propose a robust approach to annotating independently moving objects captured by head-mounted stereo cameras worn by an ambulatory (and visually impaired) user. Initially, sparse optical flow is extracted from a single image stream, in tandem with dense depth maps. Then, assuming that the apparent movement generated by camera egomotion is dominant, flow corresponding to independently moving objects (IMOs) is robustly segmented using MLESAC. Next, the mode depth of the feature points defining this flow (the foreground) is obtained by aligning them with the depth maps. Finally, a bounding box is scaled proportionally to this mode depth and robustly fit to the foreground points such that the number of inliers is maximised.
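The last two steps of the abstract (mode depth of the foreground points, then a depth-scaled bounding box placed to maximise inliers) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the histogram bin width, the reference box size and reference depth are all hypothetical, and the inverse relation between box size and depth is an assumption based on a pinhole camera model rather than something stated in the abstract. The box search simply tries each foreground point as a candidate centre.

```python
import numpy as np


def mode_depth(points, depth_map, bin_width=0.25):
    """Return the centre of the most populated depth bin at the given (x, y) pixels.

    bin_width is a hypothetical parameter, not taken from the paper.
    """
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    depths = depth_map[ys, xs]
    depths = depths[np.isfinite(depths) & (depths > 0)]  # drop invalid depths
    edges = np.arange(depths.min(), depths.max() + 2 * bin_width, bin_width)
    counts, edges = np.histogram(depths, bins=edges)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])


def fit_box(points, box_w, box_h):
    """Centre a fixed-size box on each point in turn and keep the placement
    that encloses the most points (the inliers)."""
    half = np.array([box_w, box_h]) / 2.0
    # offsets[i, j] = per-axis distance from candidate centre i to point j
    offsets = np.abs(points[None, :, :] - points[:, None, :])
    inside = np.all(offsets <= half, axis=2)
    best = int(np.argmax(inside.sum(axis=1)))
    cx, cy = points[best]
    box = (cx - half[0], cy - half[1], box_w, box_h)  # (x, y, width, height)
    return box, inside[best]


def annotate(points, depth_map, ref_size=(80.0, 160.0), ref_depth=2.0):
    """Scale a reference box by the foreground mode depth, then fit it.

    ASSUMPTION: apparent size varies as 1 / depth (pinhole model), so a box
    of ref_size pixels at ref_depth is rescaled by ref_depth / mode depth.
    """
    points = np.asarray(points, dtype=float)
    d = mode_depth(points, depth_map)
    scale = ref_depth / d
    box, inliers = fit_box(points, ref_size[0] * scale, ref_size[1] * scale)
    return d, box, inliers
```

Calling `annotate(points, depth_map)` with an N x 2 array of foreground feature positions and a dense depth map returns the mode depth, the box as `(x, y, width, height)`, and a boolean inlier mask. Trying every point as a centre costs O(N^2), which is acceptable for sparse features but would need a coarser search for dense input.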



Paper Citation

in Harvard Style

Hannuna S., Xie X., Mirmehdi M. and Campbell N. (2009). GENERIC MOTION BASED OBJECT SEGMENTATION FOR ASSISTED NAVIGATION. In Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009) ISBN 978-989-8111-69-2, pages 450-457. DOI: 10.5220/0001785704500457

in Bibtex Style

author={Sion Hannuna and Xianghua Xie and Majid Mirmehdi and Neill Campbell},
booktitle={Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009)},

in EndNote Style

JO - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009)
SN - 978-989-8111-69-2
AU - Hannuna S.
AU - Xie X.
AU - Mirmehdi M.
AU - Campbell N.
PY - 2009
SP - 450
EP - 457
DO - 10.5220/0001785704500457