An Integrated System based on Binocular Learned Receptive Fields for Saccade-vergence on Visually Salient Targets

Daniele Re, Agostino Gibaldi, Silvio P. Sabatini, Michael W. Spratling

Abstract

The human visual system uses saccadic and vergence eye movements to foveate interesting objects with both eyes, and thus to explore the visual scene. To mimic this biological behavior in active vision, we propose a bio-inspired integrated system able to learn a functional sensory representation of the environment, together with the motor commands for binocular eye coordination, directly by interacting with the environment itself. Rather than sequentially combining different functionalities, the proposed architecture is a robust integration of different modules that rely on a front-end of learned binocular receptive fields to specialize on different sub-tasks. The resulting modular architecture is able to detect salient targets in the scene and perform precise binocular saccadic and vergence movements on them. The performance of the proposed approach has been tested on the iCub Simulator, providing a quantitative evaluation of the computational potential of the learned sensory and motor resources.


Paper Citation


in Harvard Style

Re D., Gibaldi A., Sabatini S. P. and Spratling M. W. (2017). An Integrated System based on Binocular Learned Receptive Fields for Saccade-vergence on Visually Salient Targets. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 6: VISAPP, (VISIGRAPP 2017) ISBN 978-989-758-227-1, pages 204-215. DOI: 10.5220/0006124702040215


in Bibtex Style

@conference{visapp17,
author={Daniele Re and Agostino Gibaldi and Silvio P. Sabatini and Michael W. Spratling},
title={An Integrated System based on Binocular Learned Receptive Fields for Saccade-vergence on Visually Salient Targets},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 6: VISAPP, (VISIGRAPP 2017)},
year={2017},
pages={204-215},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006124702040215},
isbn={978-989-758-227-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 6: VISAPP, (VISIGRAPP 2017)
TI - An Integrated System based on Binocular Learned Receptive Fields for Saccade-vergence on Visually Salient Targets
SN - 978-989-758-227-1
AU - Re D.
AU - Gibaldi A.
AU - Sabatini S. P.
AU - Spratling M. W.
PY - 2017
SP - 204
EP - 215
DO - 10.5220/0006124702040215