A Benchmark of Computational Models of Saliency to Predict Human Fixations in Videos

Shoaib Azam, Syed Omer Gilani, Moongu Jeon, Rehan Yousaf, Jeong Bae Kim

Abstract

In many applications of computer graphics, design, robotics, and computer vision, there is a need to predict where humans look in a scene. This remains a challenging task, since it is still not certain how the human visual system works. A number of computational models, based on different approaches, have been designed to estimate human visual attention. Most of these models have been tested on images, and their performance has been measured on that basis; an image-based benchmark exists that allows direct comparison between models. However, no such benchmark exists for videos. To alleviate this problem, we have created a benchmark of six computational models run on 12 videos, each viewed by 15 observers in a free-viewing task. Furthermore, a weighting scheme (both manual and automatic) is designed and applied to the videos using these six models, which improved the Area under the ROC curve (AUC). We found that the Graph-Based Visual Saliency (GBVS) and Random Centre Surround models outperformed the other models.
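The two ingredients of the evaluation above, a weighted combination of per-model saliency maps and the AUC score against recorded fixations, can be sketched as follows. This is a minimal illustration in Python/NumPy under our own assumptions (min-max normalization before combining, and the rank-based AUC over all non-fixated pixels as negatives); it is not the paper's actual implementation.

```python
import numpy as np

def combine_saliency(maps, weights):
    """Weighted combination of saliency maps.

    Each map is min-max normalized to [0, 1] first, and the weights are
    normalized to sum to 1, so maps from different models are comparable.
    """
    norm = [(m - m.min()) / (m.max() - m.min() + 1e-12) for m in maps]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * mi for wi, mi in zip(w, norm))

def auc_fixations(saliency, fixation_mask):
    """AUC of a saliency map against a binary fixation mask.

    Saliency values at fixated pixels are treated as positives and values
    at all other pixels as negatives; the AUC is computed with the
    Mann-Whitney U (rank) formulation, counting ties as 0.5.
    """
    mask = fixation_mask.astype(bool)
    pos = saliency[mask].ravel()
    neg = np.sort(saliency[~mask].ravel())
    # Number of negatives strictly below / at-or-below each positive.
    lt = np.searchsorted(neg, pos, side="left")
    le = np.searchsorted(neg, pos, side="right")
    u = lt + 0.5 * (le - lt)
    return u.sum() / (len(pos) * len(neg))
```

For a video, one would compute this AUC frame by frame (combining each model's per-frame map) and average over frames; a "manual" weighting fixes the weights by hand, while an "automatic" one could, for example, set each model's weight from its standalone AUC on held-out frames.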

References

  1. A. Borji, D. N. Sihite, and L. Itti, 2013(a). Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study. IEEE Trans. Image Processing, 22(1), 2013.
  2. A. Borji, H. R. Tavakoli, D. N. Sihite, and L. Itti, 2013(b). Analysis of scores, datasets, and models in visual saliency prediction. In ICCV, 2013.
  3. A. Borji and L. Itti, 2010. State-of-the-art in visual attention modeling. IEEE Trans. Pattern Anal. Mach. Intell., 2010.
  4. C. Guo, Q. Ma, and L. Zhang, 2008. Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform. In CVPR, 2008.
  5. C. Koch and S. Ullman, 1985. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4:219-227, 1985.
  6. C. M. Privitera and L. W. Stark, 2000. Algorithms for defining visual regions-of-interest: comparison with eye fixations. IEEE Trans. Pattern Anal. Mach. Intell., 22:970-982, Sep. 2000.
  7. D. Parkhurst, K. Law, and E. Niebur, 2002. Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42(1):107-123, 2002.
  8. H. Hadizadeh, M. J. Enriquez, and I. V. Bajic, 2012. Eye-tracking database for a set of standard video sequences. IEEE Trans. Image Processing, 21(2):898-903, Feb. 2012.
  9. J. Harel, C. Koch, and P. Perona, 2007. Graph-based visual saliency. In Advances in Neural Information Processing Systems. MIT Press.
  10. J. M. Henderson, J. R. Brockmole, M. S. Castelhano, and M. Mack, 2007. Visual saliency does not account for eye movements during visual search in real-world scenes. In Eye Movement Research: Insights into Mind and Brain, 2007.
  11. L. Elazary and L. Itti, 2008. Interesting objects are visually salient. Journal of Vision, 8(3):1-15, 2008.
  12. L. Itti, 2005. Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Visual Cognition, 12:1093-1123, 2005.
  13. N. Imamoglu, W. Lin, and Y. Fang, 2013. A saliency detection model using low-level features based on wavelet transform. IEEE Trans. Multimedia, 15(1):96-105, Jan. 2013.
  14. N. Bruce and J. Tsotsos, 2006. Saliency based on information maximization. In Advances in Neural Information Processing Systems 18, pages 155-162. MIT Press, Cambridge, MA, 2006.
  15. R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk, 2008. Salient region detection and segmentation. In ICVS, vol. 5008 of Springer Lecture Notes in Computer Science, pages 66-75, 2008.
  16. S. Goferman, L. Zelnik-Manor, and A. Tal, 2010. Context-aware saliency detection. In CVPR, pages 2376-2383, 2010.
  17. T. Judd, F. Durand, and A. Torralba, 2012. A benchmark of computational models of saliency to predict human fixations. MIT Computer Science and Artificial Intelligence Laboratory Technical Report, January 13, 2012.
  18. T. N. Vikram, M. Tscherepanow, and B. Wrede, 2012. A saliency map based on sampling an image into random rectangular regions of interest. Pattern Recognition, 2012.
  19. Y. Tong, F. A. Cheikh, F. F. E. Guraya, H. Konik, and A. Tremeau, 2011. A spatiotemporal saliency model for video surveillance. Cognitive Computation, 3(1):241-263. Springer.
  20. Y. Tong, F. A. Cheikh, H. Konik, and A. Tremeau, 2010. Full reference image quality assessment based on saliency map analysis. Journal of Imaging Science and Technology, 54(3):030503-030514, 2010.
  21. R. Young, 1987. The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spatial Vision, 2(4):273-293, 1987.


Paper Citation


in Harvard Style

Azam S., Gilani S., Jeon M., Yousaf R. and Kim J. (2016). A Benchmark of Computational Models of Saliency to Predict Human Fixations in Videos. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016) ISBN 978-989-758-175-5, pages 134-142. DOI: 10.5220/0005678701340142


in Bibtex Style

@conference{visapp16,
author={Shoaib Azam and Syed Omer Gilani and Moongu Jeon and Rehan Yousaf and Jeong Bae Kim},
title={A Benchmark of Computational Models of Saliency to Predict Human Fixations in Videos},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={134-142},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005678701340142},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - A Benchmark of Computational Models of Saliency to Predict Human Fixations in Videos
SN - 978-989-758-175-5
AU - Azam S.
AU - Gilani S.
AU - Jeon M.
AU - Yousaf R.
AU - Kim J.
PY - 2016
SP - 134
EP - 142
DO - 10.5220/0005678701340142