Active Learning in Social Context for Image Classification

Elisavet Chatzilari, Spiros Nikolopoulos, Yiannis Kompatsiaris, Josef Kittler

2014

Abstract

Motivated by the widespread adoption of social networks and the abundance of user-generated multimedia content, this work investigates how the established principles of active learning for image classification apply in this new context. In a social context, active learning can be fully automated by replacing the human oracle with user-tagged images obtained from social networks. However, user-contributed tags are noisy, which complicates sample selection: in addition to a sample's informativeness, our confidence about its actual content should also be maximized. The contribution of this work is a probabilistic approach that jointly maximizes these two quantities in order to automate the process of active learning. Experimental results show that the proposed method outperforms various baselines and support the assumption that significant performance improvement cannot be achieved unless the samples' informativeness and the oracle's confidence are considered jointly.
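The idea of jointly weighing a sample's informativeness and the oracle's confidence can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not spell out; it assumes, purely for illustration, that informativeness decays exponentially with distance from the classifier's decision boundary, that tag confidence is given as a number in [0, 1], and that the two are combined by a simple product. The names `informativeness`, `select_samples`, and the example image ids are hypothetical.

```python
import math


def informativeness(margin_distance):
    """Map a signed distance from the decision boundary to (0, 1].

    Samples near the boundary (distance close to 0) score close to 1.
    The exponential form is an illustrative assumption, not the paper's.
    """
    return math.exp(-abs(margin_distance))


def select_samples(candidates, k):
    """Return the ids of the k candidates with the highest joint score.

    candidates: list of (sample_id, margin_distance, tag_confidence),
    where tag_confidence in [0, 1] is a hypothetical estimate that the
    user tag correctly describes the image content.
    The joint score is assumed to be informativeness * tag_confidence.
    """
    scored = [
        (informativeness(distance) * confidence, sample_id)
        for sample_id, distance, confidence in candidates
    ]
    # Highest joint score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sample_id for _, sample_id in scored[:k]]


candidates = [
    ("img_a", 0.1, 0.2),   # informative, but the tag is probably wrong
    ("img_b", 2.5, 0.95),  # reliable tag, but far from the boundary
    ("img_c", 0.3, 0.9),   # informative and reliably tagged
]
selected = select_samples(candidates, k=1)
```

Under these assumptions the sample that is both close to the boundary and reliably tagged ranks first, while a sample that is strong on only one of the two criteria is ranked lower.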



Paper Citation


in Harvard Style

Chatzilari E., Nikolopoulos S., Kompatsiaris Y. and Kittler J. (2014). Active Learning in Social Context for Image Classification. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 76-85. DOI: 10.5220/0004686400760085


in Bibtex Style

@conference{visapp14,
author={Elisavet Chatzilari and Spiros Nikolopoulos and Yiannis Kompatsiaris and Josef Kittler},
title={Active Learning in Social Context for Image Classification},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={76-85},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004686400760085},
isbn={978-989-758-004-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Active Learning in Social Context for Image Classification
SN - 978-989-758-004-8
AU - Chatzilari E.
AU - Nikolopoulos S.
AU - Kompatsiaris Y.
AU - Kittler J.
PY - 2014
SP - 76
EP - 85
DO - 10.5220/0004686400760085