Image Semantic Distance Metric Learning Approach for Large-scale Automatic Image Annotation

Cong Jin, Shu-Wei Jin


Learning an effective semantic distance measure is important for practical applications of image analysis and pattern recognition. Automatic image annotation (AIA) is the task of assigning one or more semantic concepts to a given image, and it is a promising way to achieve more effective image retrieval and analysis. Because of the semantic gap between low-level visual features and high-level image semantics, the performance of some image distance metric learning (IDML) algorithms that use only low-level visual features is not satisfactory. Given the diversity and complexity of large-scale image datasets, visual similarity alone is not enough to learn an image distance. To address this problem, this paper lets the semantic labels of the training image set participate in learning the image distance measure. The experimental results confirm that the proposed image semantic distance metric learning (ISDML) can improve the efficiency of a large-scale AIA approach and achieve better annotation performance than other state-of-the-art AIA approaches.



Paper Citation

in Harvard Style

Jin C. and Jin S. (2016). Image Semantic Distance Metric Learning Approach for Large-scale Automatic Image Annotation. In Proceedings of the International Conference on Internet of Things and Big Data - Volume 1: IoTBD, ISBN 978-989-758-183-0, pages 277-283. DOI: 10.5220/0005729902770283
