Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN

Ala Mhalla, Thierry Chateau, Sami Gazzah, Najoua Essoukri Ben Amara

Abstract

The performance of a generic pedestrian detector drops significantly when it is applied to a specific scene, because of the large variation between the source dataset used to train the generic detector and the samples in the target scene. In this paper, we propose a new transfer learning framework that automatically specializes a generic detector into a scene-specific pedestrian detector for video surveillance, without manually labeling any additional samples. The main idea is to consider a deep detector as a function that generates realizations from the probability distribution of the pedestrians to be detected in the target scene. Our contribution is to approximate this target probability distribution with a set of samples and an associated specialized deep detector estimated in a sequential Monte Carlo filter framework. The effectiveness of the proposed framework is demonstrated through experiments on two public surveillance datasets. Compared with a generic pedestrian detector and state-of-the-art methods, the proposed framework shows encouraging results.
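
To make the idea of approximating a target distribution with a set of samples more concrete, here is a minimal sketch of the classic sequential Monte Carlo loop (prediction, update, resampling) on a toy 1-D problem. It is not the authors' implementation: the names `smc_step`, `propose`, `likelihood`, and `run_demo` are hypothetical, and the Gaussian proposal and likelihood are assumptions chosen only for illustration. In the paper's setting the samples would be image regions and the proposals would come from a deep detector, details the abstract does not give.

```python
import math
import random


def smc_step(samples, propose, likelihood, rng):
    """One generic SMC iteration: predict, weight, resample.

    `propose` and `likelihood` are hypothetical callables standing in for
    whatever proposal and scoring functions a real system would use.
    """
    # Prediction: draw one new candidate from each current sample.
    proposals = [propose(s, rng) for s in samples]
    # Update: score each candidate against the target distribution.
    weights = [likelihood(p) for p in proposals]
    if sum(weights) == 0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0] * len(proposals)
    # Resampling: draw a new equally weighted set, proportional to weights.
    return rng.choices(proposals, weights=weights, k=len(samples))


def run_demo(seed=0, n=500, steps=20, target_mean=3.0):
    """Toy run: samples drift from a broad prior toward `target_mean`."""
    rng = random.Random(seed)
    # Broad initial sample set, playing the role of the source distribution.
    samples = [rng.uniform(-10.0, 10.0) for _ in range(n)]

    def propose(s, r):
        # Assumed random-walk proposal around the current sample.
        return s + r.gauss(0.0, 0.5)

    def likelihood(x):
        # Assumed unnormalized Gaussian score centered on the target.
        return math.exp(-0.5 * (x - target_mean) ** 2)

    for _ in range(steps):
        samples = smc_step(samples, propose, likelihood, rng)
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print(f"estimated target mean: {run_demo():.2f}")
```

After a few iterations the sample set concentrates around the region favored by the likelihood, which is the sense in which a finite set of samples approximates the target distribution.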


Paper Citation


in Harvard Style

Mhalla A., Chateau T., Gazzah S. and Essoukri Ben Amara N. (2017). Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017) ISBN 978-989-758-225-7, pages 17-23. DOI: 10.5220/0006097900170023


in Bibtex Style

@conference{visapp17,
author={Ala Mhalla and Thierry Chateau and Sami Gazzah and Najoua Essoukri Ben Amara},
title={Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017)},
year={2017},
pages={17-23},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006097900170023},
isbn={978-989-758-225-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017)
TI - Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN
SN - 978-989-758-225-7
AU - Mhalla A.
AU - Chateau T.
AU - Gazzah S.
AU - Essoukri Ben Amara N.
PY - 2017
SP - 17
EP - 23
DO - 10.5220/0006097900170023