Marios Anthimopoulos, Basilis Gatos, Ioannis Pratikakis



This paper proposes an algorithm for detecting artificial text in video frames using edge information. First, an edge map is created using the Canny edge detector. Then, morphological dilation and opening are used to connect the vertical edges and eliminate false alarms. Bounding boxes are determined for every non-zero valued connected component, and these boxes form the initial candidate text areas. Finally, an edge projection analysis is applied, refining the result and splitting text areas into text lines. The whole algorithm is applied at different resolutions so that text of varying size is detected. Experimental results show that the method is highly effective and efficient for artificial text detection.
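The final refinement step can be illustrated with a small sketch. The abstract does not give the exact projection rule, so the code below is only one plausible reading: it counts edge pixels per row inside a candidate box (a horizontal edge projection) and cuts the box wherever the row density falls below a threshold. The function name `split_into_text_lines`, the `min_ratio` parameter, and its default value are assumptions made for illustration, not the authors' implementation.

```python
def split_into_text_lines(edge_map, box, min_ratio=0.2):
    """Split one candidate box into text lines via a horizontal edge projection.

    edge_map:  list of rows, each a list of 0/1 edge flags (e.g. a binarized
               edge detector output).
    box:       (top, left, bottom, right), with bottom and right exclusive.
    min_ratio: hypothetical threshold; a row is treated as text when its
               fraction of edge pixels is at least this value.
    """
    top, left, bottom, right = box
    width = right - left
    # Horizontal projection: number of edge pixels in each row of the box.
    projection = [sum(edge_map[y][left:right]) for y in range(top, bottom)]

    lines = []
    start = None
    for offset, count in enumerate(projection):
        is_text_row = width > 0 and count / width >= min_ratio
        if is_text_row and start is None:
            start = offset  # a text line begins here
        elif not is_text_row and start is not None:
            lines.append((top + start, left, top + offset, right))
            start = None  # the text line ended on the previous row
    if start is not None:  # a line that reaches the bottom of the box
        lines.append((top + start, left, bottom, right))
    return lines


# Toy edge map: two bands of dense edges separated by an empty row.
edge_map = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
text_lines = split_into_text_lines(edge_map, (0, 0, 6, 6))
print(text_lines)
```

On the toy input the single candidate box is split into two line boxes, one per dense band of edges. In a multiresolution setting, the same routine would presumably be run on the candidate boxes found at each scale, with the resulting coordinates mapped back to the original frame size.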



Paper Citation

in Harvard Style

Anthimopoulos M., Gatos B. and Pratikakis I. (2007). MULTIRESOLUTION TEXT DETECTION IN VIDEO FRAMES. In Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, ISBN 978-972-8865-74-0, pages 161-166. DOI: 10.5220/0002057301610166

in Bibtex Style

author={Marios Anthimopoulos and Basilis Gatos and Ioannis Pratikakis},
booktitle={Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP},

in EndNote Style

JO - Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP
SN - 978-972-8865-74-0
AU - Anthimopoulos M.
AU - Gatos B.
AU - Pratikakis I.
PY - 2007
SP - 161
EP - 166
DO - 10.5220/0002057301610166