The scope of the Snap & Hear project can be scaled up 
by incorporating the scenarios mentioned in the 
discussion above into the analysis stages. Integrating 
those missing inputs into the story would give the end 
user a better experience. 