Reinforcement Learning based Video Summarization with Combination of ResNet and Gated Recurrent Unit

Muhammad Afzal, Muhammad Tahir

Abstract

Video cameras are becoming ubiquitous. A huge amount of video data is generated daily and must be handled efficiently with limited storage and processing power. Video summarization offers a way to review lengthy videos quickly while keeping storage and processing requirements under control. The deep reinforcement-deep summarization network (DR-DSN) is a popular method for video summarization, but its performance is limited and can be enhanced with a better representation of the video data. Recently, deep residual networks have been observed to be quite successful in many computer vision applications, including video retrieval and captioning. In this paper, we investigate deep feature representation for video summarization using a deep residual network, with ResNet-152 used to extract deep video features. To speed up the model, long short-term memory is replaced with a gated recurrent unit (GRU), which gave us the flexibility to add another RNN layer and resulted in a significant improvement in performance. With this combination of ResNet-152 and a two-layer GRU, we performed experiments on the SumMe video dataset and obtained results better not only than DR-DSN but also than several state-of-the-art video summarization methods.
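As a rough illustration of the recurrent part of the pipeline described above, the following sketch passes per-frame feature vectors through two stacked GRU layers and maps each final hidden state to a frame score in (0, 1). It is not the authors' implementation: the weights are random and untrained, biases are omitted, the network is unidirectional, and the 8-dimensional toy features merely stand in for ResNet-152 features. The names `GRUCell` and `frame_scores` are hypothetical, and reinforcement learning training is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical minimal GRU cell (no biases, random untrained weights)."""

    def __init__(self, input_size, hidden_size, rng):
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the previous state and the candidate state via the update gate.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def frame_scores(features, hidden_size=4, seed=0):
    """Run two stacked GRU layers over the frames; return one score per frame."""
    rng = random.Random(seed)
    layers = [GRUCell(len(features[0]), hidden_size, rng),
              GRUCell(hidden_size, hidden_size, rng)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
    states = [[0.0] * hidden_size for _ in layers]
    scores = []
    for x in features:
        inp = x
        for i, cell in enumerate(layers):
            states[i] = cell.step(inp, states[i])
            inp = states[i]  # output of layer i feeds layer i + 1
        scores.append(sigmoid(sum(w * h for w, h in zip(w_out, inp))))
    return scores


# Toy input: 5 frames with 8-dimensional features (assumed stand-in values).
rng = random.Random(1)
features = [[rng.random() for _ in range(8)] for _ in range(5)]
scores = frame_scores(features)
print(scores)
```

In a trained model of this kind, such scores would typically be interpreted as the probability of selecting each frame for the summary; here they only demonstrate the data flow through the two layers.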

Paper Citation


in Harvard Style

Afzal M. and Tahir M. (2021). Reinforcement Learning based Video Summarization with Combination of ResNet and Gated Recurrent Unit. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, ISBN 978-989-758-488-6, pages 261-268. DOI: 10.5220/0010197402610268


in Bibtex Style

@conference{visapp21,
author={Muhammad Afzal and Muhammad Tahir},
title={Reinforcement Learning based Video Summarization with Combination of ResNet and Gated Recurrent Unit},
booktitle={Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP},
year={2021},
pages={261-268},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010197402610268},
isbn={978-989-758-488-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP
TI - Reinforcement Learning based Video Summarization with Combination of ResNet and Gated Recurrent Unit
SN - 978-989-758-488-6
AU - Afzal M.
AU - Tahir M.
PY - 2021
SP - 261
EP - 268
DO - 10.5220/0010197402610268