MA-ResNet50: A General Encoder Network for Video Segmentation

Xiaotian Liu, Lei Yang, Xiaoyu Zhang, Xiaohui Duan

2022

Abstract

To improve the performance of segmentation networks on video streams, most researchers currently use either optical-flow-based methods or non-optical-flow CNN-based methods. The former suffer from heavy computational cost and high latency, while the latter suffer from poor applicability and versatility. In this paper, we design a Partial Channel Memory Attention (PCMA) module to store and fuse time-series features from video sequences. We then propose a Memory Attention ResNet50 network (MA-ResNet50) by combining the PCMA module with ResNet50, making it the first video-based feature extraction encoder applicable to most currently proposed segmentation networks. For experiments, we combine MA-ResNet50 with four well-established per-frame segmentation networks: DeeplabV3P, PSPNet, SFNet, and DNLNet. The results show that MA-ResNet50 generally outperforms the original ResNet50 in these four networks on VSPW and CamVid. Our method also achieves state-of-the-art accuracy on CamVid. The code is available at https://github.com/xiaotianliu01/MA-Resnet50.

Paper Citation


in Harvard Style

Liu X., Yang L., Zhang X. and Duan X. (2022). MA-ResNet50: A General Encoder Network for Video Segmentation. In Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, ISBN 978-989-758-555-5, pages 79-86. DOI: 10.5220/0010800800003124


in Bibtex Style

@conference{visapp22,
author={Xiaotian Liu and Lei Yang and Xiaoyu Zhang and Xiaohui Duan},
title={MA-ResNet50: A General Encoder Network for Video Segmentation},
booktitle={Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP},
year={2022},
pages={79-86},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010800800003124},
isbn={978-989-758-555-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP
TI - MA-ResNet50: A General Encoder Network for Video Segmentation
SN - 978-989-758-555-5
AU - Liu X.
AU - Yang L.
AU - Zhang X.
AU - Duan X.
PY - 2022
SP - 79
EP - 86
DO - 10.5220/0010800800003124