Authors:
Gabriela Vozáriková, Richard Staňa and Gabriel Semanišin
Affiliation:
Institute of Computer Science, Pavol Jozef Šafárik University in Košice, Jesenná 5, Košice, Slovakia
Keyword(s):
U-Net, Clothing Parsing, Segmentation, Computer Vision, Multitask Learning, Deep Learning, Fully-convolutional Network.
Abstract:
This paper focuses on clothing parsing, a special case of the more general object segmentation task well known in computer vision. Each pixel is to be assigned either to one of the clothing categories or to the background. Owing to the complexity of the problem and, until recently, a lack of data, the performance of state-of-the-art clothing parsing models, expressed as mean Intersection over Union (IoU), does not exceed 55%. In this paper, we propose a novel multitask network that extends the fully convolutional neural network U-Net with two side branches: one solves a multilabel classification task and the other predicts bounding boxes of clothing instances. We trained this network on a refined version of the large-scale iMaterialist dataset (Visipedia, 2019). Compared with the well-performing segmentation architectures FPN, DeepLabV3, DeepLabV3+ and plain U-Net, our model achieves the best experimental results.
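For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean IoU can be computed for a single per-pixel label map. The function name `mean_iou`, the toy label maps, and the averaging convention (classes absent from both prediction and ground truth are skipped, background is counted as an ordinary class) are assumptions made for illustration; the abstract does not state which convention the authors use.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean Intersection over Union for one label map (illustrative sketch)."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The pixel is in both the predicted and true region of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # The pixel belongs to the union of class p (predicted only)
                # and to the union of class t (ground truth only).
                union[p] += 1
                union[t] += 1
    # Assumption: classes that appear in neither map are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3x3 example: 0 = background, 1 and 2 = hypothetical clothing classes.
pred = [[0, 0, 1], [0, 1, 1], [2, 2, 1]]
target = [[0, 0, 1], [0, 1, 1], [2, 1, 1]]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1.0, 0.8 and 0.5, so a single misclassified pixel already lowers the mean to roughly 0.77. This illustrates why small or thin clothing items weigh heavily on the metric.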