Authors:
Haruya Ishikawa 1; Masaki Hayashi 1; Trong Huy Phan 2; Kazuma Yamamoto 2; Makoto Masuda 2 and Yoshimitsu Aoki 1
Affiliations:
1 Department of Electrical Engineering, Keio University, Yokohama, Japan; 2 OKI Electric Industry Co., Ltd., Saitama, Japan
Keyword(s):
Multi-Object Tracking, Person Re-Identification, Video Re-Identification, Metric Learning.
Abstract:
Person re-identification is a vital module of the tracking-by-detection framework for online multi-object tracking. Despite recent advances in multi-object tracking and person re-identification, little attention has been given to integrating these technologies into a robust multi-object tracker. In this work, we combine modern state-of-the-art re-identification models and modeling techniques with the basic tracking-by-detection framework and benchmark them on heavily occluded scenes to understand their effect. We hypothesize that temporal modeling for re-identification is crucial for training robust re-identification models, because such models are conditioned on sequences containing occlusions. Along with traditional image-based re-identification methods, we analyze temporal modeling methods used in video-based re-identification tasks. We also train re-identification models with different embedding methods, including triplet loss, and analyze their effect. We benchmark the re-identification models on the challenging MOT20 dataset, which contains crowded scenes with various occlusions. We provide a thorough assessment of modern re-identification modeling methods and show that they are effective for multi-object tracking. Compared to baseline methods, these models provide more robust re-identification, as shown by improvements in the number of identity switches, MOTA, IDF1, and other metrics.
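As a rough illustration of two ideas named in the abstract, the sketch below shows one simple form of temporal modeling (averaging per-frame embeddings over a tracklet) and a margin-based triplet loss on the pooled embeddings. The abstract does not specify the architectures, pooling scheme, distance, or margin that were used, so the function names, the choice of average pooling, the Euclidean distance, and the margin of 0.3 are all assumptions made for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def temporal_average_pool(frame_embeddings):
    """Collapse T per-frame embeddings (each of length D) into one unit-length
    tracklet embedding by averaging over time. Hypothetical helper: average
    pooling is only one of several possible temporal modeling choices."""
    num_frames = len(frame_embeddings)
    dim = len(frame_embeddings[0])
    mean = [sum(frame[i] for frame in frame_embeddings) / num_frames
            for i in range(dim)]
    return l2_normalize(mean)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss: zero once the negative is at least `margin`
    farther from the anchor than the positive is. The margin is an assumed value."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings: two tracklets of the same person and one of another person.
track_a = temporal_average_pool([[1.0, 0.0], [0.8, 0.2]])
track_a_later = temporal_average_pool([[0.9, 0.1], [1.0, 0.1]])
track_b = temporal_average_pool([[0.0, 1.0], [0.1, 0.9]])

loss_well_separated = triplet_loss(track_a, track_a_later, track_b)
loss_violated = triplet_loss(track_a, track_b, track_a_later)
```

In this toy setup the first loss is zero because the same-identity tracklets are already much closer than the different-identity one, while swapping the positive and negative produces a positive loss that training would push down.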