3D Hand and Object Pose Estimation for Real-time Human-robot Interaction

Chaitanya Bandi, Hannes Kisner, Ulrike Thomas

2022

Abstract

Estimating 3D hand pose and object pose in real time is essential for human-robot interaction scenarios such as object handover. Handover scenarios in particular pose several challenges, including mutual hand-object occlusions and the need for fast inference so that the robot can react promptly. In this paper, we present an approach to estimate 3D hand pose and object pose in real time using a low-cost consumer RGB-D camera for human-robot interaction scenarios. We propose a cascade-of-networks strategy to regress 2D and 3D pose features. The first network detects the objects and hands in images. The second network is an end-to-end model with independent weights that regresses 2D keypoints of hand joints and object corners, followed by 3D wrist-centric hand and object pose regression using a novel residual graph regression network, and finally a perspective-n-point approach to solve for the 6D pose of the detected objects in hand. To train and evaluate our model, we also propose a small-scale 3D hand pose dataset with a new semi-automated annotation approach using a robot arm, and we demonstrate the generalizability of our model on state-of-the-art benchmarks.
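
As an illustration of what a wrist-centric representation means, the following minimal sketch expresses 3D hand joints and object corners relative to the wrist position. It is not the authors' implementation: the function name `to_wrist_centric`, the choice of joint index 0 as the wrist, and all coordinate values are assumptions made for the example.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]

# Assumption: joint 0 is the wrist, as in many 21-joint hand layouts.
WRIST_INDEX = 0


def to_wrist_centric(points: List[Point3D], wrist: Point3D) -> List[Point3D]:
    """Express camera-frame 3D points relative to the wrist position."""
    wx, wy, wz = wrist
    return [(x - wx, y - wy, z - wz) for x, y, z in points]


# Hypothetical camera-frame coordinates in metres (illustrative values only).
hand_joints = [(0.10, 0.20, 0.50), (0.12, 0.18, 0.49), (0.15, 0.15, 0.48)]
object_corners = [(0.20, 0.20, 0.55), (0.25, 0.20, 0.55)]

wrist = hand_joints[WRIST_INDEX]
hand_rel = to_wrist_centric(hand_joints, wrist)
object_rel = to_wrist_centric(object_corners, wrist)
```

In such a representation the wrist maps to the origin, so hand joints and object corners share one local frame that does not depend on where the hand sits in the camera view.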

Paper Citation


in Harvard Style

Bandi C., Kisner H. and Thomas U. (2022). 3D Hand and Object Pose Estimation for Real-time Human-robot Interaction. In Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP; ISBN 978-989-758-555-5, SciTePress, pages 770-780. DOI: 10.5220/0010902400003124


in Bibtex Style

@conference{visapp22,
author={Chaitanya Bandi and Hannes Kisner and Ulrike Thomas},
title={3D Hand and Object Pose Estimation for Real-time Human-robot Interaction},
booktitle={Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP},
year={2022},
pages={770-780},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010902400003124},
isbn={978-989-758-555-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP
TI - 3D Hand and Object Pose Estimation for Real-time Human-robot Interaction
SN - 978-989-758-555-5
AU - Bandi C.
AU - Kisner H.
AU - Thomas U.
PY - 2022
SP - 770
EP - 780
DO - 10.5220/0010902400003124
PB - SciTePress