Audio-guided Video Interpolation via Human Pose Features

Takayuki Nakatsuka; Masatoshi Hamanaka; Shigeo Morishima

Topics: Cognitive Models for Interpretation, Integration and Control; Content-Based Indexing, Search, and Retrieval; Entertainment Imaging Applications

In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, pages 27-35, 2020, Valletta, Malta

Authors: Takayuki Nakatsuka ¹,²; Masatoshi Hamanaka ¹ and Shigeo Morishima ³

Affiliations: ¹ RIKEN, Japan; ² Waseda University, Japan; ³ Waseda Research Institute for Science and Engineering, Japan

Keyword(s): Video Interpolation, Pose Estimation, Signal Processing, Generative Adversarial Network, Gated Recurrent Unit.

Abstract: This paper describes a method that generates in-between frames of two videos of a musical instrument being played. While image generation has achieved successful results in recent years, there is ample scope for improvement in video generation. The keys to improving the quality of video generation are high resolution and temporal coherence. We addressed these requirements by using not only visual information but also aural information. The critical point of our method is the use of two-dimensional pose features to generate high-resolution in-between frames from the input audio. We constructed a deep neural network with a recurrent structure for inferring pose features from the input audio and an encoder-decoder network for padding and generating video frames using pose features. Moreover, our method adopted a fusion approach of generating, padding, and retrieving video frames to improve the output video. Pose features played an essential role both in end-to-end training with a differentiable property and in combining the generating, padding, and retrieving approaches. We conducted a user study and confirmed that the proposed method is effective in generating interpolated videos.
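The recurrent audio-to-pose step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: it assumes a single bias-free GRU layer followed by a linear readout, random untrained weights, and hypothetical names and sizes (`TinyGRUPoseRegressor`, `audio_dim`, `hidden_dim`, `num_joints`) chosen only for illustration. It shows how a sequence of per-frame audio feature vectors could be mapped to a sequence of 2D joint coordinates.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyGRUPoseRegressor:
    """Hypothetical audio-to-pose model: one GRU layer plus a linear readout.

    Weights are random and untrained; biases are omitted for brevity.
    """

    def __init__(self, audio_dim, hidden_dim, num_joints, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Input (W) and recurrent (U) weights for the update (z),
        # reset (r), and candidate (n) computations.
        self.W = {g: rand_matrix(hidden_dim, audio_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}
        # Linear readout producing an (x, y) pair per joint.
        self.out = rand_matrix(2 * num_joints, hidden_dim, rng)

    def step(self, x, h):
        """One GRU update: returns the new hidden state."""
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Blend the candidate state with the previous state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def predict(self, audio_frames):
        """Map a sequence of audio feature vectors to per-frame 2D poses."""
        h = [0.0] * self.hidden_dim
        poses = []
        for x in audio_frames:
            h = self.step(x, h)
            flat = matvec(self.out, h)
            poses.append([(flat[2 * j], flat[2 * j + 1])
                          for j in range(len(flat) // 2)])
        return poses


# Four frames of made-up 3-dimensional audio features.
audio = [[0.1 * t, 0.5, -0.2] for t in range(4)]
model = TinyGRUPoseRegressor(audio_dim=3, hidden_dim=8, num_joints=5)
poses = model.predict(audio)
print(len(poses), len(poses[0]))  # prints "4 5": four frames, five joints each
```

In a trained system the predicted keypoints would then condition a frame generator; here the output only demonstrates the shapes involved, since the weights carry no learned information.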

CC BY-NC-ND 4.0

Paper citation in several formats:

Nakatsuka, T.; Hamanaka, M. and Morishima, S. (2020). Audio-guided Video Interpolation via Human Pose Features. In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP; ISBN 978-989-758-402-2; ISSN 2184-4321, SciTePress, pages 27-35. DOI: 10.5220/0008876600270035

@conference{visapp20,
author={Takayuki Nakatsuka and Masatoshi Hamanaka and Shigeo Morishima},
title={Audio-guided Video Interpolation via Human Pose Features},
booktitle={Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP},
year={2020},
pages={27-35},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008876600270035},
isbn={978-989-758-402-2},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP
TI - Audio-guided Video Interpolation via Human Pose Features
SN - 978-989-758-402-2
IS - 2184-4321
AU - Nakatsuka, T.
AU - Hamanaka, M.
AU - Morishima, S.
PY - 2020
SP - 27
EP - 35
DO - 10.5220/0008876600270035
PB - SciTePress