An Anthropomorphic Perspective for Audiovisual Speech Synthesis

Samuel Silva, António Teixeira

2017

Abstract

In speech communication, both the auditory and visual streams play an important role, ensuring a certain level of redundancy (e.g., lip movement) as well as the transmission of complementary information (e.g., to emphasize a word). The current common approach to audiovisual speech synthesis, generally based on data-driven methods, yields good results, but relies on models controlled by parameters that do not relate to how humans produce speech. These parameters are hard to interpret and add little to our understanding of the human speech production apparatus. Modelling the actual system, by adopting an anthropomorphic perspective, would open a myriad of novel research paths. This article proposes a conceptual framework to support research and development of an articulatory-based audiovisual speech synthesis system. The core idea is that the speech production system is modelled to produce articulatory parameters with anthropomorphic meaning (e.g., lip opening), which drive the synthesis of both the auditory and visual streams. A first instantiation of the framework for European Portuguese illustrates its viability and constitutes an important tool for research in speech production and for the deployment of audiovisual speech synthesis in multimodal interaction scenarios, which are of the utmost relevance for current and future complex services and applications.


Paper Citation


in Harvard Style

Silva S. and Teixeira A. (2017). An Anthropomorphic Perspective for Audiovisual Speech Synthesis. In BIOSIGNALS, (BIOSTEC 2017). DOI: 10.5220/0006150200001488


in Bibtex Style

@conference{biosignals17,
author={Samuel Silva and António Teixeira},
title={An Anthropomorphic Perspective for Audiovisual Speech Synthesis},
booktitle={BIOSIGNALS, (BIOSTEC 2017)},
year={2017},
pages={},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006150200001488},
isbn={},
}


in EndNote Style

TY - CONF

JO - BIOSIGNALS, (BIOSTEC 2017)
TI - An Anthropomorphic Perspective for Audiovisual Speech Synthesis
SN -
AU - Silva S.
AU - Teixeira A.
PY - 2017
SP - 0
EP - 0
DO - 10.5220/0006150200001488