PHONEME-TO-VISEME MAPPING FOR VISUAL SPEECH RECOGNITION

Luca Cappelletta; Naomi Harte

Topics: Audio and Speech Processing; Natural Language Processing; Signal Processing

In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, pages 322-329, 2012, Vilamoura, Algarve, Portugal

Authors: Luca Cappelletta and Naomi Harte

Affiliation: Trinity College Dublin, Ireland

Keyword(s): AVSR, Viseme, PCA, DCT, Optical flow.

Related Ontology Subjects/Areas/Topics: Applications; Artificial Intelligence; Audio and Speech Processing; Cardiovascular Imaging and Cardiography; Cardiovascular Technologies; Digital Signal Processing; Health Engineering and Technology Applications; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Multimedia; Multimedia Signal Processing; Natural Language Processing; Pattern Recognition; Signal Processing; Software Engineering; Symbolic Systems; Telecommunications

Abstract: Phonemes are the standard modelling unit in HMM-based continuous speech recognition systems. Visemes are the equivalent unit in the visual domain, but there is less agreement on precisely what visemes are, or how many to model on the visual side in audio-visual speech recognition systems. This paper compares the use of five viseme maps in a continuous speech recognition task. The focus of the study is visual-only recognition, in order to examine the choice of viseme map. All the maps are based on the phoneme-to-viseme approach, and each is created using either a linguistic method or a data-driven method. DCT, PCA and optical flow are used to derive the visual features. The best visual-only recognition on the VidTIMIT database is achieved using a linguistically motivated viseme set. These initial experiments demonstrate that the choice of visual unit requires more careful attention in audio-visual speech recognition system development.
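
As a rough illustration of the phoneme-to-viseme approach mentioned in the abstract, the sketch below treats a viseme map as a many-to-one lookup from phoneme labels to viseme classes and applies it to a phoneme transcription. The class names, the phoneme groupings, and the helper `phonemes_to_visemes` are hypothetical and chosen only for illustration; they are not any of the five maps compared in the paper.

```python
# Hypothetical viseme classes, each listing the phonemes that are assumed to
# look alike on the lips. These groupings are illustrative only and are NOT
# taken from the paper.
VISEME_CLASSES = {
    "V_bilabial": ["p", "b", "m"],
    "V_labiodental": ["f", "v"],
    "V_rounded": ["w", "uw", "ow"],
    "V_open": ["aa", "ae", "ah"],
    "V_silence": ["sil"],
}

# Invert the class table into a many-to-one phoneme -> viseme lookup.
PHONEME_TO_VISEME = {}
for viseme, phonemes in VISEME_CLASSES.items():
    for phoneme in phonemes:
        # A map must be a function: each phoneme belongs to exactly one viseme.
        if phoneme in PHONEME_TO_VISEME:
            raise ValueError(f"phoneme {phoneme!r} assigned to two visemes")
        PHONEME_TO_VISEME[phoneme] = viseme


def phonemes_to_visemes(phonemes, unknown="V_other"):
    """Relabel a phoneme sequence with viseme classes.

    Phonemes missing from the map fall back to the `unknown` label
    (an assumption of this sketch, not a rule from the paper).
    """
    return [PHONEME_TO_VISEME.get(p, unknown) for p in phonemes]


if __name__ == "__main__":
    transcription = ["sil", "m", "aa", "p", "sil"]
    print(phonemes_to_visemes(transcription))
```

Because the map is many-to-one, distinct phoneme sequences can collapse to the same viseme sequence (here "p" and "b" both become `V_bilabial`), which is one reason the number and composition of viseme classes can affect visual-only recognition.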

CC BY-NC-ND 4.0

Paper citation in several formats:

Cappelletta, L. and Harte, N. (2012). PHONEME-TO-VISEME MAPPING FOR VISUAL SPEECH RECOGNITION. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-8425-99-7; ISSN 2184-4313, SciTePress, pages 322-329. DOI: 10.5220/0003731903220329

@conference{icpram12,
author={Luca Cappelletta and Naomi Harte},
title={PHONEME-TO-VISEME MAPPING FOR VISUAL SPEECH RECOGNITION},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={322-329},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003731903220329},
isbn={978-989-8425-99-7},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - PHONEME-TO-VISEME MAPPING FOR VISUAL SPEECH RECOGNITION
SN - 978-989-8425-99-7
IS - 2184-4313
AU - Cappelletta, L.
AU - Harte, N.
PY - 2012
SP - 322
EP - 329
DO - 10.5220/0003731903220329
PB - SciTePress