Authors: Tsang-Long Pao and Wen-Yuan Liao

Affiliation: Tatung University, Taiwan

ISBN: 972-8865-40-6

ISSN: 2184-4321

Keyword(s): Audio-visual database, Audio-visual speech recognition, Hidden Markov model.

Related Ontology Subjects/Areas/Topics: Applications; Computer Vision, Visualization and Computer Graphics; Feature Extraction; Image and Video Analysis; Informatics in Control, Automation and Robotics; Pattern Recognition; Signal Processing, Sensors, Systems Modeling and Control; Software Engineering; Video Analysis

Abstract: For the past several decades, visual speech signal processing has been an attractive research topic for overcoming certain problems of audio-only recognition. In recent years, many automatic speech-reading systems that combine audio and visual speech features have been proposed. The objective of all such audio-visual speech recognizers is to improve recognition accuracy, particularly in difficult conditions. In this paper, we focus on visual feature extraction for audio-visual recognition. We create a new audio-visual database recorded in two languages, English and Mandarin. Audio-visual recognition consists of two main steps: feature extraction and recognition. We extract the visual motion features of the lips using front-end processing. A hidden Markov model (HMM) is used for audio-visual speech recognition. We describe our audio-visual database and use it in our proposed system, with some preliminary experiments.

CC BY-NC-ND 4.0

Paper citation in several formats:
Pao, T. and Liao, W. (2006). AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES. In Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, ISBN 972-8865-40-6, pages 192-196. DOI: 10.5220/0001369101920196

@conference{visapp06,
author={Tsang{-}Long Pao and Wen{-}Yuan Liao},
title={AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES},
booktitle={Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP},
year={2006},
pages={192-196},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001369101920196},
isbn={972-8865-40-6},
}

TY - CONF

JO - Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP
TI - AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES
SN - 972-8865-40-6
AU - Pao, T.
AU - Liao, W.
PY - 2006
SP - 192
EP - 196
DO - 10.5220/0001369101920196
