Authors: Hiroshi Fujimura, Ning Ding, Daichi Hayakawa and Takehiko Kagoshima

Affiliation: Toshiba Research and Development Center, Komukai-Toshiba-cho 1, Saiwai-ku, Kawasaki, 212-8582 Japan

Keyword(s): Keyword Detection, Speaker Identification, Low Resource Device, Bayesian.

Abstract: This paper proposes a new method for simultaneous flexible keyword detection and text-dependent speaker identification using a recognized keyword. The purpose is to identify a speaker from among a set of pre-registered speakers on the basis of a short command utterance in an office or home, running on low-resource chip devices. The first contribution is a process that combines a neural network (NN) and a customized Viterbi-based algorithm for flexible keyword detection with Gaussian mixture models (GMMs) for speaker identification. Outputs of a middle layer of the NN and the alignment information from keyword detection are also used to create the feature vectors for the speaker GMMs. The second contribution is to apply DropConnect to model uncertainty in the Bayesian NN that is used for speaker recognition. This results in robust speaker models when enrollment utterances are few. Evaluation was conducted using 39 Japanese keywords spoken by 100 speakers. Recognition performance was measured on the basis of false acceptances and false rejections using keyword utterances. Speaker identification among the 100 pre-registered speakers was evaluated simultaneously on the recognized keywords. The identification rate when using a conventional i-vector method was 71.22%. By contrast, the identification rate of the proposed method was 89.29% while using low-cost resources.
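
The speaker identification step described in the abstract (scoring an utterance against one GMM per pre-registered speaker and choosing the best match) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the diagonal-covariance assumption, the average log-likelihood decision rule, and the random toy features are all assumptions made for illustration. In the paper the feature vectors are built from NN middle-layer outputs and keyword alignment information, which is not reproduced here.

```python
import numpy as np


def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    All names and shapes are illustrative assumptions, not taken from the paper.
    """
    diff = frames[:, None, :] - means[None, :, :]                        # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, K)
    log_weighted = np.log(weights) + log_comp                           # (T, K)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())


def identify_speaker(frames, speaker_gmms):
    """Return the enrolled speaker whose GMM scores the utterance highest."""
    scores = {
        name: diag_gmm_loglik(frames, *params)
        for name, params in speaker_gmms.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 4
    # Two hypothetical enrolled speakers, each a single-component toy GMM.
    speaker_gmms = {
        "speaker_a": (np.array([1.0]), np.zeros((1, dim)), np.ones((1, dim))),
        "speaker_b": (np.array([1.0]), np.full((1, dim), 3.0), np.ones((1, dim))),
    }
    # Toy utterance whose frames lie close to speaker_b's mean.
    utterance = rng.normal(3.0, 0.5, size=(50, dim))
    best, scores = identify_speaker(utterance, speaker_gmms)
    print(best, scores)
```

Averaging the log-likelihood over frames keeps scores comparable across utterances of different lengths, which matters for short command utterances; other normalizations would be equally plausible.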

CC BY-NC-ND 4.0

Paper citation in several formats:
Fujimura, H.; Ding, N.; Hayakawa, D. and Kagoshima, T. (2020). Simultaneous Flexible Keyword Detection and Text-dependent Speaker Recognition for Low-resource Devices. In Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-397-1; ISSN 2184-4313, SciTePress, pages 297-307. DOI: 10.5220/0008903202970307

@conference{icpram20,
author={Hiroshi Fujimura and Ning Ding and Daichi Hayakawa and Takehiko Kagoshima},
title={Simultaneous Flexible Keyword Detection and Text-dependent Speaker Recognition for Low-resource Devices},
booktitle={Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2020},
pages={297-307},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008903202970307},
isbn={978-989-758-397-1},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Simultaneous Flexible Keyword Detection and Text-dependent Speaker Recognition for Low-resource Devices
SN - 978-989-758-397-1
IS - 2184-4313
AU - Fujimura, H.
AU - Ding, N.
AU - Hayakawa, D.
AU - Kagoshima, T.
PY - 2020
SP - 297
EP - 307
DO - 10.5220/0008903202970307
PB - SciTePress