Authors: Amit Meghanani and A. G. Ramakrishnan

Affiliation: Indian Institute of Science, Bangalore, India

Keyword(s): Pitch-synchronous, DCT, MFCC, Speaker Identification, Speaker Verification.

Abstract: We propose a feature called pitch-synchronous discrete cosine transform (PS-DCT), derived from the voiced part of the speech, for speaker identification (SID) and verification (SV) tasks. PS-DCT features are derived from the ‘time-domain, quasi-stationary waveform shape’ of the voiced sounds. We test our PS-DCT feature on the TIMIT, Mandarin and YOHO datasets. On TIMIT with 168 speakers and Mandarin with 855 speakers, we obtain SID accuracies of 99.4% and 96.1%, respectively, using a Gaussian mixture model-based classifier. In the i-vector-based SV framework, fusing the ‘PS-DCT based system’ with the ‘MFCC-based system’ at the score level reduces the equal error rate (EER) for both YOHO and Mandarin datasets. In the case of limited test data and session variabilities, we obtain a significant reduction in EER, up to 5.8% (for test data of duration < 3 sec).
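
The abstract mentions score-level fusion of two verification systems and reports results as equal error rate (EER). As a rough illustration only, the sketch below shows one common way these two ideas can be computed. It is not the authors' implementation: the weighted-sum fusion rule, the `weight` parameter, and the function names `equal_error_rate` and `fuse_scores` are assumptions made for this example, and scores from the two systems are assumed to be already normalized to a comparable range.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine:  scores of target (same-speaker) trials
    impostor: scores of non-target (different-speaker) trials
    Higher scores are assumed to mean "more likely the same speaker".
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # False rejection rate: genuine trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: impostor trials scoring at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # The EER lies where the two error rates are closest to each other.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted-sum score-level fusion (assumed rule; `weight` is hypothetical)."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b
```

In such a setup, one would typically fuse the per-trial scores of the two systems with `fuse_scores` and then compare `equal_error_rate` on the fused scores against the value for each system alone; the fusion weight would normally be tuned on held-out data.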

CC BY-NC-ND 4.0

Paper citation in several formats:
Meghanani, A. and Ramakrishnan, A. (2020). Pitch-synchronous Discrete Cosine Transform Features for Speaker Identification and Verification. In Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-397-1; ISSN 2184-4313, SciTePress, pages 395-401. DOI: 10.5220/0008911503950401

@conference{icpram20,
author={Amit Meghanani and A. G. Ramakrishnan},
title={Pitch-synchronous Discrete Cosine Transform Features for Speaker Identification and Verification},
booktitle={Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2020},
pages={395-401},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008911503950401},
isbn={978-989-758-397-1},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Pitch-synchronous Discrete Cosine Transform Features for Speaker Identification and Verification
SN - 978-989-758-397-1
IS - 2184-4313
AU - Meghanani, A.
AU - Ramakrishnan, A.
PY - 2020
SP - 395
EP - 401
DO - 10.5220/0008911503950401
PB - SciTePress