Length of Phonemes in a Context of their Positions in Polish Sentences

Magdalena Igras, Bartosz Ziółko, Mariusz Ziółko

2013

Abstract

The paper presents statistical phonetic data for Polish collected from a corpus. Phoneme lengths range from 5 ms to 670 ms. Average durations of Polish phonemes are reported, together with a notable anomaly that is the main topic of the paper: phonemes at the ends of sentences are longer. This observation can be used in speech recognition for automatic insertion of full stops and for sentence modelling. Data from 45 speakers, 5130 sentences in total, are described and compared with values taken from the phonetic literature.
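The comparison described in the abstract, average phoneme duration at the ends of sentences versus elsewhere, can be illustrated with a minimal sketch. It is not the authors' procedure: the record layout, the function names `mean_durations` and `lengthening_ratio`, and the duration values in `SEGMENTS` are all assumptions made up for illustration and are not taken from the corpus.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (phoneme label, duration in ms, sentence-final?).
# The values are invented for illustration, not measurements from the paper.
SEGMENTS = [
    ("a", 80.0, False),
    ("a", 120.0, True),
    ("t", 40.0, False),
    ("t", 60.0, True),
    ("a", 100.0, False),
]


def mean_durations(segments):
    """Return {phoneme: {"final": mean ms or None, "other": mean ms or None}}."""
    buckets = defaultdict(lambda: {"final": [], "other": []})
    for phoneme, duration_ms, is_final in segments:
        key = "final" if is_final else "other"
        buckets[phoneme][key].append(duration_ms)
    return {
        phoneme: {key: (mean(vals) if vals else None) for key, vals in groups.items()}
        for phoneme, groups in buckets.items()
    }


def lengthening_ratio(stats):
    """Ratio of sentence-final mean to non-final mean, per phoneme.

    Phonemes lacking data in either position are skipped.
    """
    return {
        phoneme: groups["final"] / groups["other"]
        for phoneme, groups in stats.items()
        if groups["final"] is not None and groups["other"]
    }


if __name__ == "__main__":
    stats = mean_durations(SEGMENTS)
    for phoneme, ratio in sorted(lengthening_ratio(stats).items()):
        print(f"{phoneme}: final/other duration ratio = {ratio:.2f}")
```

A ratio above 1 for a phoneme would indicate lengthening in sentence-final position, which is the kind of cue the abstract proposes for placing sentence boundaries.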



Paper Citation


in Harvard Style

Igras M., Ziółko B. and Ziółko M. (2013). Length of Phonemes in a Context of their Positions in Polish Sentences. In Proceedings of the 10th International Conference on Signal Processing and Multimedia Applications and 10th International Conference on Wireless Information Networks and Systems - Volume 1: SIGMAP, (ICETE 2013) ISBN 978-989-8565-74-7, pages 59-64. DOI: 10.5220/0004503500590064


in Bibtex Style

@conference{sigmap13,
author={Magdalena Igras and Bartosz Ziółko and Mariusz Ziółko},
title={Length of Phonemes in a Context of their Positions in Polish Sentences},
booktitle={Proceedings of the 10th International Conference on Signal Processing and Multimedia Applications and 10th International Conference on Wireless Information Networks and Systems - Volume 1: SIGMAP, (ICETE 2013)},
year={2013},
pages={59-64},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004503500590064},
isbn={978-989-8565-74-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Signal Processing and Multimedia Applications and 10th International Conference on Wireless Information Networks and Systems - Volume 1: SIGMAP, (ICETE 2013)
TI - Length of Phonemes in a Context of their Positions in Polish Sentences
SN - 978-989-8565-74-7
AU - Igras M.
AU - Ziółko B.
AU - Ziółko M.
PY - 2013
SP - 59
EP - 64
DO - 10.5220/0004503500590064