USER INTERFACE DESIGN FOR VOICE CONTROL SYSTEMS

Wolfgang Tschirk

2004

Abstract

A voice control system converts spoken commands into control actions, a process that is always imperfect because the speech recognizer makes errors. Most speech recognition research focuses on reducing recognizer error rates; comparatively little effort has gone into finding interface designs that optimize the overall system for a fixed level of recognizer performance. Evaluating such designs before they are implemented and tested requires three components: 1) an appropriate set of performance figures for the speech recognizer, 2) suitable performance criteria for the user interface, and 3) a mathematical framework for estimating interface performance from recognizer performance. In this paper, we identify four basic interface designs and propose an analytical approach for predicting their respective performance.



Paper Citation


in Harvard Style

Tschirk W. (2004). USER INTERFACE DESIGN FOR VOICE CONTROL SYSTEMS. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS, ISBN 972-8865-00-7, pages 21-26. DOI: 10.5220/0002602100210026


in Bibtex Style

@conference{iceis04,
author={Wolfgang Tschirk},
title={USER INTERFACE DESIGN FOR VOICE CONTROL SYSTEMS},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS},
year={2004},
pages={21-26},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002602100210026},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 5: ICEIS
TI - USER INTERFACE DESIGN FOR VOICE CONTROL SYSTEMS
SN - 972-8865-00-7
AU - Tschirk W.
PY - 2004
SP - 21
EP - 26
DO - 10.5220/0002602100210026