RAINBOW BRIDGE

Training Center based on Voice Technology for People with Physical Disabilities

Jan Nouza, Petr Cerva and Josef Chaloupka

Institute of Information Technology and Electronics, Technical University of Liberec, Studentska 2, Liberec, Czechia

Keywords: Voice technology, Motor disabled persons, Speech recognition, Hands-free control.

Abstract: The Rainbow Bridge is an EU-supported project that aims to teach Czech people with physical disabilities

to benefit from voice technology products in their lives and prospective jobs. All products - voice control

tools and dictation programs - have been developed in our lab during the last decade. At this stage, the main

goal is to teach the target group how to use PCs in a hands-free manner (though some have no previous

experience with computers), how to adapt the tools to their specific speech abilities and how to acquire the

skills that can be useful for their potential employers. One of the lecturers is a quadriplegic person who

herself has been using our voice products for almost 5 years.

1 INTRODUCTION

Voice technology has a great potential as an

alternative means for e-inclusion of people with

physical disabilities. This was demonstrated already

in the early years of computer speech processing

(Noyes et al., 1989) and it is even more evident now

when operating systems with embedded speech

input and output have become widely available. At

the end of the first decade of the 21st century, voice

is considered an optional and complementary

means of human-computer interaction.

However, there is a big difference between

common PC users and those with physical

disabilities (Hawley et al., 2005). For people whose

impairment has not allowed them to use computers

so far, it is not easy to start now and learn by

themselves. Recent SW applications utilize

sophisticated (mostly graphically oriented) user

interfaces where the mouse and keyboard are the

primary tools for interaction. When voice commands

are available, they often mimic the actions of these

standard tools. This is natural for the majority of the

users but difficult to understand for the people who

have never worked with computers before because

their impairment has not allowed it. Should they

learn the computer basics in the same way as the

other users? Or should we provide them with special

courses where the alternative interaction tools

eligible for them, such as voice input and output,

would be presented as the primary ones? These and

several other questions were raised when we were

asked to help in preparing a project of a training

center aimed at physically disabled people who want

to start with computers.

The need for such a training center became

urgent when voice recognition programs designed

for Czech motor-handicapped people appeared on

the market. In 2005 it was MyVoice - a voice-

controlled interface for PCs (Nouza et al., 2005), in

2007 MyDictate - a very-large-vocabulary discrete

dictation tool (Cerva and Nouza, 2007), and in 2008,

NewtonDictate - a program for fluent dictation with

300K+ word lexicon, all developed in our research

team. Only a small part of the potential users have

been able to learn to use them effectively. It was

mainly the people who had already worked with

computers before they suffered some sort of motor

impairment. Some others succeeded because they

found helpful assistants within their families or

friends. But many others just gave up after a short

initial trial that did not bring immediate success due

to various reasons, such as disordered speech, wrong

usage or little patience. And last but not least, many

potential users have not yet learned about the tools.

In 2009, the project called Rainbow Bridge was

prepared and submitted to the EU-supported

Operational Program Human Resources and

Employment. Its goal was to build a training

center in Prague where people from the target group

could learn about the available technology, try it

under the supervision of professional teachers,


Nouza J., Cerva P. and Chaloupka J..

RAINBOW BRIDGE - Training Center based on Voice Technology for People with Physical Disabilities.

DOI: 10.5220/0003147905290533

In Proceedings of the International Conference on Health Informatics (HEALTHINF-2011), pages 529-533

ISBN: 978-989-8425-34-8

© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

receive individual care and assistance, and also share

their experience with the others. One of the top

priorities was to give at least some of the trainees a

chance to get a job where they could employ their

new skills. Eventually, the project was approved and

the center launched its activities in 2010.

In the following sections, we first describe the

set of tools developed for the clients of the center,

then we present the teaching methods and finally we

discuss the experience gained during the preparation

of the courses.

2 VOICE CONTROLLED TOOLS

Here we briefly present the individual tools that

have been developed for and are being used in the

training center.

2.1 Voice Interface for PC

The MyVoice program (Nouza et al., 2005) serves as an

interface between commands spoken to the

microphone and standard applications running on a

PC. It either complements or (in most cases)

replaces the keyboard and mouse, as shown in Fig. 1.

Figure 1: Voice control of PC applications - basic scheme.

MyVoice is mostly used as a tool that launches

and controls applications, enables typing (of letters,

frequent words and phrases), and provides access to

internet services. In fact, it allows for virtually any

activity that is normally performed via a keyboard

and mouse. Compared to the first release in 2005,

the latest version includes some new features. Let us

mention at least the improved navigation on web

pages, either by adding numbers to the web links or

by the introduction of a target-like grid that

simplifies mouse cursor movements. It is also

possible to hide MyVoice’s window, which may be

useful, for example, when watching video in the

full-screen mode.

Figure 2: A snapshot of voice controlled internet browser.
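The link-numbering idea described above can be illustrated with a minimal Python sketch. It is not the MyVoice implementation: the Link type, the function names and the example URLs are hypothetical and only show how a recognized number could be mapped to a web link.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Link:
    text: str
    url: str


def number_links(links: List[Link]) -> Dict[int, Link]:
    """Assign 1-based numbers to links in page order (assumed scheme)."""
    return {number: link for number, link in enumerate(links, start=1)}


def resolve_spoken_number(numbered: Dict[int, Link], spoken: int) -> Optional[str]:
    """Return the URL for a recognized number, or None if no link has it."""
    link = numbered.get(spoken)
    return link.url if link is not None else None


# Hypothetical page content used only for illustration.
links = [
    Link("News", "http://example.org/news"),
    Link("Mail", "http://example.org/mail"),
]
numbered = number_links(links)
```

With such a table, saying a short number replaces a sequence of cursor movements, which is why numbering tends to suit users who cannot point with a mouse.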

2.2 Voice Interface for Home Devices

The MyVoice tool can be further expanded to allow

for controlling common home devices, such as

lights, electrical heating or air conditioning, window

curtains, doors, or audio and video equipment.

The extension module includes a hardware unit

that contains two types of transmitters: one whose

signal is compatible with wireless relays used to

power home electric appliances, and an IR

transmitter to control a TV set, stereo, video and a

satellite receiver. The HW unit is linked to a PC via

USB connection. Not only the devices, but also the

user can communicate with the control program in a

wireless way. Our tests showed that even cheap

Bluetooth microphones, either head-worn or lapel

ones, work well with MyVoice.
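The division of labor between the two transmitter types can be sketched as a simple routing table. This is an illustrative Python sketch only: the command phrases, device names and the route function are assumptions, not the actual protocol of the extension module.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Transmitter(Enum):
    RF_RELAY = "rf_relay"  # wireless relays that power home electric appliances
    IR = "ir"              # infrared link for TV set, stereo, video, satellite receiver


# Hypothetical command table: spoken phrase -> (transmitter, device, action).
COMMANDS: Dict[str, Tuple[Transmitter, str, str]] = {
    "light on": (Transmitter.RF_RELAY, "light", "on"),
    "light off": (Transmitter.RF_RELAY, "light", "off"),
    "heating on": (Transmitter.RF_RELAY, "heating", "on"),
    "tv volume up": (Transmitter.IR, "tv", "volume_up"),
    "tv next channel": (Transmitter.IR, "tv", "channel_up"),
}


def route(phrase: str) -> Optional[Dict[str, object]]:
    """Map a recognized phrase to the transmitter that should send it."""
    entry = COMMANDS.get(phrase.strip().lower())
    if entry is None:
        return None  # unknown command: nothing is transmitted
    transmitter, device, action = entry
    return {"transmitter": transmitter, "device": device, "action": action}
```

Keeping the table separate from the routing logic means new appliances could be added without touching the recognizer, which is one plausible reason for a modular hardware unit.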

In 2007 we built a demonstration room where all

common home devices were controlled via the

MyVoice and wireless links (Nouza et al., 2007). Its

diagram is shown in Fig. 3. The same arrangement

is used in the training center in Prague.

Figure 3: Voice control extended to home devices.

2.3 Hands-free Dictation Programs

During the last 5 years we have developed two types

of dictation programs, one for discrete speech input,

another for natural fluent speech.


2.3.1 Discrete Dictation

The program for discrete speech input is called

MyDictate. Its design has been fully determined by

the target group, i.e. by the people who cannot use

their hands. Therefore, not only the dictation itself

but also all supporting activities, like editing, error

correction, formatting, or lexicon maintenance had

to be designed as hands-free operations. The

program is described in detail in (Cerva and Nouza,

2007).

The SW is distributed with a general purpose

lexicon containing about 550 thousand words. The

employed technology allows the program to run

even on recent low-cost PCs. When a word is

uttered, the recognizer outputs an ordered list of the 10

best candidates, taking into account the acoustic as

well as the language model score. The word with the

best score is automatically added to the dictated text

while the next 9 candidates appear on the list shown

in MyDictate’s window. In case of lexical ambiguity

or when a minor recognition error occurs, the user

can take another candidate from this list and replace

the wrongly typed item. There are about 100 other

control commands that can be used, e.g., to delete the

last character(s), word(s), or sentence(s), to select a

part of text, to work with the clipboard, to move the

cursor, to spell the individual letters or to toggle

their (lower or upper) case. The basic vocabulary

ensures a coverage rate of about 99 % for common Czech

texts. If a dictated word is not in the lexicon, the user

can add it by voice during the dictation session.
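The handling of the candidate list described above can be sketched in a few lines of Python. This is a simplified illustration, not the MyDictate code: the additive combination of log-domain scores, the lm_weight parameter, the function names and the score values are all assumptions. The example uses the Czech homophones "byl" and "bil", a typical case of lexical ambiguity.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    word: str
    acoustic_score: float  # assumed log-domain score, higher is better
    lm_score: float        # assumed log-domain language model score


def rank_candidates(candidates: List[Candidate],
                    lm_weight: float = 1.0,
                    n_best: int = 10) -> List[str]:
    """Order candidates by combined score and keep the n best words."""
    ranked = sorted(
        candidates,
        key=lambda c: c.acoustic_score + lm_weight * c.lm_score,
        reverse=True,
    )
    return [c.word for c in ranked[:n_best]]


def accept_best(text: List[str], n_best: List[str]) -> Tuple[List[str], List[str]]:
    """Append the top word to the text; the rest stay available as alternatives."""
    return text + [n_best[0]], n_best[1:]


def replace_last_word(text: List[str], alternatives: List[str], choice: int) -> List[str]:
    """Swap the last dictated word for an alternative picked from the list."""
    return text[:-1] + [alternatives[choice]]


# Hypothetical scores: "bil" and "byl" sound the same, so only the LM separates them.
candidates = [
    Candidate("bil", -10.0, -6.0),
    Candidate("byl", -10.0, -2.0),
    Candidate("být", -14.0, -3.0),
]
n_best = rank_candidates(candidates)
text, alternatives = accept_best([], n_best)
corrected = replace_last_word(text, alternatives, 0)
```

Here the language model prefers "byl", which is typed automatically, while the user can still pick "bil" from the list by voice.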

As the program is aimed particularly at people

with physical disabilities, it must be able to cope

with less standard pronunciation. If this is the case,

the user can employ the embedded speaker

adaptation module. It prompts him/her to say 300

phonetically balanced words. For most users, it helps

to reduce the recognition error rate by up to 25 % in relative terms.
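A relative reduction is easy to misread as an absolute one, so a small worked example may help. The 20 % baseline below is a made-up figure for illustration, not a measured result; only the 25 % relative reduction comes from the text.

```python
def error_after_adaptation(baseline_error_pct: float,
                           relative_reduction: float) -> float:
    """Error rate (in %) left after a relative reduction, e.g. 0.25 for 25 %."""
    return baseline_error_pct * (1.0 - relative_reduction)


# Hypothetical baseline of 20 % errors with the maximum reported reduction.
adapted = error_after_adaptation(20.0, 0.25)
```

Under this assumption the error rate would drop from 20 % to 15 %, that is, by 5 percentage points rather than by 25.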

2.3.2 Fluent Dictation

The software for fluent dictation was developed

in collaboration with the Newton Technology

company, which distributes it under the name

NewtonDictate. This software is aimed at the general

public and at professionals such as lawyers, doctors and

people from the media domain. It comes with

several types of lexicons. The general-purpose one is

the largest and currently contains 500K words. The

profession-oriented lexicons are smaller (320K for

lawyers and 140K for medicine) with domain-specific

language models.

Although this program was not originally

designed for hands-free use, it has been considered

for the lectures and training sessions in the center.

One reason is that some of the clients showed their

interest in learning this software and exploiting it in

their prospective jobs. It was mainly those people

whose physical disability does not affect their

speech and who can use their hands at least to some

extent. For the other clients, integration with the

MyVoice is being prepared. It is true, however, that

many persons with disabled hands prefer discrete

dictation to fluent dictation. In particular, they appreciate

that a) they can compose the text at their own

pace (while fluent speech technology requires

a more or less continuous flow of words), b) they can

correct or modify the input text immediately, c) they

feel that the isolated-word decoder is more

robust to speaker-produced and background noises

as well as to hesitation sounds, and, last but not least,

d) they can easily add new words to the lexicon. It

should also be noted that in Czech - because of its

rich and complex morphology - many word-forms

differ only in one or two characters, which means

that they sound very similar and can be easily

confused, particularly in fluent speech. For the users,

correcting these small errors spread within a fluently

input text is a frustrating task if it must be done in

a hands-free manner. Similar observations were

also reported in (Hawley et al., 2005).

3 TRAINING CENTER

The Rainbow Bridge project has been supported by

a 200K Euro grant, of which one half has been

spent on building the training center and the other

will cover the running costs of a three-year pilot

operation. The center is situated in Prague in a place

with good access by public and private transport.

3.1 Project Goals

The main goals of the project are:

Promotion of the voice technology among those

people with physical disabilities who can use it as an

alternative means of interaction with computers.

Creation of a pilot training center with certified

teaching methods (which can be later replicated on

regional levels).

Teaching the basic computer skills to the people

whose disability has never allowed them to work

with PCs.

Giving at least some of the trainees a chance to

employ PCs together with the voice technology in

their prospective jobs, e.g. in call and help centers,

in voice-scanning of documents, in re-speaking


tasks, etc.

3.2 Trainees

A national survey (CSO, 2008) estimates that the

people with physical disabilities make up about 5 %

of the total population in the Czech Republic. In the

Prague region, this means circa 50,000 persons,

of whom about 5,000 suffer from complete or

partial hand paralysis. This pilot project can offer a

chance to 50 of them.

Because of this limited number, the selection of

the course participants is done so that people with

different levels of motor impairment, different

computer skills and different voice ability can

participate in multiple course runs. Allowing this

rather wide spectrum of clients gives this pilot

project a good opportunity to develop and verify

various general and specific training methods.

3.3 Teaching Staff

The teaching and project supporting staff consists of

several professionals: a teacher specializing in the

education of disabled people, a teacher familiar with

ICT and voice technology, a psychologist, a voice

therapist, and a lecturer who himself or herself uses

the voice technology tools in his/her daily living.

During the first year, the last of these positions was

offered to a quadriplegic woman who has been using

the MyVoice and MyDictate software since their

first dates of release. She has utilized them for her

high school study (organized from her home and

successfully completed in 2009). Recently, she has

become known also for her blogs published

regularly in a major Czech internet newspaper.

3.4 Equipment

The center consists of one room with 8 work places,

each equipped with a PC and voice technology

software. The room is used for two types of

activities: First, for the lectures when up to 8

trainees can participate and watch the presentations

and demonstrations on their monitors, and second, for

training sessions when no more than 2 students work

on assignments under the supervision of a teacher.

The center also owns 8 laptops with the same

software equipment. These can be lent to the clients

for their home work. The room is equipped with the

remote controlled devices mentioned in section 2.2.

4 TRAINING METHODS

4.1 Pilot Phase of Training Courses

The project started in January 2010 and it will end in

December 2012. Within the first six months, a pilot

phase with 2 trainees is running. One of them is a

person who suffered a serious spinal injury two

years ago but before that he had worked with a PC.

The other person also cannot use her hands but,

unfortunately, she has no prior knowledge

of computers. Moreover, she suffers from moderate

dysarthria.

These two people were the subjects in the

preparatory phase of the courses. The first trainee

needed a very short time to learn how to replace the

keyboard and mouse by voice in the activities he had

been doing before his injury. After 4 weeks of training,

he was able to use the MyVoice and MyDictate

programs very efficiently. Currently, he is learning to

use the fluent dictation tool as well. His

experience and feedback help us in completing the

integration of the dictation and voice command

control tools.

The other person requires much more individual

care. In her case, the MyVoice tool has been tailored

and adapted to her specific speech. The command

groups have been modified so that they

include significantly fewer task-oriented

commands, such as basic cursor movements,

easily pronounceable and distinguishable names of

Czech letters, commands for basic navigation on

web pages, etc.

4.2 Standard Courses

Standard courses start in September 2010. Each

course run has 8 participants and lasts for 3 months.

It consists of a 1-month period of lecturing and basic

training (60 contact hours in the center + 40 hours

homework) and 2 months of individual practicing

(80 contact + 80 homework hours). The practical

sessions are organized so that one supervisor is

available for two trainees.
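The workload implied by these figures can be tallied with a short Python sketch. The hour counts and group sizes are taken from the text; the variable names are ours, and the number of supervised pairs assumes that all participants of a run are split into groups of two.

```python
import math

# Hours per phase as stated for one standard course run.
PHASES = {
    "lecturing and basic training": {"contact": 60, "homework": 40},
    "individual practicing": {"contact": 80, "homework": 80},
}
PARTICIPANTS = 8
TRAINEES_PER_SUPERVISOR = 2

contact_hours = sum(phase["contact"] for phase in PHASES.values())
homework_hours = sum(phase["homework"] for phase in PHASES.values())
total_hours = contact_hours + homework_hours

# Assumption: every participant practices in a supervised pair.
supervised_pairs = math.ceil(PARTICIPANTS / TRAINEES_PER_SUPERVISOR)
```

Each participant thus spends 140 contact hours and 120 homework hours, 260 hours in total, and a full run of 8 participants forms 4 supervised pairs.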

The curriculum includes these main topics:

• introduction to PCs, navigation within the MS Windows platform (alternative use of the desktop, Start button, shortcuts, mouse actions, window actions, etc.),

• using MyVoice instead of a keyboard and mouse, hints for correct and efficient use of voice, optional adaptation to the speaker's voice, practicing and developing voice-command skills on simple applications, such as the Solitaire game, the Calculator, the Paint tool, a music player, etc.,

• text typing and editing within the MS Word program using the tools for discrete and fluent dictation,

• internet and web services, using the tools for web browsing, e-mail and IP telephony communication, social networks.

The topics are introduced by a teacher first and then

presented by the assistant lecturer (usually one of the

previously trained handicapped users) who explains

the practical issues of the voice-command actions

using his or her own previously acquired experience.

4.3 Course Benefits

Each of the course participants will learn how to

exploit voice technology tools for his or her daily

activities, for study, for leisure, for establishing

internet contacts with other people, or for emergency

situations. Moreover, we believe that at least some

of the graduates will be given a chance to use their

new skills in suitable jobs. The center itself will

offer a temporary lecturer job to the best course

participant. There are also provisional agreements

with several institutions in Prague where the best

graduates can be employed, e.g. as assistants in call

and help centers or as re-speakers in companies that

specialize in the transcription of broadcast or meeting

speech.

5 CONCLUSIONS

This project is the first activity of this type organized

on the national level and supported by the European

Union. It gives a new chance to the people who have

not been able to use modern information and

communication technology so far because of their

physical handicap. It is also a great opportunity for

the researchers in the speech technology domain to

see how their products perform in rather challenging

conditions and how these products can be further

improved. If the project succeeds, similar centers

will be built also on regional levels.

ACKNOWLEDGEMENTS

The research reported in this paper was partly

supported by the Czech Science Foundation in

project no. 102/08/0707 and by the EU-supported

CZ.2.17/2.1.00/32644 grant in programme Prague –

Adaptability.

REFERENCES

Noyes, J., Haigh, R., Starr, A., 1989: Automatic speech recognition for disabled people. Applied Ergonomics, Vol. 20, No. 4, pp. 293-298.

Hawley, M.S., Green, P., Enderby, P., Cunningham, S., Moore, R., 2005: Speech technology for e-inclusion of people with physical disabilities and disordered speech. Proc. of Interspeech 2005, Lisboa, pp. 445-448.

Nouza, J., Nouza, T., Cerva, P., 2005: A Multi-Functional Voice-Control Aid for Disabled Persons. Proc. of Specom 2005, Patras, pp. 715-718.

Cerva, P., Nouza, J., 2007: Design and Development of Voice Controlled Aids for Motor-Handicapped Persons. Proc. of Interspeech 2007, Antwerp, pp. 2521-2524.

Nouza, J., Chaloupka, J., Zdansky, J., Silovsky, J., Kroul, M., Mader, Z., 2007: Voice Controlled Center for Homes of Motor-Handicapped Persons. Proc. of Specom 2007, Moscow, pp. 714-719.

Desilets, A., 2001: VoiceGrip: A Tool for Programming-by-Voice. International Journal of Speech Technology, Vol. 4, No. 2, June 2001, pp. 103-116.

CSO (Czech Statistical Office), 2008: Results from survey of disabled persons in 2007. Published by the Czech Statistical Office and available on the web: http://www.czso.cz/csu/2008edicniplan.nsf/t/4100269DD7/$File/330908j3.pdf
