Teaching Drivers about ADAS using Spoken Dialogue: A Wizard of Oz Study

Luka Rukonić¹, Marie-Anne Pungu Mwange² and Suzanne Kieffer¹

¹Institute for Language and Communication, Université catholique de Louvain, Louvain-la-Neuve, Belgium
²AISIN Europe, Braine-L'Alleud, Belgium
Keywords: Interactive Driver Tutoring, Mental Models, Car Voice Assistant, Wizard of Oz, Advanced Driver Assistance.
Abstract: Understanding the limitations and capabilities of advanced driver-assistance systems (ADAS) is a prerequisite for their safe and comfortable use. This paper presents a formative user study on the use of a dialogue-based system, implemented with the Wizard of Oz (WOz) technique, to help drivers learn the correct use of driving assistance. We investigated whether drivers would build a correct mental model of the driving assistance systems through natural language dialogue. We describe the evolution of the prototype over four iterations of formative evaluation with older and younger drivers. Using a mixed-method approach combining the WOz, interviews, questionnaires, and a knowledge quiz, we evaluated the prototype of a voice assistant (VA) and identified the teaching content objectives. Participants' mental models of ADAS were assessed to evaluate the efficacy of the teaching approach. The results show that the teaching goals need to be clearly communicated to drivers to ensure the adoption of the VA.
1 INTRODUCTION
Advanced Driver-Assistance Systems (ADAS) are present in almost every new vehicle sold on the market. According to EU regulations, all new vehicles will need to be equipped with at least level 2 automation according to the SAE taxonomy (SAE, 2018). ADAS increases driving safety and comfort, but only if used correctly and consciously. Some car models offer much more than basic lane-keeping and adaptive cruise control, such as lane departure warning, automatic emergency braking, obstacle and object detection, and driver state monitoring. However, many drivers are not aware of all the systems their car is equipped with, which leads to low usage and adoption rates of these systems. In addition, many drivers do not receive proper training about the availability and capabilities of ADAS when purchasing a new car, and learning about automation systems from user manuals is insufficient to build a correct mental model of them (Boelhouwer et al., 2019). Because of that, drivers learn how to use ADAS by
themselves, on public roads, where they test how the various systems work or when they stop functioning. As a result, driving safety benefits might be forfeited because of wrongly constructed mental models of how ADAS functions (Rossi et al., 2020). Further, a better understanding of the limitations and boundaries of driving automation systems improves drivers' trust calibration, which is crucial to their safe use (Walker et al., 2018).
This study primarily focused on drivers above 50, who are underrepresented in research studies (Young et al., 2017) but highly interested in the benefits of ADAS, despite concerns about security issues, system failures, or hacking attacks (Schmargendorf et al., 2018). Specifically, we investigated the use of an interactive driver tutoring system with a voice assistant (VA) to teach drivers about the correct use of ADAS. Using a Wizard of Oz (WOz) setup simulating the car-driver dialogue, we explored how drivers would interact with a VA using natural language. We focused on teaching drivers about the capabilities and limitations of the Lane Keeping Assistant (LKA) and Adaptive Cruise Control (ACC) while driving on the highway. We collected data using a high-fidelity simulator platform and combined subjective, self-reported data with interviews to evaluate the user experience (UX) with the prototype.
The specific objectives of this study were to: (1) explore how a voice assistant could facilitate learning about ADAS while driving; (2) evaluate drivers' trust in ADAS; (3) evaluate drivers' mental model of ADAS.
2 RELATED WORK
Research on the use of voice interfaces in cars is plentiful. Politis et al. (2018) evaluated different types of dialogue-based systems to mediate the takeover in automated driving by increasing the situational awareness of the driver. They found that an interface without additional spoken information was most accepted, followed by an interface where drivers had to respond to questions asked by the system about the driving situation, such as hazards, current lane, speed, or fuel level. Schmidt et al. (2020) evaluated a proactive and reactive voice assistant in use cases such as navigating, refueling, and checking the news. Proactivity was generally well accepted, but not for navigation use cases. Voice alerts before automated braking have a positive impact on drivers' anxiety, alertness, and sense of control (Koo et al., 2016). Other applications include using voice to guide a takeover from automated to manual driving (Kasuga et al., 2020), or as an auditory reliability display to support and relax drivers at high levels of automation (Frison et al., 2019).
2.1 Driver Tutoring
Merriman et al. (2021) identified driver training as a solution to four challenges associated with the use of automated vehicles: (1) drivers have a poor mental model of the automation's functions, capabilities, and limitations; (2) automation reduces the driver's cognitive workload, thus decreasing the driver's attention to the road; (3) over-reliance on automation reduces the driver's self-confidence to drive well manually without assistance; (4) the driver's procedural skills to drive manually degrade. Our study focuses on the first challenge by addressing the alignment of the driver's mental model with the capabilities and limitations of the driving assistance systems. Our prototype serves as the first step towards voice-guided teaching while driving.
Forster et al. (2019; 2020) evaluated the effect of user education using owner's manuals and interactive tutorials on drivers' mental model formation, satisfaction, understanding, and performance with automated systems. Both approaches led to superior operator behavior compared to a generic baseline; however, no difference in satisfaction was found. Boelhouwer et al. (2019) found that informing drivers about the car's automation capabilities using the owner's manual was not sufficient to build accurate mental models of driving automation. In addition, existing systems for in-car driver tutoring rely on an AR overlay on the windscreen and auditory explanations of ADAS functionalities in low-complexity driving situations (Boelhouwer et al., 2020).
Furthermore, Rossi et al. (2020) showed that people who built their mental model of ADAS by reading a description of it used ADAS more effectively and safely than those who built their mental model on their own. Past research calls for a structured approach to user education about driving automation that relies on guided learning and building declarative knowledge (Forster et al., 2019). Building a correct and accurate mental model of driving automation and driving assistance technologies is crucial for using them appropriately. Mental models represent what people know and believe about the system they use (Payne, 2012). This approach focuses on the contents of the user's mind rather than on explaining the cognitive processes that build representations of systems. The knowledge people possess can be used to explain their behavior with the system in question.
2.2 Trust
Overtrust in ADAS can lead to decreased driving safety and comfort; it is caused by a lack of knowledge about the system's boundaries and capabilities. Walker et al. (2018) found that drivers tend to overtrust a Level 2 vehicle's automation capabilities before trying them in real-life situations. After on-road experience, drivers understood better what the capabilities of the ADAS were, which led to a decrease in trust in 7 out of 12 scenarios. According to Hoff and Bashir (2015), trust is a multidimensional concept consisting of dispositional, situated, and learned trust, and preconceptions about the system play an important role in forming it. A study by Larasati et al. (2020) investigated the effect of four textual explanation styles on users' trust in AI medical diagnostic systems; thorough and contrastive styles were rated the highest. Proactive dialogue strategies, in which the system acts before problematic situations happen, have been found to induce higher levels of trust when interacting with intelligent assistants such as robots (Kraus et al., 2020). We used this finding when designing our proactive voice assistant for driver tutoring.
Figure 1: Lane configuration and road map used in the study. (a) Participants drove on lane L3 and used lane L2 to avoid obstacles. (b) Route indicated in blue on the loop highway in CARLA.
3 USER STUDY
3.1 Iterative Design
We adopted a formative test-and-refine approach to the design of the teaching assistant, similar to Rukonić et al. (2021). We conducted four iterations with a small sample of participants to evaluate the prototypes. Each new iteration brought improvements to the prototype and the teaching content. This study does not elaborate in detail on the results and findings of each iteration, but rather provides an overview of what we learned and how we applied it to the incremental development of the prototype. Thus, after a brief description of each iteration, we focus on the main qualitative findings and lessons learned. Between iterations, the user profile of participants varied, as did the data we collected, to accommodate the investigation of aspects we found relevant based on the findings of the previous iteration. We aimed at collecting qualitative data rather than investigating systematic differences between conditions through experimental manipulation. WOz is a great support in an iterative design process that leads the implementation from early phases toward the design of the final version (Dow et al., 2005).
Iteration 1 required participants to drive two different driving scenarios: a baseline scenario (scenario B) and a scenario with a voice assistant (scenario VA). All participants drove both scenarios. First, we ran scenario B to establish baseline driver behavior and to investigate participants' use of the available ADAS. Then, we ran scenario VA to check whether the VA actually supported drivers in learning about ADAS. We designed a use case in which people drove on a highway and were strongly hinted to use driving assistance.

In scenario B, no advice was given to the participants, so that they would build a mental model of how to operate the LKA and ACC on their own, i.e., without voice assistance. They drove five laps on the highway (Figure 1b), which lasted around 10 minutes. Drivers encountered two obstacles, both necessitating a takeover. The car detected obstacle #1 (a truck), which triggered the emergency braking if they did not react. In contrast, the car did not detect obstacle #2 (a container on the road), which led to a collision if they did not perform an emergency maneuver (Figure 1a). In scenario VA, we used two language styles (expert vs. helper) to write the advice for each type of road event. These two styles were modeled after the explanation goals presented by Sørmo et al. (2005), who described four explanation goals for case-based reasoning, of which we selected two and applied them to the design of our advice content. The two explanation styles are based on the Transparency goal (expert style) and the Learning goal (helping style): the expert style aims at explaining how the system reached the answer, while the helping style aims at teaching the user about the ADAS domain. In the VA condition, roadworks and a police car were used as obstacles: the police car triggered a reaction similar to the truck in scenario B, and a sign announcing the roadworks took the place of the container from the previous use case.
Iteration 2 improved the reactivity and proactivity of the teaching assistant for ADAS; in addition, the steering wheel now turned while LKA was on. The goal was to explore users' attitude towards a VA that guides them through each ADAS feature. Two scenarios were designed: Scenario 1 aimed at drivers learning about the ADAS systems, while Scenario 2 aimed at putting their knowledge to the test by having them make the right decisions when avoiding obstacles and use safety systems such as forward collision warning (FCW) and automatic emergency braking (AEB). We removed the baseline scenario and made the VA available in both scenarios. From iteration 1 we learned that participants first needed to become familiar with the VA itself, instead of having to use it immediately in Scenario 1. Therefore, before Scenario 1, we demonstrated the VA by guiding participants through a short conversation with it to build an understanding of how it worked. This way, their efforts were focused less on learning how to use the VA and more on learning about ADAS. Moreover, we decided to write the teaching content using only one language style, since we could not evaluate style effects with our small sample size.
Iteration 3 focused on the transition between autonomous and manual driving modes, more precisely on preparing drivers to take over and hand over control. These events took place when getting on or leaving the highway, e.g., to change direction or take the cloverleaf interchange roads (Fig. 1b). ADAS was available only on the highway loop and only when there were no obstacles ahead; it was not available on side roads, interchange loops, or when an obstacle was announced. The VA content was written with these situations in mind. Scenario 1 covered the transition use case and the availability of autonomous driving depending on the road type. Scenario 2 focused on the interaction with and use of ADAS while the driver was performing a secondary task, also called a non-driving-related task (NDRT), i.e., making a phone call. We told participants to use the VA to learn about the capabilities and limitations of ADAS, the traffic situation, and the takeover procedure. We used NDRTs to put the driver out of the loop (OOTL), i.e., in situations where drivers are not monitoring the driving situation, but may or may not be physically controlling the car (Merat et al., 2019). Being OOTL causes a loss of situational awareness and compromises drivers' monitoring and takeover capabilities (Merat et al., 2019). We aimed to explore the potential role of the VA in helping maintain driving safety and situational awareness while the driver is making a phone call. We hypothesized that the low situational awareness caused by the NDRT would naturally motivate drivers to use the VA to rebuild it, since research in UX tells us that when people have a goal in mind, they use the appropriate technology to achieve it (Hassenzahl, 2018).
Iteration 4 required participants to execute a series of tasks with the VA. Scenario 1 focused on learning about ADAS. The tasks were presented on a tablet placed on the participants' left-hand side, next to the steering wheel. We allocated a few minutes before the scenario started for participants to read the tasks thoroughly and get familiar with them; they could ask us for clarifications if needed. All tasks were presented on the screen at the same time. We told participants that they were free to choose when to complete each task and in what order; however, we suggested following the order of tasks as shown on the screen. Scenario 2 was designed to collect interaction data with the VA in obstacle avoidance situations.
3.2 Teaching Content Design
The process of learning is divided into three stages: acquiring declarative and procedural knowledge, consolidating the acquired knowledge, and tuning the knowledge towards overlearning (Kim et al., 2013). We designed the teaching content of our study with the aim of building drivers' declarative knowledge. Declarative knowledge covers the facts and things people know about a system, which in our case were ADAS such as ACC and LKA. Conversely, procedural knowledge explains how to solve a problem or achieve a task using a system (Forster et al., 2019; Kim et al., 2013). However, declarative knowledge is prone to fast decay when used infrequently. Knowing this limitation, we believed that acquiring declarative knowledge while practicing the use of ADAS in parallel might help build procedural knowledge, which is far more resistant to decay. To that end, we measured drivers' learning and retention of declarative knowledge about ADAS in this study.
3.3 Participants
Each iteration involved six participants holding a valid driver's license. All participants were given an incentive for taking part in the study. Iterations 1-3 involved only people above 50 years old who were active users of ADAS, including ACC and LKA, as 50+ drivers were the focus of the research project we were working on. Iteration 4 involved both younger and older drivers with no prior experience with ADAS, as a VA might be more useful to drivers who are beginning to use it; drivers with previous ADAS experience might have built a mental model that is hard to change through the acquisition of declarative knowledge about ADAS and its related features. We considered an older driver to be a person above 50 holding their driver's license for more than 20 years, and a younger driver to be a person between 22 and 30 years old holding their driver's license for at most three years.
Figure 2: Simulator setup and head-up display (HUD) icons. (a) Fixed-base simulator setup. (b) HUD automation status icon and speed indicator.
3.4 Materials and Apparatus
The study was conducted using a fixed-base driving simulator. It consisted of three 50-inch screens covering a 180-degree field of view, a real adjustable car seat, a Fanatec steering wheel with force feedback, gas, brake, and clutch pedals, and a gear lever (Fig. 2a). A high-fidelity sound system was installed to produce realistic and immersive sound, together with a low-frequency speaker placed under the seat to simulate car vibrations. We used the open-source CARLA simulator software (Dosovitskiy et al., 2017), which we adapted to our needs by enabling manual and automated driving. A screen showing the navigation map was placed on the right-hand side of the driver. The driving scenarios were executed in a closed-loop highway environment depicted in Figure 1b. The simulated LKA and ACC were implemented slightly differently from real cars: the LKA worked on only one of the four highway lanes, and the minimum speed to activate the ACC was 50 km/h. To turn on the driving assistance, the driver had to press a button on the steering wheel; two further buttons on the steering wheel increased or decreased the ACC set speed by 5 km/h. In sum, the ADAS did not replicate systems intended for road use, but only resembled them.
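As an illustration, the simplified assistance logic can be captured in a few lines. The following Python sketch is ours, not the study's implementation; the lane identifier, the engagement conditions, and the clamping of the set speed are assumptions made for the example, while the 50 km/h threshold, the single activation button, and the 5 km/h steps come from the description above.

```python
from dataclasses import dataclass

@dataclass
class AssistState:
    engaged: bool = False
    set_speed_kmh: float = 0.0

class SimplifiedADAS:
    """Toy model of the study's simplified assistance: LKA and ACC engage
    together via one steering-wheel button (they could not be activated
    separately), ACC requires at least 50 km/h, and the set speed is
    adjusted in 5 km/h steps."""
    MIN_ACTIVATION_KMH = 50.0
    SPEED_STEP_KMH = 5.0
    SUPPORTED_LANE = 3  # hypothetical lane id: LKA worked on one lane only

    def __init__(self) -> None:
        self.state = AssistState()

    def press_activation_button(self, speed_kmh: float, lane: int) -> bool:
        # Engage only above the minimum speed and on the supported lane.
        if speed_kmh >= self.MIN_ACTIVATION_KMH and lane == self.SUPPORTED_LANE:
            self.state = AssistState(engaged=True, set_speed_kmh=speed_kmh)
        return self.state.engaged

    def press_speed_button(self, increase: bool) -> float:
        # The +/- buttons step the ACC setpoint by 5 km/h (clamping assumed).
        if self.state.engaged:
            step = self.SPEED_STEP_KMH if increase else -self.SPEED_STEP_KMH
            self.state.set_speed_kmh = max(
                self.MIN_ACTIVATION_KMH, self.state.set_speed_kmh + step)
        return self.state.set_speed_kmh

    def brake(self) -> None:
        # Braking disengages the driving assistance (cf. quiz item Q11).
        self.state = AssistState()
```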
3.5 Wizard of Oz
We used the Wizard of Oz (WOz) method with two wizards to simulate the VA. The two human wizards worked together on analyzing drivers' utterances and generating output: one wizard was in charge of generating feedback utterances after obstacle avoidance, while the other was in charge of generating utterances related to teaching about the driving assistance. The wizards were located in the same room as the simulator, but separated from it by sound-proof panels. We installed a microphone close to the steering wheel to hear what the participants were saying. The wizards wore headsets during the evaluation and each worked on their own workstation to generate the appropriate output. When in doubt, the wizards consulted each other to clear up misunderstandings or to agree on the output to be selected.

The wizards listened to participants' utterances and then selected the appropriate response in the wizard's user interface. A text-to-speech (TTS) system produced the spoken outputs using a male voice with a British accent. We followed a proactive approach when designing our VA, similarly to Schmidt et al. (2020), where the VA would provide warnings or feedback without the participant's request. The goal was to explore how drivers would interact with the VA to learn about the limitations and capabilities of the LKA and ACC. Participants were given the following three tasks explaining what the voice assistant could do: (1) explain driving assistance features such as LKA and ACC; (2) advise how to use the driving assistance systems; (3) explain the decisions the car is making. In addition, we wrote a set of predefined messages for the two wizards covering the feedback and the questions we assumed would be asked most often (see the Data Availability section for a full dataset of WOz messages). The predefined messages for responding to drivers' requests covered: ACC/LKA activation and deactivation; ACC/LKA descriptions and purpose; lane change and steering capabilities; and non-supported questions (e.g., navigation, reminders, appointments, weather).
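To make the wizard workflow concrete, the sketch below shows one plausible way to wire predefined messages to a TTS engine; it is our illustration, not the study's tooling. The message texts and category keys are invented for the example, and the pyttsx3 library merely stands in for whatever TTS system was actually used.

```python
import pyttsx3  # offline TTS library, standing in for the study's TTS system

# Illustrative subset of predefined wizard messages, keyed by request category.
MESSAGES = {
    "acc_description": ("Adaptive cruise control keeps the speed you set and "
                        "adapts it to the vehicle in front."),
    "lka_description": ("The lane keeping assistant keeps the car centered "
                        "in its lane on the highway."),
    "activation": ("Press the button on the steering wheel to turn the "
                   "driving assistance on or off."),
    "not_supported": "Sorry, I cannot help you with that.",
}

def speak(category):
    """Play the canned response the wizard selected for a driver's request."""
    engine = pyttsx3.init()
    engine.say(MESSAGES.get(category, MESSAGES["not_supported"]))
    engine.runAndWait()

if __name__ == "__main__":
    # Driver asks, e.g., "What does the cruise control do?" and the wizard
    # clicks the matching entry in the interface.
    speak("acc_description")
```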
Upon arrival, participants signed a consent form and were given explanations about the study goals and data collection. Then, we administered a demographic survey (age, gender, driving experience, and experience with ADAS) and interviewed participants about their experience of learning how to use ADAS. Afterwards, participants completed a 10-minute test drive in manual driving mode to get familiar with the simulator. Then, we used a slideshow presenting the rules to follow while driving and instructions about the simulator and the VA. The slideshow included a prerecorded narration to avoid between-participants variability in the explanation of the rules and instructions; thus, all participants received the same information before starting the experiments, which supports internal validity when studying driving assistance systems (Schnöll, 2021). Engagement and disengagement criteria of the driving assistance, together with braking and steering interventions, were explained; however, no detailed instructions about the takeover procedure were given. Then Scenario 1 started, followed by Scenario 2. After each scenario, we administered a knowledge quiz followed by the questionnaires assessing trust and the quality of the VA. All sessions were video-recorded. Finally, we conducted a semi-structured interview and closed the session.
3.6 UX Measures
We combined relevant questionnaires identified in the literature with a custom-made quiz to collect data about the UX with the VA. We used the SASSI questionnaire (Hone and Graham, 2000) to evaluate the VA, as it provides the highest coverage of UX dimensions, is a recommended instrument for evaluating speech-based interfaces (Kocaballi et al., 2018), and is best rated in terms of face validity (Brüggemeier et al., 2020). SASSI consists of 34 items rated on a 7-point rating scale and distributed across six factors: system response accuracy, likeability, cognitive demand, annoyance, habitability, and speed.
We evaluated trust using a questionnaire by Madsen and Gregor (2000), initially designed to evaluate trust between humans and automated systems that support decision-making and provide advice to users. The instrument consists of five dimensions: perceived reliability, perceived technical competence, perceived understandability, faith, and personal attachment. Items were rated on a 5-point Likert-type scale.
We created a knowledge quiz to assess the accuracy of participants' mental model of the driving assistance system, in terms of the amount of declarative knowledge they retained or gained during the simulation. We administered the knowledge quiz after each condition in iterations 1, 2, and 4; the quiz contained 10, 11, and 14 statements respectively (Table 1). In iterations 1 and 2, the possible answers were "True", "False", and "I don't know"; both wrong answers and "I don't know" were counted as incorrect. In iteration 4, each statement used a Likert-type response format ranging from Strongly Disagree (1) to Strongly Agree (6), without a neutral position, to encourage precise answers from participants. We treated items individually as interval data, not as a Likert scale; thus, we do not compare the quiz results of iterations 1-2 to those of iteration 4. In iteration 4, we administered the knowledge quiz twice: immediately after the simulation and two weeks later, remotely, without participants coming to drive in the simulator again.
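A minimal sketch of the iteration 1-2 scoring rule, assuming a simple list representation of answers; the example answer pattern is invented and not taken from the study data:

```python
def quiz_score(answers, key):
    """Percent correct: a wrong True/False answer and an "I don't know"
    (IDNK) answer both count as incorrect."""
    correct = sum(1 for a, k in zip(answers, key) if a == k)  # IDNK never matches
    return 100 * correct / len(key)

# Hypothetical 11-item response pattern with 7 correct answers:
key     = [True, False, True, True, False, True, True, True, False, False, True]
answers = [True, False, True, "IDNK", True, True, "IDNK", True, False, True, True]
print(round(quiz_score(answers, key)))  # -> 64
```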
4 RESULTS
4.1 Driver’s Learning - Mental Model
The knowledge quizzes in iterations 1 and 2 had 10 and 11 questions respectively, so we calculated the mean number of correct answers and converted it to percentages. Overall, participants scored better on the quiz after Scenario 2 (Table 2). The quiz scores were higher in iteration 2, indicating a more accurate mental model. Given the nature of formative evaluations with a small participant sample, we cannot directly attribute the higher score to the presence of the VA. Q3, Q6, and Q8 had the lowest number of correct answers in iteration 1. Regarding Q3 and Q6, participants could not answer correctly because they did not see those events happen, nor did they ask the VA about them: although AEB was implemented to stop before a car and a truck, no participant let that happen. Regarding Q8, none of the participants asked the VA about the minimum speed to activate ADAS, nor did we implement a specific indication to warn drivers about it. All other questions covered situations that happened in the test scenarios, and participants could draw conclusions based on them.
To evaluate the mental model of participants in iteration 4, we divided the statements from the quiz into two categories: true statements and false statements. The differences in quiz answers between the two scenarios were very small (Figure 3), and a paired t-test showed no statistically significant difference. The mean quiz scores in Scenario 1 (M=4.02) and Scenario 2 (M=3.89) indicate that there was no difference in the understanding of ADAS between the two scenarios (t(13) = 0.822, p = .42, two-tailed). However, for a better interpretation of the quiz results, we calculated a closeness score between participants' mental model representation of ADAS (perceived representation) and the actual functioning of ADAS (actual representation). The closer the average score for each question in the quiz was to the target score of 1 or 6, the more accurate we considered their mental model representation to be.
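A short sketch of both analyses follows, with invented ratings for illustration only (these are not the study data); scipy's paired t-test is assumed as the test implementation:

```python
import numpy as np
from scipy import stats

def closeness(mean_ratings, targets):
    """Mean absolute distance between per-item mean ratings on the 6-point
    scale and the target response (6 for true statements, 1 for false
    statements); lower values indicate a more accurate mental model."""
    r = np.asarray(mean_ratings, dtype=float)
    t = np.asarray(targets, dtype=float)
    return float(np.mean(np.abs(r - t)))

# Four hypothetical true statements rated, on average, below the target of 6:
print(round(closeness([5.2, 4.1, 3.8, 4.9], [6, 6, 6, 6]), 2))  # -> 1.5

# Paired t-test over per-item quiz means of the two scenarios (14 items),
# analogous to the reported t(13) = 0.822, p = .42.
scenario1 = np.array([4.0, 4.5, 3.9, 4.2, 3.5, 4.8, 4.1, 3.7,
                      4.3, 4.0, 3.8, 4.6, 4.2, 3.6])
scenario2 = scenario1 + np.array([0.1, -0.2, 0.0, -0.3, 0.2, -0.1, 0.0,
                                  -0.2, 0.1, -0.3, 0.0, -0.1, 0.2, -0.2])
t_stat, p_value = stats.ttest_rel(scenario1, scenario2)
```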
Table 1: Knowledge quiz statements used in iterations 1, 2 and 4 and the related correct answers. IT=iteration(s) in which the statement was used.
ID IT Question Answer
Q1 1 When ADAS is activated, the steering wheel moves when the car is turning False
Q2 1,2,4 The system can detect (static) objects on the road (barriers, trash, etc.) False
Q3 1 The system can detect other cars True
Q4 1 When the assistant is activated, the system can handle turns True
Q5 1,2,4 I can activate LKA and ACC separately False
Q6 1 The system can detect trucks True
Q7 1 When the assistant is activated, the car is not able to avoid road works True
Q8 1 The assistant will work only if I drive faster than 50 km/h True
Q9 1 When the assistant is activated, the icon turns white False
Q10 1 The ACC helps me to stay in the lane False
Q11 1,2 To deactivate the driving assistance, I can brake True
Q12 2,4 The Automated Emergency Braking activates with stationary and moving vehicles True
Q13 2 The car can detect road works False
Q14 2,4 The car is able to avoid accidents by braking if the car in front of it is too close True
Q15 2,4 The ACC is not able to adapt the car’s speed with stationary vehicles False
Q16 2 The driving assistant can be turned on at 30 km/h False
Q17 2 The Forward Collision Warning is able to only detect moving vehicles False
Q18 4 There are potential problems to stay in lane in sharp turns True
Q19 4 Automated driving might not work well in the rain True
Q20 4 Automated driving might not work well in the fog True
Q21 4 There are potential problems to recognize pedestrians on the road True
Q22 4 The warning alerts are given for any type of obstacle on the road False
Q23 4 The car will come to a safe stop if I do not react to a takeover request False
Q24 4 I need to take over every time the takeover request occurs True
Q25 4 Automated driving system requires me to constantly monitor the road True
Q26 4 Automated driving system is only available on the highway with multiple lanes True
Figure 3: Mental model assessment mean scores. (a) Scores for true quiz statements. (b) Scores for false quiz statements. "2 weeks" = mean score two weeks after the session.
Table 2: Knowledge quiz results for iterations 1 and 2. IT=iteration; SC=scenario; IDNK=mean number of "I do not know" answers.
IT SC Mean SD IDNK Score/100 N
1 S1 5.8 1.80 2.4 53 6
1 S2 6.3 0.80 1.5 57 6
2 S1 6.2 1.79 1.2 62 5
2 S2 7.4 1.34 0.0 74 5
Figure 4: Mean trust scores per iteration and scenario.

We developed this measure to check the accuracy of participants' knowledge against the target responses based on the truthfulness of the statements: the target for true statements was 6 (Strongly Agree), the target for false statements was 1 (Strongly Disagree), and the lower the closeness value, the better. The closeness of true statements was slightly lower in Scenario 1 (2.03) than in Scenario 2 (2.17). Overall, Fig. 3a shows that the mean scores of the two scenarios were relatively unchanged. It appeared to be more difficult for participants to determine whether false statements were true or false (Fig. 3b); the closeness score after Scenario 2 (3.00) shows higher accuracy than after Scenario 1 (3.12).
At this stage, we assume false statements were not answered as accurately because participants were not always able to experience those situations in the driving scenarios. Although it was theoretically possible to test those situations, the false statements covered borderline situations that would have required participants to compromise driving safety, e.g., waiting to see whether the car would detect static objects on the road (Q2). Similarly, for Q22 and Q23, it was not possible to build an accurate mental model. Interestingly, Q5 had the highest closeness discrepancy. To sum up, the high closeness score of the false statements indicates the importance of addressing untrue behaviors and borderline situations in the teaching content to improve users' mental model representations; these issues represent work for future developments. Two weeks later, we measured participants' mental model accuracy again. True statements were rated with slightly better closeness to the target, at 1.93, and the false statements were rated similarly to the original score, with a closeness of 3.5. Therefore, we may assume that the accuracy of the mental model remains robust in people's minds two weeks after use, without any exposure to the system in between.
4.2 Trust in ADAS
Trust increased in Scenario 2 of iteration 1 across all subscales (Figure 4), compared to Scenario 1 where the VA was not available. Perceived understandability, measuring how well a user can form a mental model of the system, scored particularly well in both conditions. However, the quiz results (Table 2) confirm that these scores reflect participants' perceived understanding and not their actual one. This could be due to participants' learned trust being affected by their preexisting knowledge, as explained by Hoff and Bashir (2015): most participants formed expectations of the system based on their previous interactions with ADAS in real-world cars, while in our study some driving assistance features were absent, e.g., the steering wheel did not move in turns. In the other iterations, the trust scores did not change between scenarios, indicating that the choice of tasks to complete or goals to achieve does not affect trust significantly.
4.3 Voice Assistant Evaluation
The SASSI system response accuracy score indicates that the system scored well on correctly recognizing users' input and that it performed as users intended and expected (Table 3). This is expected, since human operators (the wizards) interpreted the users' input. Likeability had the highest score among all factors, which indicates that users had a positive attitude towards the system. Regarding cognitive demand, it seems that driving and interacting with the VA at the same time was quite demanding for drivers; this could potentially be improved by revising the length of the teaching content or by introducing another modality, such as visuals, to convey the knowledge. Annoyance was rated low (2.96), which implies a positive user experience. Habitability tells us that there is a relatively good match between the user's mental model of the VA and the actual system. Finally, speed was rated rather low (4.31), presumably because the wizards were not always able to produce the output within a short time frame; participants confirmed this in the interviews.
4.4 Discussion and Lessons Learned
We used the log files from the wizards' TTS interface and manually extracted participants' utterances from the video recordings to build the interaction corpus. Table 4 summarizes the learning objectives that the teaching content for the VA should address; we believe these guidelines will be helpful for VA designers and for future studies on teaching about ADAS. From the video recordings, we aggregated participants' statements in the interviews and extracted the relevant comments and reflections, which we categorized using the affinity diagramming technique. Below we give short summaries of our findings across all four iterations.
Regarding the teaching content, participants agreed that it was clear and useful. They emphasized the usefulness of the VA when driving a car for the first time, such as when buying a new car or using a rental car. Nevertheless, there should be a way to skip some parts of the learning process, and some participants thought that this information could be delivered without having to drive, or while waiting at a red light.
Table 3: Mean scores per SASSI factor and overall, rated on a 7-point scale. SRA=system response accuracy; LIK=likeability; CD=cognitive demand; ANN=annoyance; HAB=habitability; SP=speed.
Iteration Condition SRA LIK CD ANN HAB SP Overall SASSI N
1 S2 4.18 5.11 4.68 3.32 4.90 3.90 4.35 6
2 S1 3.82 5.57 4.96 3.10 4.27 3.75 4.25 5
3 S1 4.60 6.20 5.08 3.40 4.25 4.90 4.74 5
3 S2 4.51 5.89 5.36 3.36 4.85 4.90 4.81 5
4 S1 4.42 5.78 5.00 2.40 4.40 4.20 4.37 6
4 S2 4.27 5.91 4.92 2.20 4.00 4.20 4.25 6
All All 4.30 5.74 5.00 2.96 4.45 4.31 4.46
Table 4: List of learning goals to address in the design of the teaching content.
Domain Category Description
ADAS Explanation Provide descriptions about ADAS systems available
ADAS Automation Explain how driving can be automated using LKA and ACC
ADAS Takeover Describe when the takeover request is issued
ADAS Activation Explain how to activate driving automation
ADAS Limitations Describe the operational boundaries of the driving automation system
ADAS Takeover Describe the takeover procedure and how to disable driving automation
ADAS Speed regulation Explain how the car speed is set and regulated
ADAS Controls Explain the use of buttons to control ADAS to the driver
ADAS Availability Justify why driving automation is/is not available
ADAS Availability Inform the driver when driving automation is/is not available
ADAS Speed Explain how the ACC speed adaptation works
Driver Monitoring Explain what happens when the driver does not react to takeover requests
Driver Role Explain the driver’s role while automation is on
User Interface Icons Explain the placement and meaning of automation status icons
VA Usefulness Inform the driver about what the VA is capable of doing
VA Navigation Provide explanation for why a certain route is calculated as best
Regarding participants' mental models, we concluded that most participants did not build an accurate understanding of ADAS. In iterations 1 and 2, participants thought LKA was not available, or that it could not handle turns, because the steering wheel did not physically rotate in turns; there was no cue in the user's environment signaling that the LKA was functioning. This resulted in participants continuing to turn the steering wheel despite the driving assistance being switched on. To illustrate, P4 said: "There was only cruise control and there was no lane assist. Or I don't find it or I don't understand it but I don't feel it". Furthermore, in iteration 1, some participants claimed that ACC was not adaptive because they did not get to try its adaptation while driving: since the lane of the driver's vehicle was empty (Fig. 1a), participants did not try to position their car on the left lanes that carried traffic (L1 and L2). Participants could also not accurately determine the capabilities of obstacle detection and avoidance, presumably because there were not enough situations in which to analyze the car's behavior, as we only had two obstacles per scenario.
The most common complaint was about the VA's response time. We expected this to occur when the wizards faced a request for which they did not have a prepared answer. Moreover, half of the participants expressed that they had trouble identifying what to ask the VA, although they had received instructions. In the interviews, some participants revealed that it was difficult to understand what the assistant could help them with. For example, P3 said: "It was not easy to find questions, let's put it that way". P4 said: "Yes, very useful but we have to know what kind of information we can ask the system". It may be that participants felt they were not supposed to learn about the system while driving. As P1 said: "I don't think that when you are in the car, you should learn the theory about how it works. It is about the practical part when you are driving". This suggests that learning by doing might be a better approach than providing a list of the tasks the VA supports.
Participants often questioned the usefulness of the VA in the long term. Although they generally liked the availability of such assistance, they raised the question of its long-term usage. Admittedly, the application of the VA to learning is a short-term goal, and its features should be extended to other domains,
such as traffic information, as suggested by P1. P3 said: "Something that came through my mind, it's a little bit to replace the manuals, which are boring", which is an existing idea and one of the goals of this study. P6 said that receiving feedback about the use of driving assistance was not needed: "But when I avoided the accident, the voice told me 'ok you managed to avoid the thing'. This was obvious. What is the added value of telling me I did a good job?".
In iteration 2, participants seldom interacted with the VA. Conversations were simple and constructed as a short question and answer; the VA started conversations most of the time and drivers would respond briefly. Although participants were provided with learning goals, they rarely completed them, in which case the wizards decided to proactively start conversations or provide teaching content on their own initiative. P4 said it was easy to focus both on the road and the spoken content, while P5 commented that talking to the car was not a usual practice. In iteration 3, however, participants talked even less with the VA, stating the lack of explicit tasks or of a reason to do so as a cause. All six participants said it was unclear that they had to start the conversation and think of questions to ask about ADAS. Most of the time, the VA initiated the interaction, which consisted of warning alerts and messages about takeover requests. However, when participants did speak to the VA, they tried to request that the VA remind them when to take the turn or activate ADAS. In iteration 4, the introduction of tasks produced more interaction between drivers and the VA, and this seems a promising teaching approach.
5 CONCLUSION
This paper contributes to the design of conversational agents aimed at tutoring drivers about driving assistance systems in vehicles. We compiled a list of learning objectives a teaching assistant should support. Our findings suggest that drivers welcome the idea of a VA for such purposes; however, the purpose, tasks, and functions of such an assistant have to be clear to drivers to increase its adoption. Although we explained what the VA could do and what its limitations were, participants still did not know how exactly they should interact with the assistant, despite their insufficient understanding of the implemented driving assistance. Nevertheless, participants seemed eager to explore the possibilities of the VA and were favorable towards the idea of having such a system in their car; we believe the remaining hesitation was due to the unclear specification of the purpose and functionalities of the VA. Additionally, knowledge retention remained high two weeks after the learning took place, which indicates that using speech to provide declarative knowledge to drivers might be effective. However, further research is necessary to establish clear design guidelines and to address the learning and retention of procedural knowledge in various driving situations. Finally, the transfer of knowledge to real-world scenarios is another challenge that needs to be addressed.
ACKNOWLEDGEMENTS
This research has been supported by the Service Public de Wallonie (SPW) under the reference 7982.
REFERENCES
Boelhouwer, A., van den Beukel, A. P., van der Voort,
M. C., and Martens, M. H. (2019). Should I take
over? Does system knowledge help drivers in mak-
ing take-over decisions while driving a partially auto-
mated car? Transportation Research Part F: Traffic
Psychology and Behaviour, 60:669–684.
Boelhouwer, A., van den Beukel, A. P., van der Voort,
M. C., Verwey, W. B., and Martens, M. H. (2020).
Supporting drivers of partially automated cars through
an adaptive digital in-car tutor. Information, 11(4).
Brüggemeier, B., Breiter, M., Kurz, M., and Schiwy, J.
(2020). User Experience of Alexa when controlling
music: Comparison of face and construct validity of
four questionnaires. In Proceedings of the 2nd Con-
ference on Conversational User Interfaces, CUI ’20,
New York, NY, USA. Association for Computing Ma-
chinery.
Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., and Koltun, V. (2017). CARLA: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning (CoRL), pages 1–16.
Dow, S., MacIntyre, B., Lee, J., Oezbek, C., Bolter, J. D.,
and Gandy, M. (2005). Wizard of oz support through-
out an iterative design process. IEEE Pervasive Com-
puting, 4(4):18–26.
Forster, Y., Hergeth, S., Naujoks, F., Krems, J., and Keinath,
A. (2019). User education in automated driving:
Owner’s manual and interactive tutorial support men-
tal model formation and human-automation interac-
tion. Information (Switzerland), 10(4).
Forster, Y., Hergeth, S., Naujoks, F., Krems, J. F., and
Keinath, A. (2020). What and how to tell before-
hand: The effect of user education on understanding,
interaction and satisfaction with driving automation.
Transportation Research Part F: Traffic Psychology
and Behaviour, 68:316–335.
Frison, A.-K., Wintersberger, P., Oberhofer, A., and Riener, A. (2019). ATHENA: Supporting UX of conditionally automated driving with natural language reliability displays. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications: Adjunct Proceedings, pages 187–193.
Hassenzahl, M. (2018). The thing and I: understanding the
relationship between user and product. In Funology 2,
pages 301–313. Springer.
Hoff, K. A. and Bashir, M. (2015). Trust in automation: In-
tegrating empirical evidence on factors that influence
trust. Human Factors, 57(3):407–434.
Hone, K. S. and Graham, R. (2000). Towards a tool for
the Subjective Assessment of Speech System Inter-
faces (SASSI). Natural Language Engineering, 6(3-
4):287–291.
Kasuga, N., Tanaka, A., Miyaoka, K., and Tatsuro, I.
(2020). Design of an HMI System Promoting Smooth
and Safe Transition to Manual from Level 3 Auto-
mated Driving. International Journal ITS Research,
18:1–12.
Kim, J. W., Ritter, F., and Koubek, R. (2013). An integrated
theory for improved skill acquisition and retention in
the three stages of learning. Theoretical Issues in Er-
gonomics Science, 14(1):22–37.
Kocaballi, A. B., Laranjo, L., and Coiera, E. (2018). Mea-
suring User Experience in Conversational Interfaces:
A Comparison of Six Questionnaires. In Proceedings
of the 32nd International BCS Human Computer In-
teraction Conference, HCI ’18, Swindon, GBR. BCS
Learning; Development Ltd.
Koo, J., Shin, D., Steinert, M., and Leifer, L. (2016). Un-
derstanding driver responses to voice alerts of au-
tonomous car operations. International journal of ve-
hicle design, 70(4):377–392.
Kraus, M., Wagner, N., and Minker, W. (2020). Effects
of Proactive Dialogue Strategies on Human-Computer
Trust. In Proceedings of the 28th ACM Conference
on User Modeling, Adaptation and Personalization,
UMAP ’20, page 107–116, New York, NY, USA. As-
sociation for Computing Machinery.
Larasati, R., Liddo, A. D., and Motta, E. (2020). The effect
of explanation styles on user’s trust. In 2020 Work-
shop on Explainable Smart Systems for Algorithmic
Transparency in Emerging Technologies.
Madsen, M. and Gregor, S. (2000). Measuring Human-
Computer Trust. In Proceedings of Eleventh Aus-
tralasian Conference on Information Systems.
Merat, N., Seppelt, B., Louw, T., Engström, J., Lee, J. D.,
Johansson, E., Green, C. A., Katazaki, S., Monk, C.,
Itoh, M., McGehee, D., Sunda, T., Unoura, K., Vic-
tor, T., Schieben, A., and Keinath, A. (2019). The
“Out-of-the-Loop” concept in automated driving: pro-
posed definition, measures and implications. Cogni-
tion, Technology and Work, 21(1):87–98.
Merriman, S. E., Plant, K. L., Revell, K., and Stanton, N.
(2021). Challenges for automated vehicle driver train-
ing: A thematic analysis from manual and automated
driving. Transportation Research Part F: Traffic Psy-
chology and Behaviour, 76:238–268.
Payne, S. J. (2012). Mental models in human–computer
interaction. In The Human–Computer Interaction
Handbook, pages 41–54. CRC Press.
Politis, I., Langdon, P., Adebayo, D., Bradley, M., Clark-
son, P. J., Skrypchuk, L., Mouzakitis, A., Eriksson, A.,
Brown, J. W. H., Revell, K., and Stanton, N. (2018).
An Evaluation of Inclusive Dialogue-Based Interfaces
for the Takeover of Control in Autonomous Cars. In
23rd International Conference on Intelligent User In-
terfaces, IUI ’18, page 601–606, New York, NY, USA.
Association for Computing Machinery.
Rossi, R., Gastaldi, M., Biondi, F., Orsini, F., De Cet, G.,
and Mulatti, C. (2020). A driving simulator study ex-
ploring the effect of different mental models on adas
system effectiveness. In De Paolis, L. T. and Bourdot,
P., editors, Augmented Reality, Virtual Reality, and
Computer Graphics, pages 102–113, Cham. Springer
International Publishing.
Rukonic, L., Mwange, M.-A., and Kieffer, S. (2021). UX
Design and Evaluation of Warning Alerts for Semi-
autonomous Cars with Elderly Drivers. In Proceed-
ings of the 16th International Joint Conference on
Computer Vision, Imaging and Computer Graphics
Theory and Applications - Volume 2: HUCAPP, pages
25–36.
SAE (2018). Taxonomy and definitions for terms related
to on-road motor vehicle automated driving systems.
SAE Standard J3016. USA.
Schmargendorf, M., Schuller, H.-M., Böhm, P., Isemann, D., and Wolff, C. (2018). Autonomous driving and the elderly: Perceived risks and benefits. In Dachselt, R. and Weber, G., editors, Mensch und Computer 2018 - Workshopband, Bonn. Gesellschaft für Informatik e.V.
Schmidt, M., Minker, W., and Werner, S. (2020). User
Acceptance of Proactive Voice Assistant Behavior.
In Wendemuth, A., Böck, R., and Siegert, I., editors, Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2020, pages 18–25.
TUDpress, Dresden.
Schnöll, P. (2021). Influences of instructions about operations on the evaluation of the driver take-over task in cockpits of highly-automated vehicles. In 5th International Conference on Human Computer Interaction Theory and Applications, Vienna, Austria. INSTICC, SCITEPRESS.
Sørmo, F., Cassens, J., and Aamodt, A. (2005). Ex-
planation in Case-Based Reasoning–Perspectives and
Goals. Artificial Intelligence Review, 24:109–143.
Walker, F., Boelhouwer, A., Alkim, T., Verwey, W. B., and
Martens, M. H. (2018). Changes in Trust after Driving
Level 2 Automated Cars. Journal of Advanced Trans-
portation, 2018.
Young, K. L., Koppel, S., and Charlton, J. L. (2017). Toward best practice in human machine interface design for older drivers: A review of current design guidelines. Accident Analysis & Prevention, 106:460–467.