Knowledge at First Glance: A Model for a Data Visualization
Recommender System Suited for Non-expert Users
Petra Kubern´atov´a
1
, Magda Friedjungov´a
2
and Max van Duijn
1
1
Leiden Institute of Advanced Computer Science, Leiden University, Netherlands
2
Faculty of Information Technology, Czech Technical University in Prague, Czech Republic
Keywords:
Data Visualization, Recommender System, Non-experts, Model.
Abstract:
In today’s age, there are huge amounts of data being generated every second of every day. Through data
visualization, humans can explore, analyse and present it. Choosing a suitable visualization for data is a
difficult task, especially for non-experts. Current data visualization recommender systems exist to aid in
choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. The aim of
this study is to create a model for a data visualization recommender system for non-experts that resolves
these issues. Based on existing work and a survey among data scientists, requirements for a new model
were identified and implemented. The result is a question-based model that uses a decision tree and a data
visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both
task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute
these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the
new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent.
The presented model can be applied in the development of new data visualization software or as part of a
learning tool.
1 INTRODUCTION
In today’s age, there are huge a mounts of data being
generated every second of every day and Big Data has
been one of the hot topics of computer science in re-
cent years. Being th e curious species that we are, hu-
mans are looking for ways to get the most information
out of this vast amount of data that we have available
at our fingertips. We are always looking for method s
to help us explore, analyze and present it.
A crucial p art of th is process is data visualization.
Data visualization is the representa tion of information
in a visual form, such as a c hart, diagram or picture. It
can find its place in a variety of areas such as art, mar-
keting, social relations and scientific research. There
were over 300 visualization types available at the time
of writing this pape r ( Bostock, 2017). But how do we
choose the most suitable one? This is where data vi-
sualization recommender systems come in: these sy-
stems help with this difficult task that becomes even
more difficult when the u ser is a non-expert.
In this paper we define a ’non-exper t user’ as so-
meone without professional or specialized knowledge
of da ta visualization. We thus include both complete
beginners and users who have general knowledge of
data visualization types (e.g. bar charts, pie charts,
scatter plots) but have no professional experience in
the fields of data science and data communication.
In this study we focus on building a model for
a data visualizatio n recommender system aimed at
non-expert users. We term our model NEViM: Non-
Expert Visualization Model.
Section 2 of this paper pla c es data visualization
recommender systems for non-experts in the context
of data science. We discuss different types of systems
and comme nt on where the model we are building fits
in. Section 3 introduces our research goal and the
method we intend to use to fulfill it. Section 4 discus-
ses the results of the work done within our method.
We present results of our literatu re study, existing so-
lutions analysis, survey, model requirements, model
construction process and model testing process. We
draw conclusions in Section 5 and set an agenda fo r
future work in Section 6.
208
Kubernátová, P., Friedjungová, M. and Duijn, M.
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users.
DOI: 10.5220/0006851302080219
In Proceedings of the 7th International Conference on Data Science, Technology and Applications (DATA 2018), pages 208-219
ISBN: 978-989-758-318-6
Copyright © 2018 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
2 CONTEXT
2.1 Data Science
Data science plays an important role in scientific re-
search, as it aids us in collecting, organizing, and in-
terpreting data, so tha t it can be transformed into va-
luable knowledge.
Communicate
Results
Machine Learning
Algorithms
Statistical Models
Exploratory Data
Analysis
Clean Data
Data is
Processed
Raw Data is
Collected
Figure 1: The data science process (O’Neil and Schutt,
2014).
Figure 1 shows a simp lified diagr a m of the data
science proc e ss. First, real world raw data is col-
lected, processed and cleaned through a process cal-
led da ta munging. Then exploratory data analysis
(EDA) follows, d uring which we might find that we
need to collect mor e data or dedicate more time to
cleaning and organizing the curre nt dataset. When
finished with E DA, we may use machine learning
algorithm s, statistical models and data visualization
techniques, depending on the type of problem we are
trying to solve. Finally, results can be c ommunicated
(O’Neil and Schutt, 2014).
Our focus her e is on the part of the process con-
cerning exploratory data analysis or EDA. EDA uses
a variety of statistical techniques, principle s of ma-
chine learning, but also, crucially, the data visualiza-
tion techniques we study in this paper. Please note
that data visu a lization can also be a pa rt of the Com-
municate Results stage of the data scienc e process
(see Figure 1). There is a thin line between data vi-
sualizations made for exploration and ones made for
explanation, as most exploratory data visualizations
also contain some level of explanation an d vice-versa.
2.2 Exploratory Data Analysis
Explora tory data analysis (EDA) is not only a criti-
cal part of the data science process, it is also a kin d
of philosophy. You are aiming to understand the data
and its shape and connect your understanding of the
process that collec te d the data with the data itself.
EDA helps with suggesting hypotheses to test, eva-
luating the quality of the data, identifying potential
need for further collection or cleaning, supporting the
selection of appropr ia te models and techniques and,
most importantly for the con text of this study, it helps
find interesting insights in your data (Tukey, 1970).
2.3 Data Visualization
There are many definitions of the term data visualiza-
tion. The one used in this study is: data visualization
is the representation and presentation of data to faci-
litate u nderstanding. According to Kirk, our eye and
mind are not equip ped to easily tran slate the textual
and numeric values of raw data into quantitative and
qualitative meaning. ”We can look at the data, but we
cannot understand it. To truly understand the data, we
need to see it in a different kind of form. A visual
form. (Kirk, 2016 )
Illinsky and Steele describe data visualization as a
very p owerful tool for identifying patterns, communi-
cating relation ships and meaning, insp iring new q ue-
stions, identifying sub-problems, identifying trends
and outliers, discovering or sear ching for interesting
or specific data points (Illinsky and Steele, 2011).
Tamara Munzner made a 3-step model for data vi-
sualization design. A ccording to this mod e l, we first
need to decide what we want to show. Secondly, we
need to motivate why we want to show it. Finally, we
need to decide how we a re going to show it (Mun-
zner and Maguire, 2015). There are many different
types of data visu alizations to help us with the third
step. However, the challenge rem a ins in choosing the
most suitable one. Data visualization recommender
systems were made to help with this d ifficult task. We
find that the WHAT and the WHY greatly influence
the HOW, thus we aim to build a system that r eflects
all three aspects o f the data visualization de sign pro-
cess in some way.
2.4 Data Visualization Recommender
Systems
Within this study we defin e data visualization recom-
mender systems as tools that seek to rec ommend visu-
alizations which highlight features of interest in data.
This definition is based on combining common as-
pects of definitions in existing work.
While the output of data visualization recommen-
der systems is always a recommendation for data vi-
sualization types in some shape or fo rm, the input can
differ. It can be, f or example, just the data itself, a
specification of goals or the specification of aesthe-
tic prefere nces. The type of input affects the type of
recommendation strategy used and conseq uently the
type of the recommender system.
Kaur and Owonibi distinguish 4 types of recom-
mender systems (Kaur and Owonibi, 2017):
Data Characteristics Oriented. These systems
recommend visualizations based on data charac-
teristics.
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
209
Task Oriented. These systems recomm e nd visu-
alizations based on representa tional goals as well
as data characteristics.
Domain Knowledge Oriented. Th ese systems
improve the visualization recommendation pro-
cess w ith domain knowledge.
User Preferences Oriented. These systems gat-
her information abo ut the user presentation goals
and preferences through user interaction with the
visualization system.
The line be tween different categories of recom-
mendation systems is rather thin and some systems
can have ambiguous classifications, as will be discus-
sed below.
3 METHOD
Within this study our aim is to devise a new data vi-
sualization recommender system, which is simple and
easy to use for non-experts, but c a n nonetheless com-
pete w ith existing, often mo re complex systems. Cle-
arly, we will avoid reinventing the whee l: the current
solutions are already good, but we want to see if we
can make adjustments that make a system more suit-
able for n on-expert users while maintaining effecti-
veness (still clearly distinguishing the data visualiza-
tions from each other) and performance (recommen-
ding the most suitable visualiza tion type).
We w ill begin by conducting a literatu re study
of previous work done in the field of data visua-
lization recommender systems. We focus on data
characteristics-oriented and ta sk-oriented data visua-
lization recommender systems, as this is where our
model belongs. The study helps us identify aspec ts of
current solutions which could be utilized in our mo-
del and determine which solutions are suitable for the
testing of our model.
Next, we run a survey among different data
science communities on Faceboo k and LinkedIn.
This way, we ask 88 responden ts who have some sor t
of familiarity with data science and its terminology.
The main goals of the survey are to aid us in decisi-
ons about our model and, as our model is aimed at
non-expert user s, to aid us in specifying who exactly
these users are.
The find ings we make from the literature study,
as well as the results of the survey will help us form
requirements for our model.
Once we have the requirements, we commence
constructing the model. First we choose a suitable
base stru cture. Then we establish the different com-
ponen ts of the structure and specify what they will be
in our model. Fin a lly, we combine it all together.
We perform two tests on the co nstructed model.
The first test focuses on establishing whether the mo-
del is able to pro duce results similar o r identical to
existing solutions. The second focuses on testing the
extendibility of the model by adding a new type of
visualization.
4 RESULTS
4.1 Existing Solutions Study
4.1.1 Data Characteristics Oriented systems
Systems based on data c haracteristics aim to improve
the understanding of the data, of different relations-
hips that exist within the da ta an d of procedures to re-
present them. Some of the following tools and techni-
ques are not rec ommend ation systems per se but they
were a crucial part of the history of this field and foun-
dations for other recom mender systems stated, thus
we feel it is appropriate to list them as well.
BHARAT
BHARAT was the first system that proposed some
rules for determining which type of visualization is
appropriate for certain data attributes (Gnana mgari,
1981). As this work was written in 1981, the set of
possible visualizations was not as varied as it is to-
day. The system incorporated only the line, pie and
bar charts. If the function was co ntinuous, a line chart
was recomme nded. If the user indicated that the range
sets could be summed u p to a me aningfu l total, a pie
chart was recommended and bar charts were recom-
mended in all the remainin g cases. Even though this
system would now be con sidered very basic, it served
as the foundation for other systems that followed.
APT
In 1986, Mackinlay proposed to f ormalize and co-
dify the graphical de sig n specification to automate the
graphics genera tion process (Mackinglay, 1986). His
work is based on the work o f Josep h Bertin , who, in
1983, ca me up with a semiology of graphics (Bertin,
1983), whe re he specified visual variables such as po-
sition, size, value, color, orientation etc. and classi-
fied th em according to which features they commu -
nicate best. Mackinlay cod ified Bertin’s semiology
into a lgebraic operators that were used to search for
effective presentations of informa tion. He based his
findings on the principals of expressiveness and ef-
fectiveness. Expressiveness is the idea that graphi-
DATA 2018 - 7th International Conference on Data Science, Technology and Applications
210
cal presentations a re actually sentences of graphi-
cal lan guages and effectiveness ref ers to how accu-
rately these presentations are perceived. He would
take the encoding technique and formalize it with pri-
mitive graphical language (which data visua lizations
can show this), then he would order these primitive
graphica l languages using the effectiveness princ iple.
VizQL(Visual Query Language)
In 2003, Hanrahan revised M a ckinlay’s specifications
into a declarative visual language known as VizQL
(Hanraha n, 2006 ). It is a formal language for descri-
bing tables, charts, graphs, maps and time series. The
languag e is capable of translating actions into a data -
base q uery and then exp ressing the response gra phi-
cally.
Tableau and Its Show Me Feature
The introduction of Tableau was a real milestone in
the world of data visualization tools. Due to the sim-
ple user interface, even inexperienced users could cre-
ate data v isualizations. It was created when Stolte,
together with Hanrahan and Chabot, decided to com-
mercialize a system ca lled Polaris (Stolte et al., 2002)
under the name Tableau Software. In 2007 Tableau
introdu ced a feature called Show Me (Mackinlay
et al., 2007). The Show Me functionality takes advan-
tage of VizQL to automatically present data. At the
heart of this feature is a data characteristics-oriented
recommendation system. The u ser selects the data at-
tributes that interest him and Tableau recommends a
suitable visualization. Tableau determines the pro-
per visualization type to use by looking at the types
of attributes in the data. Each visualization requires
specific attribute types to be present before it ca n be
recommended. Furthermore , it also ranks every vi-
sualization o n familiarity an d design best practices.
Finally, it recommends the highest-ranked eligible vi-
sualization. Mackinla y and his team have also perfor-
med in te resting user tests with the Show Me fe ature.
They found that the Show Me feature is being used
(very) modestly by skilled users (i.e. in o nly 5.6% of
cases).
ManyEyes
Viegas et al. created the first known public we b-
site where users could upload data and create in-
teractive visualizations collaboratively: ManyEyes
(Viegas et al., 2007). Design choices were made to
reflect the effort to find a bala nce between powerful
data-analysis capabilities and accessibility to the non-
expert visualization user. The visualiza tions were cre-
ated by matching a dataset with one of the 13 types of
data visualizations implemented in the tool. They di-
vided th e data visualizations into groups by data sche-
mas. A data schema could be, for example, single co-
lumn textual data. Thus, a bar chart was described as
single column textual data a nd more than one nume-
rical value. The tool closed down in 2015.
Watson Analytics
Since 2 014, IBM have been developing a tool cal-
led Watson Analytics (IBM, 2017). Watson Analy-
tics uses principles of machine learning and natural
languag e processing to recommend users either que-
stions they can ask about their data, or a specific vi-
sualization. However, little is known about how the
recommendation system works.
Microsoft Excel’s Recommended Charts Feature
In the 2013 release of Microsoft Excel, a new feature
called Recommended Charts was introduced. The
user can select the data they want to v isu alize and Ex-
cel recom mends a suitable visualization (Microsoft,
2017). However, Microsoft does not share exactly
how this process is carried out, makin g it less suita-
ble as a source of inspiration.
SEEDB
In 2015 Vartak et al. proposed an eng ine called
SEEDB (Vartak et al. , 2015). They judge the inte-
restingness of a visualization based on the following
theory: a visualization is likely to be interesting if
it displays large deviations from some reference (e.g.
another dataset, historical da ta, or the rest of the data).
This h elps them identify the most interesting visua-
lizations from a large set of potential visualizations.
They identified that there are mo re aspects that de-
termine the interestingness of a visualization, su c h as
aesthetics, user pref e rence, metadata and user tasks.
A full-fledged visualization recommendation system
should take into a c count a combina tion of these as-
pects. A major disadvantage of SEEDB is that it only
uses variations of bar charts and line charts. As far as
we know SEEDB was never deployed.
Voyager
In 2016, Wongsuphasawat et al. developed a visuali-
zation recommendation web application called Voy-
ager (Wongsuphasawat et al., 2016), based on the
Compass recommendation engine ( Wongsuphasawat,
2017) and a high-level specification language called
Vega-lite (Satyanaray a n et al., 2017). It couples brow-
sing with visualization recommendation to support
exploration of multivariate, tabular data.
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
211
Google Sheets and Its Explore Feature
Google Sheets (Google, 201 7) is a tool which allows
users to create, edit and sha re spreadsheets. It was
introdu ced in 2007 and is very similar to Microsoft
Excel. In June of 2017, the tool was extended with
the Explore Feature, which helps with automatic chart
building and data visualization. It uses e le ments of
artificial in telligence and natural language pr ocessing
to r ecommend users q uestions they might want to ask
about their data, as well as recommending da ta visua-
lizations that best suit their data. In the documentation
for this feature, Google specifies each of the inclu ded
data visualizations using functions and conditions th at
have to be fulfilled in order for tha t particular data vi-
sualization to be recommende d. However, a couple of
visualizations have the same conditions and it is not
revealed how the most suitable data visualization is
chosen.
4.1.2 Task Oriented Systems
Task-oriented systems aim to design different techni-
ques to infer the representational goal or a user’s in-
tentions. In 1990 Roth and Mattis were the first
to identify different domain-independent inf ormation
seeking goals, such as compa rison, distribution, cor-
relation etc. (Roth and Mattis, 1990). Also in 1990,
Wehrend and Lewis proposed a classification scheme
based on sets of representatio nal goals (Wehrend and
Lewis, 1990). It was in the form of a 2D m atrix where
the columns were data attributes, the rows representa-
tional goals and th e cells data v isu alizations. To find a
visualization, the user had to divide the problem into
subproblems, until for each subproblem it was possi-
ble to find an e ntry in the matrix. A representation for
the original complex problem could then be found by
combining the candidate representation methods for
the subproblems. Unfortunately, the complete matrix
was not published so it is unkn own which sp ecific ty-
pes of data visualizations were included.
IMPROVISE
In the previous studies, the user task list was manu-
ally created. However, in 1998, Zhou and Feiner in-
troduced advanced linguistic techniques to automate
the derivation of the user task from a natural language
query (Zhou and Feiner, 1998). They introduced a
visual task taxonomy to auto mate the process of gai-
ning presentation intents from the text. For example,
the visual task Focus implies that visual techniques
such as Enlarge or High light could be used. This taxo-
nomy is implemented in IMPROVISE. Zhou and Fei-
ner show how IMPROVISE generates a visual narr a-
tive from speech to p resent an overview of a hospital
patient’s information to a nurse. To achieve th is goal,
it constructs a structure diagram that organizes vari-
ous informatio n ( e .g. IV lines) around a core com-
ponen t (the patient’s body). In a top-down de sig n
manner, IMPROVISE first creates an ’empty’ struc-
ture diagram and then populates it with components
by par titioning and encoding the patient informatio n
into different groups.
HARVEST
In 2009 Gotz and Wen introduced a novel behavior-
driven approach (Gotz and Wen, 2009). Instead of
needing explicit task descriptions, they use impli-
cit task informatio n obtained by monitoring users’
behavior to make recomm endation more effective.
The Behavior-Driven Visualization Recommendation
(BVDR) approach has two phases. In the first phase
of BDVR, they detect four predefin e d patterns from
user activity. In the second phase, they fe ed the
detected patterns into a recommendation algorithm,
which infers user intent in terms of common visual
tasks (e. g. comparison) and suggests visualizations
that better support the user’s needs. The inferred vi-
sual task is used together with the properties of the
data to retrieve a list of poten tially useful visual me -
taphors from a visualization example corpus m a de by
Zhou and Chen (Zhou et al., 2002). It contains over
300 examples from a wide variety of sources. Unfor-
tunately, we were not able to access this c orpus.
All in all, we identify some pitfalls of the existing sy-
stems. Such as them not being accessible enough, too
complicated, too formal and too secretive when it co-
mes to their recommenda tion process. The biggest
pitfall is that the result of their recomm endation pro-
cess is most commonly a set of data visualization s,
which, in our opinion, leaves the users a bit further
than they started, but still nowhere, because they still
have to choose the most suitable visualization. The
possibilities have been narrowed, but a decision still
must be made. We hop e to avoid these pitfalls within
our model. We establish that we are going to test our
model against the solutions available to us. This me-
ans Tableau, Watson Analytics, Excel, Voyag e r and
Google Sheets. Please note that we are going to com-
pare aga inst the recommendation system features of
the tools, not the tools as a w hole.
4.2 Exploratory Survey
We run a survey among different data science commu-
nities on Facebook a nd LinkedIn. This way, we get
respond ents who have som e sor t of familiarity with
data science and its terminology. The main g oals of
the survey are to aid us in making decisions about our
DATA 2018 - 7th International Conference on Data Science, Technology and Applications
212
model and specifying the term non-expert user.
4.2.1 Participants
In total, we gathered 88 valid responses (n=88). Out
of the 88 re spondents, 78% (n=69) we re male and
22% (n=19) female. The average age was 29.86 ye-
ars.
We had asked the respondents to indicate their
knowledge level on a scale of 1 to 10, 1 being be-
ginner and 10 being expert. The average knowledge
level was 5.70. We opted to divide the scale into three
ranges in the following way: 1-3 are beginners, 4-7
are non-experts and 8-10 are experts. According to
our ranges w e had 26% (n=2 3) beginner level, 44%
non-expert (n=39) level and 30% (n=26) expert level
respond ents.
4.2.2 Results
We make the following findings from the results of
our survey:
For all groups, the main purpose of making data
visualizations was for analysis (65% of beginners,
64% of non-experts, 58% of experts).
All types of users choose data visualizations
mainly according to: the characteristics of their
data (57% of beginners, 62% of non -experts, 65%
of exp erts) and the tasks that they want to perform
(48% of beginners, 51% of non-experts, 6 2% of
experts).
For all gro ups, the two most used visualizations
are bar charts (1 7% of beginn e rs, 38% of non-
experts, 3 5% of experts) and scatter plots ( 43% of
beginners, 26% of non-experts, 31% of experts).
All groups were mostly unable to name an ex-
isting data visualization recommen dation system
(0% able vs. 10 0% unable for beginners, 5% able
vs. 95% unable for non-exper ts and 4% able vs.
96% unable for experts).
All groups would be willing to use a data visuali-
zation recommenda tion system, although experts
were less willing than b eginners and non-experts
(100% willing vs. 0% not willing for beginners,
87% willing vs. 13% not willing for non-experts
and 77% willing vs. 23% not willing for experts).
To summarize, we have lea rned that no n-experts make
data visualizations mainly for the purpose of analy-
sis. When they select a suitable data visualization
type, they do so according to the c haracteristics of
their data and the tasks they want to perform. Their
most used visualization types are bar charts an d scat-
ter plots. They are not familiar with data visualization
recommender systems but are mostly willing to use
one. We also learned that there is not much difference
between the a pproaches of beginners, non- exp ert and
expert users, which was unexpected.
4.3 Model Requirements
Based on researc h of previous ap proaches to ou r pro-
blem and the results of our survey, we have identified
the following require ments which NEViM should ful-
fill:
1. Simplicity - The model should be simple enough
to be used by non-experts. It must have good flow
and a very straightforward base structure.
2. Clarity - We aim for the result of our recommen-
dation system to be one data visualization. Not a
set, like in some current tools. This means that the
underlying classification hierarchy of data visua-
lizations must be clear and unambiguous.
3. Versatility - We want our model to combine dif-
ferent kinds of recomm endation systems. From
our survey we learn that when users select a suit-
able data visua lization type, they do so based on
the characteristics of their data and the tasks th ey
want to perform. Based on this we incorporate a
data characteristics-oriented and task -oriented ap-
proach . Furthermore, we want our model to be
easily implemented in different programming lan -
guages and environments.
4. Extensibility - Our aim is for our mode l to be ea-
sily extendable. We want the process of adding
visualizations into the model to be simple. We
want it to be a useful skeleton which can be easily
extended to include automatic visua liza tions etc.
5. Education - We want o ur model to not on ly
function as a recommen der system, but also as a
learning tool.
6. Transparency - Once we recommend a visualiza-
tion, we want the users to see, why the particular
visualization was recommended, meaning that the
path to a visualization recommendatio n throug h
our model has to b e retraceable.
7. Self-learning - We want our model to be able to
improve itself. This means, amongst other things,
that it should be machine learning fr ie ndly.
8. Competitiveness - We want our model to still
produce results which are compa rable to results
from other systems.
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
213
4.4 Constructing NEViM
4.4.1 What Base Structure to Use for NEViM?
Since the aim of our model is to help a user decide
which data visualiza tion to use, the obvious cho ic e
seemed to be the structure of decision trees. A d e -
cision tree has fou r ma in parts: a root node, inte rnal
nodes, leaf nodes and branches. T he biggest advan-
tages of decision trees are that they can help uncover
unknown altern ative solutions to a problem and that
they are well suited for machine learning methods.
Once we dete rmined th at the decision tree was a
possible base structure, we nee ded to specify what
our root node, interna l nodes, leaf nodes and bran-
ches would be. I t was clear that the leaf nodes would
be the different types of data visualizations since that
was the outcome that we wanted to achieve. The
root n ode, internal nodes and branches are inspired
by Akinato r, the Web Ge nie. Akinator is a game that
attempts to determine which character the player is
thinking of by asking a series of questions. The struc-
ture hidden under the user interface is a decision tree,
as in the case of NEViM.
Our model’s root and internal n odes are questions
which possess the ability to clearly distinguish diffe-
rent types of data visualizations. The branches ar e
’yes’ or ’no answers to those questions.
4.4.2 What Questions to Ask? (Establishing the
Internal Nodes and Root Node)
The biggest challenge in constructing questions for
our model was that they must be understandable for
non-experts, yet every que stion should get the user
closer to a data visualization recommendation. This
means tha t the subjects of th e questions must be fea-
tures that distinguish the different data visualizations
from each other. The key to solvin g this problem is to
base the questions on a c le ar classification hierarchy.
As far as we know, there is no one specific classifi-
cation hierarchy of da ta visualizations which would
be used globally. We researched different methods of
classification and combined them tog e ther to derive
a classification of our own. T his was a very time-
consumin g process. We went throu gh a total of 19
books (O’Neil and Schu tt, 2014; Kirk, 2016; Illin-
sky and Steele, 2011 ; Munzner and Maguire, 201 5;
Gnanamgari, 1981; Evergreen, 2 016; Yau, 2011; Yau,
2013; Heer et al., 2010; Hardin et al. , 2012; Yuk and
Diamond, 201 4; Brath and Jonker, 2015; Brner and
Polley, 2014; Telea, 2007; Brner, 2015; Ware, 2010 ;
Ware, 2 012; Stacey et al., 2015; Hinderman, 2015)
and f or each one, we construc te d a diagram showing
the cla ssification that was described in the text.
We examined the classification hierarchies from
books together with hierarchies available from web
resources and existing tools. We also ma de n ote of
any advantages or disadvantages of a specific data vi-
sualization, if they were listed. For example in several
sources (O’Ne il and Schutt, 2014; Kirk, 2016; Illin-
sky and Steele, 2011) the authors stated that the pie
chart is not suitable for when you have mo re than 7
parts. The a dvantages and disadvantages reflected fe-
atures of the d ata v isu alizations that could determine
whether they are candid a te s for recommendation or
not, so they are crucial for the final mode l.
We identified that there are two b a sic views that
the classifications incorporate. The fir st one is a view
from the perspective of the task the user wants to per-
form. The second is a view from the perspective of the
characteristics of the data the user has available. This
is in line with data characteristics and task oriented
recommendation systems (Ka ur and Owonibi, 2017).
We have identified a prominent issue in the c la s-
sification hierarchies: they mix different views into
one without making a clear distinction betwee n them.
To avoid this issue, we have selected the root node o f
our model to be a question which would distinguish
between two views. The first view is from a task-
based perspective and it uses the representationa l go a l
or users intentions behind visualizing the data to re-
commend a suitable visualization. The second view is
from a data- driven perspective, where a visualization
recommendation is made based on gathe ring informa-
tion abo ut the user’s data. The root node of NEViM
is a question asking ”Do you know what your main
task is?” If the user answers ”Yes”, he is taken in to
the task-based branch. If he answer s ”No”, he is taken
straight into the data characteristics-based branch.
Once we established the root node, we had to
come up with internal nodes. The inter nal nodes ar e
questions which possess the ab ility to clearly distin-
guish different types of data visu alizations. The sub-
ject of such a question must be something that we de-
fine as a distinguishing feature. Based on the findings
we ma de in previous paragraphs, we have established
a list of distinguishing features and their hierarchy.
Based on the disting uishing features, w e have co n-
structed questions that ask whether that feature is pre-
sent or not. You can see an example of such questions
in Figure 3.
4.4.3 What Data Visualizations to Include?
(Establishing the Leaf Nodes)
Once we had figured out our model’s ba se structure,
distinguishing features and questions, the challenge
was, which data visualizations to include. We knew
that we would not be able to cover all the 300 types
DATA 2018 - 7th International Conference on Data Science, Technology and Applications
214
of data visualizations available (Bostock, 2017) in the
initial version of our model. We took a r ather quanti-
tative approach to the problem. We went through all
the different classification hierarchies we constructed
previously and extracted a list of the data visualizati-
ons that occur. We rem oved duplicates (different na-
mes for the same visualization, different layouts of the
same visualization) and we counted how many times
each data visualization occurred . The ones that occur-
red 5 times or more were included in our final model.
The final list contains 29 data visualizations and yo u
can see it below. Since one of our requirements f or the
final model is easy extensibility, we feel that 29 data
visualizations are appropriate for the initial model.
Table 1: Data visualizations included in NEViM.
Bar Chart Bubble Chart Cartogram
Choropleth Map Clustered Bar Connected Dot
Connection Map Density Plot Dot Map
Flow Map Heat Map Histogram
Line Chart Network Pie Chart
Proportional Map Radar Plot Scatter Plot
SPLOM Slope Graph Small Multiples
Stacked Area Stacked Bar Stacked Line
Table Timeline Treemap
Parallel Coordinates
4.4.4 Putting It All Together
We classified each of our leaf nodes (data visualizati-
ons) using the distingu ish ing features we constructed
previously. This revealed the path of internal nodes
and branches that leads to a certain leaf node. In other
words, it revealed which questions have to be answe-
red and how in order to get to a certain data visualiza-
tion.
We then combined all the classifications together
to construct the final model
1
. The model has 107 in-
ternal nodes and 105 leaf nodes. The model always
results in a recommendation. If no other suitable vi-
sualization is found, we recommend to use a table by
default. Tableau does this as well.
4.5 Testing the Model
4.5.1 Can the Model Compete with Existing
Solutions?
We carried out tests to determine whether our model
was able to compete with existing systems in terms of
similarity of solutions. We obtained 10 different test
data sets with various features (See Table 2). The data
1
The whole model as well as a prototype can be
viewed at a website dedicated to this research project:
http://www.datavisguide.com
sets were preprocessed to remove invalid entrie s and
to ensure that all the attributes w ere of the co rrect data
type.
For each data set, we formulated an example que-
stion that a potential user is a iming to answer. This
was done in order to determine which a ttributes of
the data would be used in the recommendation p ro-
cedure. M ost existing tools require the user to select
the specific attributes that they want to use for their
data visualization. By specifying these for each data
set we a ttempt to mimic this behavior. Table 2 shows
the data sets along with their descriptions.
We tested our m odel again st existing solutions
which are freely available: Tablea u (10.1.1), Watson
Analytics (version available in July 2017), Microsoft
Excel (15.28 Mac), Voyager (2) and Google Sheets
(version available in July 2017). For each system and
every data set, we a imed to achieve a recommenda-
tion for a data visualization that would answer the
question and incorporate all the specified a ttributes in
one graph as there is no possible way to answer the
question without incorpora ting th e specified attribu-
tes. Some systems solve more complex questions by
creating a series of different data visualizations, with
each visualization incorporating a different combina-
tion of attributes. We exclude d such solutions from
our test results because we feel that it is a workaround.
For Microsoft Excel and Google Sheets, th e recom-
mendation process results in several recommendati-
ons and the systems do no t rank them. For these cases
we recorded all valid recommendations.
Results
For data set 1 , all systems recommended a bar ch a rt.
Excel and Google Sheets also recommended a pie
chart. The r e commendations for data set 2 were either
line charts or bar charts. The specified question could
be answered by either of these. Watson Analytics was
not able to give a recommendation because it could
not recognize that the average price attribute was a
number. We have attempte d resolving this issue but
were not able to. For data set 3, the majority recom-
mendation was a clu ster ed bar chart, in line with the
recommendation mad e by NEViM. Data set 4 pro-
ved to be challenging for Voyager and Watson Ana-
lytics. Since the data was h ie rarchical and the ques-
tion was asking to see parts-of-whole, a suitable solu-
tion would be a tree m ap. A pie ch a rt shows parts-of-
whole, but does not indicate hierarchy. The question
asked for data set 5 could be answered using diffe-
rent types of da ta visualizations. Since it is asking to
analyze the correlation between 2 variables, a scatter
plot is a suitable solution. All systems recommended
it. Data set number 6 was an example of a social net-
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
215
Table 2: Results of the competitiveness test.
Data
set
1
2
3
4
5
Description Records
Favourite subjects
within a class of
students
7
Average prices of
cigarettes over several
years
8
Percentage of men and
women in EU countries
for 2016
28
Causes of death in
Kenya in 2012
12
Daily ice cream sales
information with
temperature
30
Question Used attributes Excel
Google
Sheets
Tableau
What does the
composition of the data
look like?
subject, no. of
students
bar chart,
pie chart
bar chart,
pie chart
bar chart
What was the
development of the
cigarette price over the
years?
year, average price
line chart,
bar chart
line chart line chart
Which 5 countries have
the highest percentage
of females?
country, % of men,
% of women
clustered bar chart,
scatter plot,
stacked bar chart
clustered
bar chart
proportional
symbol map
How big of a part does
each cause take?
cause of death, no. of
deaths, % of total
none pie chart tree map
Are ice cream sales
related to the weather?
income, temperature
scatter plot,
clustered
bar chart, line
chart,
stacked bar chart
line chart,
scatter
plot,
clustered
bar chart
scatter plot
Voyager
bar chart
bar chart
scatter
plot
none
scatter
plot
Watson
Analytics
bar chart
none
clustered
bar chart
none
scatter plot
NEViM
bar chart
line chart
clustered
bar chart
tree map
scatter plot
6
7
8
9
10
Email communication
between researchers
working together
461
Finishing times of
runners in the 2014
Boston Marathon
32K
Records of UFO
sightings with detailed
information
80K
List of cars and their
parameters
393
Origins and
destinations of flights
within the US
4K
Which researcher is
connected to most
people?
sender, receiver none none none
Which finishing time
interval was the most
common?
finishing time
scatter plot,
line chart
line chart,
histogram
histogram
Are there any clusters
of locations where
UFOs have been seen
more often?
latitude, longitude none none none
Are there any
relationships between
the different
parameters?
miles per gallon, no.
of cylinders,
displacement,
horsepower, weight,
acceleration, year
stacked
line chart
none none
Which city has the
most ingoing and
outgoing flights?
flight origin, flight
destination
none none
proportional
symbol map
scatter
plot
none
none
none
none
none
histogram
none
none
none
network
histogram
dot map
parallel
coordinates
connection
map
work, thus the most suitable visualization would be a
network. However, the answer to the specified ques-
tion could also be answered with a scatter plot as sug-
gested by Voyager. This is because networks can also
be represented as ad ja c ency matrices and the scatter
plot generated by Voyager is essentially an adjacency
matrix. Data set 7 and its question we re aimed at
visualizing distributions. Distributions c a n be visu-
alized, among others with histograms, scatter plots
and line charts. Data set 8 was an exam ple of spa-
tial data. Spatial data is best visualized th rough maps.
Tableau offers map visualizations but we suspec t that
it cannot plot on the map according to latitude and
longitude coordinates. Watson Analytics and Google
Sheets have the same issue. Microsoft Excel and Voy-
ager do not support map s at all. In Data set 9 the ans-
wer to the question was revealed through compar ing
7 attributes. This meant that the visualization has to
support 7 different variables. Both stacked line chart
and parallel coordinates are valid solutions. The fi-
nal data set 10 was again spatial. This time it could
be solved through plotting on a map but also by analy-
zing the distribution of the data set. Both p roportional
symbol map and connectio n map (as a flight implies
a connection between two cities) are valid so lutions.
Overall, we can observe that NEViM provided
usable solutions in all cases. The users have several
paths that they ca n take through NEViM to get to a re -
commendation, depending on what information they
know about the ir data o r their task. NEViM has an
advantage that it is not limited by implementation.
Since two of our data sets were aimed at spatial data
visualization (9 and 10) and one at network data vi-
sualization (6), some systems were not able to make
recommendations simply because they do not support
such visualization types. Furthermore, NEViM inclu-
des more types of visualizations than any of the cur-
rent systems, which results in recommendations for
specialty visualizations that can be more suitable for
a certain task. Another advantage is that it always re-
sults in o nly one recommenda tion, unlike Microsoft
Excel or Google Sh eets, where the user has to choose
which one out of the set of recommend ations to use.
According to our survey, the most used visualiza tion
tool which incorporates a recommender system is Ta-
bleau (28% of non-exp ert respondents). From the re-
sult table, we can see that in 5 out of 7 valid cases,
NEViM made the same recommendation as Tableau.
DATA 2018 - 7th International Conference on Data Science, Technology and Applications
216
Furthermore, in data set 3 Tableau also made a recom-
mendation for a Clustered Bar Ch a rt, like NEViM did,
but it was not the resulting recom mendation. One of
the attributes was the name of a country, so Tableau
evaluated the data as spatial. We have noticed that
whenever there is a geographical attribute, Tableau
prefers to recommend maps, even though they might
not be the most suitable solu tion.
4.5.2 Adding a New Data Visualization
We demonstrate that our model is easily extensible
by showing the pro cess o f adding a new data visua-
lization type - a Sankey diagr a m. Sankey diagrams
are specific types of flow diagrams and they display
quantities in proportion to one another. An exam ple
of a Sankey diagram can be seen in Figu re 2 .
Figure 2: Example of a Sankey diagram showing the distri-
bution of energy in a filament lamp (BBC, 2016).
We look into the classifications that we already
have and search for the most similar one. We find out
that the Tree Map has the same classification. So we
need to find a distinguishing feature between a Tree
Map and a Sankey diagram . That feature is, that a
Sankey diagr am shows flow. We search through the
model and find occurrences of a Tree Map. We then
add a qu e stio n asking ”Do you want to show flow?”.
If the user answers ”Ye s”, he g e ts a recommendation
for a Sankey diagram. If he answers ”No” he gets a
Tree Map. Figure 3 shows the two paths that a user of
NEViM can take to get to the Sankey diagram.
5 DISCUSSION & CONCLUSIONS
We mana ged to build a model for a data visualiza-
tion recommender system suited to non-experts cal-
led NEViM. Through testing, we have ma naged to
show that the resulting recommendations are similar
or identical to the ones ge nerated by existing soluti-
ons. Based on a review of existing work and a ex-
plorator y survey among users, we have put togeth er
requirements. This is a sho rt evaluation of how NE-
ViM managed to fulfill these:
1. Simplicity - Thanks to its question -based struc-
ture, usin g the model is simple. T he user only has
to answer yes or no questions. The basic structur e
is very straightforward.
2. Clarity - The result of our recommendation sy-
stem is a single data visualization, making it very
clear. We believe that n on-expert users need a
clear answer to their visu alization pro blem. If
they are given a choice between two or more vi-
sualizations in the en d, we believe that we have
failed at the task of recommen ding them the most
suitable one. We have narrowed their choices, but
still have not provided a clear answer. However,
this decision seems to be a c ontroversial one, so
it definitely needs to be validated through a user
study (See Section 6. ) In the case that none of
the data visualizations within the model are deter-
mined as suitable, the model still makes a recom-
mendation to visualize using a table.
3. Versatility - NEViM combines two d ifferent ty-
pes of data visualization rec ommend ation systems
as defined in (Kaur a nd Owonibi, 2017): task-
oriented and da ta character istics-oriented. These
two types are distinguished by two different star-
ting points within our mod el. Thanks to its base
structure the model can be easily implemen te d in
various different programming languag es a nd en-
vironm ents.
4. Extensibility - To illustrate the extensibility of the
model, we have added the Sankey diagram visua-
lization. Th is proved to be a doable task.
5. Education - This requirement has no t been met
yet. For suggestions on how we mean to fulfill it,
see Section 6.
6. Transparency - The traversal through our model
is logical enough that it is clear why a certain type
of data visualization was recommende d.
7. Self-learning - Our model is machine learn ing
friendly and techniques can be applied for it to be
able to self-learn. See our Section 6.
8. Competitiveness - Through testing we have pro-
ved that our model produces recommendations si-
milar or identical to existing solutions. It provided
suitable solutions for all cases tested, unlike exis-
ting solutions.
A possible disadvantage of NEViM could be tha t the
user has to either know what their main task is, or
know what type of data they have. The question is,
whether non -expert users will be able to determine
this. We believe that this could be fixed through user
testing to validate the overall structure of the mode l
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
217
Is your main
task to compare?
Do you know what
your main task is?
Is your main task
to analyse a specific
data feature?
Do you want to
show flow?
Do you want
to compare
proportions?
Do you want to
show hierarchy?
Sankey
Diagram
Do you want to
compare proportions
over time?
Yes No Yes
Yes No Yes
Yes
Do you know what
your main task is?
Is your data
statistical?
Do you want
to compare?
Do you want
to compare
quantities?
Do you want
to compare
proportions?
Do you want to
compare proportions
over time?
Do you want to
show flow?
Do you want to
show hierarchy?
Sankey
Diagram
Yes YesNoYesNoYesYesNo
Figure 3: Two possible paths to reach a Sankey diagram (left: t ask-based, right: data-based).
as well as the quality of the q uestions. The questi-
ons could be checked by a linguistics expert to see
whether the wording is suitable and does not lead to
possible ambiguous interpretations.
Another disadvantage might lie in the fact that
since we use data science termin ology in our ques-
tions, we risk that non-experts might not be familiar
with it and might not be a ble to answer the question.
A solution could be to clarify the terms using a dicti-
onary defin ition, which could pop up when the user
hovers over the unfamiliar term. The solution is more
part of the implementation phase, not the theoretical
phase which we discuss he re.
A difficulty in the usability of our model might be
that the traversal thro ugh it is quite lengthy. This is
due to the chosen question-b a sed approach. A poten-
tial fix for this could be to present some parts of the
model in the form of a multiple choice question. This
way, the user could see beforehand what other options
are available and might find a more suitable task they
want to pe rform. This is once again a problem that
could be fixed easier in the implementation pha se.
We have questioned whether the choice to rec om-
mend a table when no other suitable visualization is
found is the correct one. There is an ongoing de-
bate about when it is best to not visualize things, as
discussed by Stephanie Evergreen (Evergreen, 2016).
Within the implementation phase, d ata could be col-
lected to find out in how many cases the Table option
is reached, to identify whether it is necessary to furt-
her address this issue.
6 FUTURE WORK
We have proved that ther e is definitely a place for our
model in the data science world. The logical next step
would be to perform more tests with more data sets
and make improvements to the mo del. Then th e mo-
del could be tested with non-expert users. Such a user
study could evaluate the usability of the model as well
as its contribution.
The mode l could be implemented as a web appli-
cation and users could rate the resu lting recommenda-
tions, suggest new paths through the model or request
new visualization types to be included. This would
also validate the question paths that we have desig-
ned. The final re c ommendation could be enhanced
with useful information ab out the data visualization
type, tips on how to construct it, which tools to use
and examples of already made instances. This would
transform the model into a very useful educative tool
and fu lfill the Education requirement that we have set.
Another possible extension to the model could be
to add another view which would incorporate infor-
mation about the domain that the user’s data comes
from. There are data visu alizations that are more sui-
ted for a specific data domain than others. For exam-
ple, the area of economics has special types of data vi-
sualizations that are more suited to exposing different
econom ic indicators. This would make the m odel part
of the domain knowledge oriented data v isu alization
systems recommender systems category according to
(Kaur and Owonibi, 2017).
Thanks to its structure, NEViM is machine lear-
ning friendly. For example, n eural networks could
be used to make the model self-lea rning and self-
improving.
We could introduce different features that could
influence the visualization ranking - e.g. perceptual
qualities of different data visualization types. Now
that we have established a successful base, the possi-
bilities for further development are endless.
ACKNOWLEDGEMENTS
Research suppor te d by SGS grant No. SGS1 7/210/
OHK3/3T/18 and GACR grant No. GA18-18080S.
REFERENCES
BBC (2016). Heat transfer and efficiency.
Bertin, J. (1983). Semiology of graphics: diagrams, net-
works, maps.
Bostock, M. (2017). Data-driven documents.
Brath, R. and Jonker, D. (2015). Graph analysis and visu-
alization: discovering business opportunity in linked
data. John Wiley & Sons, Hoboken, NJ.
Brner, K. (2015). Atlas of knowledge: Anyone can map.
MIT Press, Cambridge, MA.
DATA 2018 - 7th International Conference on Data Science, Technology and Applications
218
Brner, K. and Polley, D. E. (2014). Visual insights: A practi-
cal guide to making sense of data. MIT Press, Cam-
bridge, MA.
Evergreen, S. D. (2016). Effective data visualization: The
right chart for your data. SAGE Publications, T hou-
sand Oaks, CA.
Gnanamgari, S. (1981). Information presentation through
default displays. PhD thesis, Univ. of Pennsylvania,
Philadelphia, PA.
Google (2017). Chart and graph types.
Gotz, D. and Wen, Z. (2009). Behavior-driven visualization
recommendation. In Proceedings of the 14th interna-
tional conference on Intelligent user interfaces, New
York, NY.
Hanrahan, P. (2006). Vizql: a language for query, analysis
and visualization. In Proceedings of the 2006 ACM
SIGMOD international conference on Management of
data, New York, NY.
Hardin, M. et al. (2012). Which chart or graph is right for
you?. tell impactful stories with data. Tableau Soft-
ware.
Heer, J. et al. ( 2010). A tour through the visualization zoo.
Queue, 8.5.
Hinderman, B. (2015). Building responsive data visualiza-
tion for the web. John Wiley & Sons, Hoboken, NJ.
IBM (2017). Smart data analysis and visualization.
Illinsky, N. and Steele, J. (2011). Designing data visu-
alizations: representing informational relationships.
O’Reilly Media, Sebastopol, CA.
Kaur, P. and Owonibi, M. (2017). A review on visualization
recommendation strategies. In Proceedings of the 12th
International Joint Conference on Computer Vision,
Imaging and Computer Graphics T heory and Appli-
cations, pages 266–273, Porto, Portugal.
Kirk, A. (2016). D ata visualization: A handbook for data
driven design. SAGE, London,UK.
Mackinglay, J. (1986). Automating the design of graphical
presentations of relational information. ACM Tran-
sactions on Graphics, 5.2:110–141.
Mackinlay, J. et al. (2007). Show me: Automatic presenta-
tion for visual analysis. IEEE Transactions on Visua-
lization and Computer Graphics, 13.6.
Microsoft (2017). Available chart types in office.
Munzner, T. and Maguire, E. (2015). Visualization analysis
and design. CRC Press, Boca Raton, FL.
O’Neil, C. and Schutt, R. (2014). Doing Data Science:
Straight Talk From The Frontline. OReilly Media, Se-
bastopol,CA.
Roth, S. F. and Mattis, J. (1990). Data characterization for
intelligent graphics presentation. SIGCHI Conference
on Human Factors in Computing Systems.
Satyanarayan, A . et al. (2017). Vega-lite: A grammar of
interactive graphics. IEEE Transactions on Visualiza-
tion and Computer Graphics, 23.1:341–350.
Stacey, M. et al. (2015). Visual intelligence: Microsoft tools
and techniques for visualizing data. John Wiley &
Sons, Hoboken, NJ.
Stolte, C. et al. (2002). Polaris: A system for query, ana-
lysis, and visualization of multidimensional relational
databases. IEEE Transactions on Visualization and
Computer Graphics, 8.1:52–65.
Telea, A. C. (2007). Data visualization: principles and
practice. CRC Pr ess, Boca Raton, FL.
Tukey, J. W. ( 1970). Exploratory Data Analysis. Addison-
Wesley, Reading,MA.
Vartak, M. et al. (2015). Seedb: supporting visual analytics
with data-driven recommendations. VLDB.
Viegas, F. et al. (2007). Manyeyes: a site for visualization
at internet scale. IEEE Transactions on Visualization
and Computer Graphics, 13.6.
Ware, C. (2010). Visual thinking: For design. Morgan Kauf-
mann, Burlington, MA.
Ware, C. (2012). Information visualization: perception for
design. Elsevier, Amsterdam, NL.
Wehrend, S. and Lewis, C. (1990). A problem-oriented
classification of visualization techniques. In Procee-
dings of the 1st Conference on Visualization’90, Los
Alamitos, CA.
Wongsuphasawat, K. (2017). Vega compass.
Wongsuphasawat, K. et al. (2016). Voyager: Exploratory
analysis via faceted browsing of visualization recom-
mendations. IEEE Transactions on Visualization and
Computer Graphics, 22.1:649–658.
Yau, N. (2011). Visualize This: The FlowingData Guide to
Design, Visualization, and Statistics. John Wiley and
Sons, Hoboken, NJ.
Yau, N. (2013). Data points: Visualization that means so-
mething. John Wiley & Sons, Hoboken, NJ.
Yuk, M. and Diamond, S. (2014). Data visualization for
dummies. John Wiley & Sons, Hoboken, NJ.
Zhou, M. X. et al. (2002). Building a visual database for
example-based graphics generation. INFOVIS 2002
IEEE Symposium.
Zhou, M. X. and Feiner, S. K. (1998). Visual task charac-
terization for automated visual discourse synthesis. In
Proceedings of the SIGCHI conference on Human fac-
tors i n computing systems, Boston, MA.
Knowledge at First Glance: A Model for a Data Visualization Recommender System Suited for Non-expert Users
219