Evaluating Differences in Insights from Interactive Dimensionality

Reduction Visualizations Through Complexity and Vocabulary

Mia Taylor

, Lata Kodali

, Leanna House

and Chris North

Department of Computer Science, Virginia Tech, U.S.A.

Department of Statistics, Virginia Tech, U.S.A.

Keywords:

Visualization, Dimensionality Reduction, Logistic Regression, Applied Natural Language Processing.

Abstract:

The software, Andromeda, enables users to explore high-dimensional data using the dimensionality reduction

algorithm Weighted Multidimensional Scaling (WMDS). How data are projected in WMDS is determined by

weights assigned to variables, and with

Andromeda

, the weights are set in response to user interactions. This

work evaluates the impact of such interactions on student insight generation via a large-scale study implemented

in a university introductory statistics course. Insights are analyzed using complexity metrics. This analysis is

extended to compare insight vocabulary to gain an understanding of differences in terminology. Both analyses

are conducted using the same semi-automated method that applies basic natural language processing techniques

and logistic regression modeling. Results show that speciﬁc user interactions correlate to differences in the

dimensionality and cardinality of insights. Overall, these results suggest that the interactions available to users

impact their insight generation and therefore impact their learning and analysis process.

1 INTRODUCTION

Visualizations are typically evaluated via task com-

pletion or insight generation. For task-based evalua-

tions, researchers ask analysts to complete a task where

metrics such as analysts’ accuracy and completion

time are measured. For insight-based evaluations, re-

searchers analyze participant-generated insights about

data. While asking analysts to complete a task seems

like a simpler method of evaluating a visualization,

some argue that because visualizations are created to

generate insights into data, then they should be eval-

uated in a similar manner (Card et al., 1999; North,

2006).

Deﬁning “insight” and its “complexity” is beyond

the scope of this paper. Thus, we borrow from previous

work. An insight is deﬁned as “as an individual obser-

vation about the data by the participant” (Saraiya et al.,

2005). We assess the complexity of insights using

three metrics: dimensionality, cardinality, and relation-

ship cardinality (Self et al., 2017; Self et al., 2018).

The dimensionality is the number of variables or at-

tributes explicitly mentioned. The cardinality is the

number of observations explicitly mentioned. Lastly,

relationship cardinality is the number of comparisons

made between variables and/or observations.

Insight complexity metrics are usually calculated

by hand. This imposes analytic limitations. First,

manual calculations require intensive labor. Second,

researcher annotation can be subjective. Third, the

complexity metrics do not describe the difference in

insight vocabulary between visualizations. The in-

sight analysis method in this work builds upon current

insight-based evaluation methods by automatically cal-

culating insight dimensionality, cardinality, and rela-

tionship cardinality. Using applied natural language

processing and logistic regression statistical modeling

for complexity metrics extends naturally to conducting

a keyword analysis on the insights.

Our method for analysis is applied to a large-scale

case study on the insights that students generate us-

ing an interactive dimensionality reduction application

called

Andromeda

. This work addresses the following

research questions and contributions.

Research Questions

How does insight complexity relate to interaction

types available within Andromeda?

How does insight vocabulary relate to interaction

types available within Andromeda?

What does the analysis of insight complexity and

vocabulary portray about student learning with

Andromeda?

158

Taylor, M., Kodali, L., House, L. and North, C.

Evaluating Differences in Insights from Interactive Dimensionality Reduction Visualizations Through Complexity and Vocabular y.

DOI: 10.5220/0011663500003417

In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 3: IVAPP, pages

158-165

ISBN: 978-989-758-634-7; ISSN: 2184-4321

 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

Contributions

A novel methodology for insight evaluation using

natural language processing (NLP) techniques.

An evaluation of the relationship between visual

analytics interaction types on insight generation

using the above novel methodology.

2 RELATED WORK

2.1 Insight-Based Visualization

Evaluation

A common way to evaluate a visualization is through

the insights it helps generate. North (2006) postu-

lates that “the purpose of visualization is insight. The

purpose of visualization evaluation is to determine

to what degree visualizations achieve this purpose”

(North, 2006). There exist multiple methodologies to

perform insight-based visualization evaluation.

A popular insight evaluation method commonly

used by visualization researchers is the Saraiya et

al. (2005) characterization which measures an insight’s

degree of directness, correctness, breadth, and depth

(Saraiya et al., 2005). A similar characterization

by North (2006) measures domain value, complex-

ity, depth, subjectivity, unexpectedness, and relevance

(North, 2006). Visualization researchers may apply

these characterizations by manually assigning char-

acteristic values to each insight and comparing these

values within their data analysis. Multiple works con-

tinue to apply and adapt these characterizations. The

O’Brien et al. (2011) characterization also counts met-

rics such as the insights per minute(O’Brien et al.,

2011). The Gomez et al. (2014) characterization

compares insight results with a task-based evaluation

(Gomez et al., 2014). Lastly, the He et al. (2021) char-

acterization also analyzes interaction logs and insight

quality (He et al., 2021). While these works showcase

the effectiveness of applying the Saraiya et al. (2005)

insight characterization, they are dependent on man-

ual insight characterization and do not describe the

difference in the language used in insights between

visualizations. This case study presents work that semi-

automatically analyzes insights by complexity metrics

and vocabulary.

2.2 Andromeda

Andromeda

is an interactive visualization and data

analysis tool that was originally designed to enable

analysts of all skill levels to explore high-dimensional

data (Self et al., 2016; Self et al., 2015). The visualiza-

tion relies on a dimensionality reduction algorithm.

Dimensionality reduction algorithms take in high-

dimensional data as input and outputs low-dimensional

data that is representative of the input data. The low-

dimensional data is usually represented in 2- or 3-

dimensions for visualization purposes.

Andromeda

speciﬁcally uses Weighted Multidimensional Scaling

(WMDS) (Kruskal and Wish, 1978). WMDS is a di-

mensionality reduction algorithm that associates each

dimension in the data with a weight that represents the

dimension’s relative importance in the visualization.

With

Andromeda

, users can explore the dimension

(variable) weights and the projections to better under-

stand the high-dimensional data.

Andromeda

is often studied using a dataset describ-

ing animals because analysts do not need any domain

knowledge to understand the dataset. Any future refer-

ence to the Animals dataset is speciﬁcally referring to

the dataset created by Xian et al. called Animals with

Attributes 2 (AWA2) (Xian et al., 2019). The follow-

ing section describes the interaction types available

Andromeda

using a reduced version of the Animals

dataset as an example. There are three types of inter-

actions in

Andromeda

that enable analysts to explore

high-dimensional data: surface-level interaction, para-

metric interaction, and observation-level interaction.

2.2.1 Surface-Level Interaction

Surface-level interaction (

SLI

) allows users to high-

light one or more data points by clicking or hover-

ing. This interaction enables users to view the data

points’ values without altering the projection or vari-

able weights. Thus,

SLI

does not interact with WMDS.

An example is shown in ﬁg. 1.

Figure 1: Surface-level interaction (

SLI

) in

Andromeda

This is the initial projection with all variables weighted

equally. Applying

SLI

, the

Elephant

point was clicked,

and the cursor is hovering over

Squirrel

. The attribute, or

feature, values of Elephant and Squirrel are shown on the

right-hand side in orange and yellow, respectively.

SLI

does

not affect the projection nor the variable weights.

Evaluating Differences in Insights from Interactive Dimensionality Reduction Visualizations Through Complexity and Vocabulary

159

Figure 1 and subsequent

Andromeda

ﬁgures are

from the current web-version of

Andromeda

. The

study described in this work used an older version

Andromeda

that offered identical functionality with

negligible interface differences.

2.2.2 Parametric Interaction

The second interaction type, Parametric Interaction

(

), allows users to interact with WMDS by changing

variable weights that are represented by sliders. This

assigns different levels of importance to the variables

such that a variable with a greater weight inﬂuences

the layout more than a variable with a lesser weight.

Figure 2 shows an example of PI.

2.2.3 Observation-Level Interaction

The ﬁnal type of interaction is Observation-Level In-

teraction (

OLI

) (Endert et al., 2011).

OLI

allows users

to reposition data points in the layout. This indirectly

communicates variable weight changes via inverse

WMDS (House et al., 2015). After dragging points to

different locations on the projection and clicking the

“Update Layout” button, Andromeda solves for the op-

timal weights that preserve the user-deﬁned projection.

Then, Andromeda updates its display with the new

weights. Figure 3 shows an example usage of OLI.

2.2.4 Studies in Education

Andromeda

was designed to enable data analysts of

all skill levels to explore high-dimensional data (Self

et al., 2015).

Andromeda

is available publicly as a

web application (Andromeda Website, ).

Figure 2: Parametric interaction (

) in Andromeda. The

sliders for

Size

and

Speed

were dragged to the right to

increase their weight. The layout differs from the initial

layout in ﬁg. 1 that relied on equal weights for all variables.

Hovering over the variable

Size

changes the size of the ani-

mal circles to be proportional with the animal’s

Size

value.

Because

Size

and

Speed

have higher weights, animals with

similar

Size

and

Speed

values tend to be projected near

each other such as the Siamese Cat and Squirrel.

Figure 3: Observation-level interaction (

OLI

) in

Andromeda

The

Squirrel

and

Siamese Cat

were dragged close to-

gether while the

Elephant

and

Blue Whale

were dragged

close together, but separate from the

Squirrel

and

Siamese Cat

. After clicking the update layout button, the

layout changes by learning variable weights that describe

how the dragged points are similar or dissimilar. These

learned weights are applied to the entire projection. As

shown by the high

Furry

variable weight,

Furry

best de-

scribes how the dragged points relate to each other.

Previous educational studies (Self et al., 2014;

Zeitz et al., 2018) analyzed how college students

performed on a series of data analysis assignments

by manually characterizing insight diversity (Amar

et al., 2005) and complexity. Insight complexity is

measured by dimensionality, cardinality, and relation-

ship cardinality. Results showed that students tended

to think in low dimensions by default. As a result

of using

Andromeda

, students generated insights that

were higher in dimensionality and more complex. The

work presented in this study uses the same measures

of insight complexity using manual calcuations. Semi-

automated metric calculations are described in subsec-

tion 3.2.

3 METHODS

3.1 Experiment Design

The large-scale classroom experiment was conducted

in the undergraduate introductory statistics course at a

university in Spring 2017. The course had a lecture por-

tion and an additional “recitation” section which was

a 50-minute small group section per week. Recitation

sections were on Mondays, Tuesdays, Wednesdays,

or Thursdays. For the study, students used the web-

based version of

Andromeda

. A total of 152 students

participated.

During the lecture portion of the course, all stu-

dents were taught about Weighted Multidimensional

Scaling (WMDS) and Andromeda. Students were able

IVAPP 2023 - 14th International Conference on Information Visualization Theory and Applications

160

to familiarize themselves with

Andromeda

using the

Animals dataset.

Four versions (one version per recitation) of

Andromeda

were given to students during recitation.

Each student was enrolled in a single recitation section.

Data were collected from the students during recita-

tion. The list below describes the different versions of

Andromeda.

1. NONE

: Only has access to surface-level interactions.

Essentially, static WMDS.

2. PI

: Has access to parametric and surface-level in-

teractions.

3. OLI

: Has access to observation-level and surface-

level interactions.

4. BOTH

: Has access to parametric, observation-level,

and surface-level interactions.

These four versions of

Andromeda

were randomly as-

signed to entire recitation groups, with Monday using

(42 students), Tuesday using

NONE

(40 students),

Wednesday using

OLI

(40 students), and Thursday

using BOTH (30 students).

Students completed surveys throughout recitation

where they optionally consented to have their sub-

mission data collected for this study. The data were

collected with approval under IRB #21-911. Students

were asked to write down three insights about the Ani-

mals dataset before and after using their assigned ver-

sion of

Andromeda

. Insights generated before using

their version are called pre-recitation insights, while

insights generated afterward are called post-recitation

insights.

3.2 Insight Analysis Method

The same insight analysis method is used to deter-

mine differences in insight complexity and vocabulary.

The difference in insight complexity is calculated by

measuring differences in dimensionality, cardinality,

and relationship cardinality. The difference in insight

vocabulary is calculated by measuring differences in

word count. See ﬁgs. 4 to 6 to follow the cleaning

and vectorization of a single insight. The complexity

metric names have been shortened for presentation.

DIM

CARD

, and

REL CARD

are short for dimensionality,

cardinality, and relationship cardinality, respectively.

3.2.1 Clean Insights

The following text cleaning and processing steps are

applied in the context of this study but can be altered

for other datasets and experimental setups as appropri-

ate.

Combine Three Insights. Concatenate each set of

three insights generated by students into a single

response.

Remove Stop Words. Remove unimportant words

such as “the”, “and”, etc. These words are not

important for analysis. The NLTK

pre-made stop

word list was used.

Apply Lemmatization. Lemmatization is the natu-

ral language process of grouping forms of a word

into a single word. For example, “changing” and

“changed” are changed to their base form “change”.

Lemmatization was done with the NLTK

lemma-

tizer and manual lemmatization was done for any

word forms that the NLTK lemmatizer missed.

Combine

feature

attribute

variable

into a

Single Keyword. For the purposes of this study,

these words have the same meaning and are used

interchangeably by students. These words are con-

verted into variable.

3.2.2 Vectorize Insights

The cleaned insights are vectorized by calculating the

complexity metrics, counting the number of occur-

rences of each word, and normalizing the values. The

values are normalized so that the output of the logistic

regression models, as described in the next step, have

comparable magnitudes. The vectorization process

can be easily altered for its usage context. Figure 6

shows an example vectorized insights.

The insight metrics are calculated by replacing

instances of observations, variables, and compara-

tive words with their metric name. For the animal

dataset, convert all instances of variable names (such

Smelly

and

Size

) into the keyword

DIM

. Convert

all instances of observation names (such as

Giraffe

and

Gorilla

) into the keyword

CARD

. This does not

include group words like

Mammals

. Lastly, any in-

stances of comparative words or phrases like

Similar

or Equal are converted into the keyword REL CARD.

3.2.3 Develop Logistic Regression Model

Consider the following notation to further develop the

methods. Let

be insight

belonging to the set of all

insights

that consists of insights collected from two

insight groups. Let

∈ Y

be a binary indicator for

whether insight

belongs to the ﬁrst or second insight

group to compare. Let

be the set of all covariates

to be compared. In this case,

consists of the three

“As expected, the bobcat is

larger than the spider monkey.”

Figure 4: An single, example insight referencing the Animals

dataset.

Natural Language Toolkit: www.nltk.org

Evaluating Differences in Insights from Interactive Dimensionality Reduction Visualizations Through Complexity and Vocabulary

161

expect

CARD

large

REL CARD CARD

Figure 5: The cleaned version of the insight from ﬁg. 4.

Complexity Metrics Vocabulary

DIM CARD REL CARD expect large ...

0 2 1 1 1 ...

Figure 6: The vectorized version of the insight excerpt from

ﬁg. 4. Because the insight mentioned two animals, its car-

dinality score is two. The values are not normalized. The

ellipsis (...) represents words that would be present in other

insights if this insight were used in a study.

complexity metrics and all unique words present across

all insights. After vectorizing

, let

represent the

value of covariate c in insight i.

A logistic regression model is used as a proba-

bilistic, binary classiﬁer based on observed covariates.

Here, a logistic regression model is used to classify

with probability the group from which insights are col-

lected given metric values and word counts; i.e., to

classify y

from x



1 + e

−β



−1

(1)

where

represents the probability

= 1

given

By ﬁtting this model, the relationship between the

covariate

and group assignments is learned. The co-

efﬁcient,

, reﬂects the direction and strength of this

relationship. A positive

indicates that the proba-

bility

increases with value

. A large

indicates

a large change in probability. Additionally, the

magnitudes are comparable across models because the

vectorized insights are normalized.

3.2.4 Test Signiﬁcance of Models

Hypothesis testing is used to determine whether

statistically signiﬁcant. A two-sided t-test with a type

I error,

, set to 0.1 is used. This value is used, as

opposed to 0.05, because of the low cost of type I er-

rors (false positives) in the study. The null hypothesis

is that there is no relationship (

= 0)

between

element

and response

. When the p-value of the test

is less than

, the null hypothesis is rejected and it is

claimed that there is a relationship between

and

which supports that the value

is a signiﬁcant word or

metric in the comparison.

Because a logistic regression model is ﬁtted for

each word, there must be a control for multiple testing.

To do so, the Benjamini-Hochberg procedure (Ben-

jamini and Hochberg, 1995) is used. This is arguably

a liberal correction method that is appropriate in the

case of visualization studies where the cost of type I

errors is low. Applying this correction ensures a type I

error of α = 0.1 across all tests.

3.2.5 Identify Signiﬁcant Keywords

After ﬁtting a logistic regression model for each co-

variate in

and applying the Benjamini-Hochberg pro-

cedure, the values with

coefﬁcients that are statisti-

cally signiﬁcant are identiﬁed. Thus, the ﬁnal output

is a list of keywords and/or metrics that, univariately,

explain the signiﬁcant variance between the set of in-

sights X in the comparison.

3.2.6 Application

Our method described in this subsection is used to

identify differences in insight complexity and vocab-

ulary based on binary-labeled data from the study de-

scribed in subsection 3.1. This case study presents

the change in pre- to post-recitation insights for each

version of

Andromeda

as well as the difference in in-

sights between

and

OLI Andromeda

. This compar-

ison was chosen because these versions of represent

each WMDS interaction as opposed to

NONE

(which

is essentially static WMDS) and

BOTH

which contains

both WMDS interactions. For identifying changes

in pre- to post-recitation insights, the insights are la-

beled

y = 1

for pre-recitation insights and

y = 0

for

post-recitation insights. For identifying differences in

insights between interaction types, the post-recitation

insights for each interaction type are labeled

y = 1

for

the ﬁrst interaction type and y = 0 for the other.

4 RESULTS

4.1 Changes in Insight Complexity

The results of changes in insight complexity are shown

in table 2. The dimensionality of insights consistently

increases across versions of

Andromeda

. This means

that regardless of the interaction types available to

students, they produced insights that described signiﬁ-

cantly more dimensions after recitation. Unexpectedly,

the results are not the same for cardinality and rela-

tionship cardinality. The cardinality of insights did not

signiﬁcantly increase or decrease throughout recita-

tion. Relationship cardinality signiﬁcantly decreases

for insights generated using SLI.

4.2 Changes in Insight Vocabulary

The results of changes in insight vocabulary are

shown in table 3. Insights generated using

had

the most keywords identiﬁed.

Mammal

Far

, and

Would

increased the probability that insights were

pre-recitation while

Variable

Weight

Increase

IVAPP 2023 - 14th International Conference on Information Visualization Theory and Applications

162

and

Lot

increased the probability that insights were

post-recitation. For

NONE

-supported insights, the term

Even

decreases the probability that insights were post-

recitation while

Much

increases it. Lastly, for

OLI

- and

BOTH

-Andromeda, the terms

Variable

and

Weight

resulted in an increase in the probability that insights

were generated post-recitation.

BOTH

also identiﬁed

Change

as having this result. For insights generated us-

ing

OLI

, and

BOTH

, using

Variable

and

Weight

results in an increased probability that these insights

were generated post-recitation.

4.3 Differences in Parametric and

Observation-Level Interaction

The results on differences in insight vocabulary be-

tween

and

OLI Andromeda

are shown in table 4.

Only post-recitation insights were used for this com-

parison because the intention was to uncover vocabu-

lary differences between insights generated using

and

OLI

. The goal of performing this comparison is

to discover whether students tend to generate different

insights based on access to a single WMDS interac-

tion in

Andromeda

. Insights generated using

are

strongly associated with the dimensionality metric and

the words

Weight

Variable

Increase

, and

One

On the other hand, insights generated using

OLI

tend

to use terms associated with the cardinality metric and

the words Similar, Away, and Far.

Table 1: Descriptive statistics of pre- and post-recitation in-

sight complexity metrics across all versions of

Andromeda

Statistic Dimensionality Cardinality Rel. Cardinality

Pre Post Pre Post Pre Post

mean 0.85 2.06 4.84 4.76 3.76 3.48

std 1.53 2.88 3.31 3.17 1.93 2.09

min

0 0 0 0 0 0

max 11 34 19 13 12 11

Table 2: Complexity metric differences between insights gen-

erated pre- and post-recitation with a version of

Andromeda

Dimensionality

signiﬁcantly increases across all versions

of Andromeda.

Groups Term Beta Std. Err. Prob. ↑ Signif.

NONE DIMENSIONALITY -1.0116 0.522 Post *

CARDINALITY 0.0083 0.198 Pre

RELATIONSHIP CARDINALITY 0.344 0.206 Pre *

PI DIMENSIONALITY -1.3018 0.282 Post ***

CARDINALITY 0.268 0.184 Pre

RELATIONSHIP CARDINALITY 0.003 0.181 Pre

OLI DIMENSIONALITY -0.445 0.227 Post **

CARDINALITY -0.272 0.193 Post

RELATIONSHIP CARDINALITY 0.179 0.194 Pre

BOTH DIMENSIONALITY -0.6936 0.267 Post ***

CARDINALITY 0.090 0.210 Pre

RELATIONSHIP CARDINALITY 0.042 0.210 Pre

Table 3: Signiﬁcant keyword differences between in-

sights generated pre- and post-recitation with a version of

Andromeda

Even

is a signiﬁcant term when comparing

NONE

insights. An insight is more likely to belong to (Prob.

↑) the Pre-recitation insights if it uses the term Even more.

Groups Term Beta Std. Err. Prob. ↑ Signif.

NONE Even 0.626 0.307 Pre **

Much -0.590 0.353 Post *

PI Mammal 0.441 0.228 Pre *

Far 0.503 0.269 Pre *

Would 0.563 0.309 Pre *

Variable -0.919 0.265 Post ***

Weight -0.484 0.209 Post **

Lot -0.430 0.253 Post *

Increase -0.423 0.242 Post *

OLI Variable -0.710 0.279 Post **

Weight -1.182 0.546 Post **

BOTH Variable -0.649 0.247 Post ***

Weight -0.610 0.253 Post **

Change -0.726 0.392 Post *

Table 4: Signiﬁcant metric and keyword differences be-

tween insights generated after recitation with

vs.

OLI

Andromeda

Even

is a signiﬁcant term when comparing

NONE

insights. An insight is more likely to belong to (Prob.

↑) the Pre-recitation insights if it uses the term Even more.

Groups Term Beta Std. Err. Prob. ↑ Signif.

PI vs. OLI DIMENSIONALITY -0.612 0.213 PI ***

Weight -0.725 0.258 PI ***

Variable -0.619 0.226 PI ***

Increase -0.738 0.371 PI **

One -0.420 0.252 PI *

CARDINALITY 0.536 0.199 OLI ***

Similar 0.521 0.196 OLI ***

Away 0.492 0.253 OLI *

Far 0.609 0.318 OLI *

5 DISCUSSION

5.1 Changes in Insight Complexity

To address the ﬁrst research question in this study, the

changes in insight complexity are measured per inter-

action type in

Andromeda

. The signiﬁcant increase

in insight dimensionality shows that students are dis-

cussing more dimensions in their responses. The same

increase does not exist for cardinality and relationship

cardinality. Generally, this means that students insights

are more complex in a consistent manner across ver-

sions, except for those using

NONE

who have decreased

relationship cardinality. These ﬁndings suggest that

students are either initially comfortable with cardinal-

ity and relationship cardinality or found these concepts

A single asterisk (*) indicates that

p <= 0.1

, a double

asterisk (**) indicates that

p <= 0.05

, and a triple asterisk

(***) indicates that

p <= 0.01

. See subsection 3.2 for an

explanation of how signiﬁcance is determined.

Evaluating Differences in Insights from Interactive Dimensionality Reduction Visualizations Through Complexity and Vocabulary

163

difﬁcult to understand. Considering that cardinality

and relationship cardinality had averages of 4.84 and

3.76 respectively in the pre-recitation insights, it is

likely that students were initially comfortable with

these concepts. For comparison, dimensionality had

an initial average of 0.85. These results mirror the

ﬁndings in (Self et al., 2014; Zeitz et al., 2018) which

were calculated manually.

While dimensionality does signiﬁcantly increase

across all versions of

Andromeda

, the average dimen-

sionality is only 2.06. This means that insights on

average describe approximately 2 dimensions of the

data. Thus, while dimensionality scores increase, they

do not increase enough such that the average insight

can be considered high-dimensional (

> 2

dimensions).

5.2 Changes in Insight Vocabulary

To address the second research question in this study,

the changes in insight vocabulary are measured per

interaction type in

Andromeda

. Insights generated

with all versions except

NONE

increasingly used the

words

Variable

and

Weight

. These terms are di-

rectly associated with WMDS and their increased us-

age shows that students felt more comfortable with

WMDS concepts at the end of recitation. This learning

was not supported by

NONE

which is expected consider-

ing

NONE

does not support interactive WMDS. Insights

generated with

NONE

do not have any signiﬁcant in-

crease or decrease in word usage directly related to

the data or

Andromeda

. Upon further investigation, in-

sights using

Even

are using the word within the phrase

“even though” as a way to contradict expectations in

an insight. The word

Much

is used as a comparison

word in phrases like “much more”. This conﬁrms that

even with just

NONE

, there is a difference in insight

vocabulary at the beginning and end of recitation.

5.3 Differences in Parametric and

Observation-Level Interaction

The signiﬁcant metric and keyword differences be-

tween insights generated using

and

OLI

are re-

ported in table 4. There is a clear dichotomy in the

results comparing PI and OLI.

For the insight complexity metrics, insights gen-

erated using

describe variables more as shown

by the

Dimensionality

results while insights gen-

erated using

OLI

describe observations as shown by

Cardinality

. There is no signiﬁcant difference in

Relationship Cardinality

between the versions.

There is a similar relationship in the signiﬁcant key-

word differences. The words associated with an in-

crease in probability that an insight was generated

using

are strongly related to WMDS mechanics.

On the other hand, insights generated using

OLI

are

associated with words that relate to the interpretation

of WMDS and relationships within the data.

5.4 Educational Implications

To address the third research question in this study, the

changes in insight complexity and vocabulary per in-

teraction type are considered in an educational context.

The initial educational goal determines whether the

recitation aided successful learning. Results showed

that insight complexity across

OLI

, and

BOTH

Andromeda

improves consistently through increased

dimensionality scores, however cardinality and rela-

tionship cardinality do not signiﬁcantly change. Also,

the average insight has a dimensionality of about 2

which is lower than hoped for. If the goal was to teach

students to think high-dimensionally and understand

WMDS mechanics, this can be considered a success,

however, there is room for improvement. If the goal

was to teach students to understand the concept of sim-

ilarity and relationships within the data, then the met-

rics show that students did not signiﬁcantly improve.

Because the

and

OLI

post-recitation insight compar-

ison showed that

OLI

insights had stronger cardinality,

focusing more on

OLI

in the classroom may prove to

increase the cardinality of insights. Given there are

insight differences based on versions of

Andromeda

the sequential application of learning objectives may

better suit the learning process.

5.5 Visualization Research Implications

Conducting a vocabulary-based insight analysis pro-

vided further context into the insights that was not

captured by the traditional complexity metrics. As

mentioned, this is the ﬁrst use of natural language pro-

cessing in insight-based visualization research. NLP

methods are good for providing quick descriptions of

interesting patterns in the data, however, it is not as

in-depth as manual methods. For example, not all re-

sults were immediately understandable and required

researchers to look more deeply into the insights to

gain appropriate context. This may prove to be an

obstacle if participants generate insights using more

open-ended interactions in terms of language. De-

spite this, the insight analysis method used was able

to calculate traditional metrics used in visualization re-

search. These metric results mirrored previous studies

related to

Andromeda

in education. Along with this,

signiﬁcant terminology used by participants was iden-

tiﬁed to provide a fuller understanding of how students

generate insights.

IVAPP 2023 - 14th International Conference on Information Visualization Theory and Applications

164

6 CONCLUSIONS

There exist potential avenues for future work. Quantita-

tively analyzing natural language is faster than manual

annotation and less subject to researcher bias. The

description of insight differences provided by this

study can give researchers a general understanding

of how different visualizations enable insight genera-

tion. When used in combination with insight complex-

ity metrics, the results provide a more holistic view

of participant insights. The insight analysis method

currently only analyzes insights on a per-word basis.

Extending the method to look at phrases, rather than

individual words, may yield interesting results. In this

case, a phrase-based approach may better capture the

idea of groups of points such as “aquatic animals” or

“physical traits”. Within education research, the insight

analysis may also be helpful to develop an automatic

grading scheme of natural language insights.

The presented case study analysis identiﬁes differ-

ences in insight complexity and vocabulary within

Andromeda

. Across all interaction types available

within

Andromeda

, the dimensionality of insights in-

creases with its usage. While the insights see con-

sistent complexity changes, their vocabulary differs

based on the interaction types available. When compar-

ing insights generated with parametric interaction and

observation-level interaction, it is clear that insights

generated with parametric interaction are associated

with WMDS-related terminology, while insights gener-

ated with observation-level interaction tend to describe

WMDS interpretations of relationships in the data. The

analysis method presented in this work can be applied

and improved to further visualization research that

seeks to understand through automated processes.

REFERENCES

Amar, R. A., Eagan, J. R., and Stasko, J. T. (2005). Low-

level components of analytic activity in information

visualization. In INFOVIS.

Andromeda Website. Andromeda: Semantic interaction

for dimension reduction. https://infovis.cs.vt.edu/

andromeda.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the

false discovery rate: A practical and powerful approach

to multiple testing. Journal of the Royal Statistical

Society Series B (Methodological), 57(1):289–300.

Card, S. K., Mackinlay, J. D., and Shneiderman, B., editors

(1999). Readings in Information Visualization: Using

Vision to Think. Morgan Kaufmann Publishers Inc.,

San Francisco, CA, USA.

Endert, A., Han, C., Maiti, D., House, L., Leman, S., and

North, C. (2011). Observation-level interaction with

statistical models for visual analytics. In Visual An-

alytics Science and Technology (VAST) 2011 IEEE

Conference, pages 121–130.

Gomez, S. R., Guo, H., Ziemkiewicz, C., and Laidlaw, D. H.

(2014). An insight- and task-based methodology for

evaluating spatiotemporal visual analytics. In 2014

IEEE Conference on Visual Analytics Science and

Technology (VAST), pages 63–72.

He, C., Micallef, L., He, L., Peddinti, G., Aittokallio, T., and

Jacucci, G. (2021). Characterizing the quality of in-

sight by interactions: A case study. IEEE Transactions

on Visualization & Computer Graphics, 27(08):3410–

3424.

House, L., Leman, S., and Han, C. (2015). Bayesian Visual

Analytics (BaVA). Journal of Statistical Analysis and

Data Mining, 8(2):1–13.

Kruskal, J. and Wish, M. (1978). Multidimensional Scaling.

SAGE Publications, Inc.

North, C. (2006). Toward measuring visualization insight.

IEEE Computer Graphics and Applications, 26(3):6–9.

O’Brien, T., Ritz, A., Raphael, B., and Laidlaw, D. (2011).

Gremlin: An interactive visualization model for ana-

lyzing genomic rearrangements. IEEE transactions on

visualization and computer graphics, 16:918–26.

Saraiya, P., North, C., and Duca, K. (2005). An insight-

based methodology for evaluating bioinformatics vi-

sualizations. IEEE Transactions on Visualization and

Computer Graphics, 11(4):443–456.

Self, J. Z., Dowling, M., Wenskovitch, J., Crandell, I.,

Wang, M., House, L., Leman, S., and North, C. (2018).

Observation-level and parametric interaction for high-

dimensional data analysis. ACM Transactions on Inter-

active Intelligent Systems, 8(2).

Self, J. Z., House, L., Leman, S., and North, C. (2015). An-

dromeda: Observation-level and parametric interaction

for exploratory data analysis. Blacksburg.

Self, J. Z., Hu, X., House, L., Leman, S., and North, C.

(2016). Designing usable interactive visual analytics

tools for dimension reduction. In CHI 2016 Work-

shop on Human-Centered Machine Learning (HCML),

volume May, page 7.

Self, J. Z., Self, N., House, L., Evia, J. R., Leman, S.,

and North, C. (2017). Bringing interactive visual an-

alytics to the classroom for developing eda skills. In

Proceedings of the 33rd Annual Consortium of Com-

puting Sciences in Colleges (CCSC) Eastern Regional

Conference, page 10.

Self, J. Z., Self, N., House, L., Leman, S., and North, C.

(2014). Improving students’ cognitive dimensionality

through education with object-level interaction.

Xian, Y., Lampert, C. H., Schiele, B., and Akata, Z. (2019).

Zero-shot learning—a comprehensive evaluation of

the good, the bad and the ugly. IEEE Transactions on

Pattern Analysis and Machine Intelligence, 41(9):2251–

2265.

Zeitz, J., Self, N., House, L., Evia, J. R., Leman, S., and

North, C. (2018). Bringing interactive visual analytics

to the classroom for developing eda skills. Journal of

Computing Sciences in Colleges, 33(3):115–125.

Evaluating Differences in Insights from Interactive Dimensionality Reduction Visualizations Through Complexity and Vocabulary

165