Flowstrates++: An Approach to Visualize Multi-Dimensional OD Data

Nicolas Fuchs, Pierre Vanhulst, Raphaël Tuor and Denis Lalanne

Human-IST Institute, Université of Fribourg, Boulevard de Pérolles 90, Fribourg, Switzerland

Keywords:

Human-Centered Computing, Visualization Design And Techniques, Evaluation Methods.

Abstract:

Is it possible to visualize complex Origin-Destination (OD) data along with relevant spatio-temporal data?

In this paper, we tackle this issue by presenting Flowstrates++, an augmented version of Flowstrates which

aims to visualize additional time-series datasets linked with OD data. On top of Flowstrates’ heatmap, we

designed a second heatmap for spatio-temporal data, synchronized on the temporal axis, as well as other

dataset comparison features. Two versions of Flowstrates++ have been designed and implemented: Switch, which displays one external dataset at a time, and Combi (for "combined"), which displays two external datasets at the same time. We aimed to assess to what extent both variants spur users into making multidimensional findings. To achieve this goal, we evaluated both variants with ninety participants: ten were pilot users in live remote sessions, and eighty were recruited through Prolific.co, a crowd-sourcing platform. In a within-groups study, these participants were asked to make relevant annotations about the data on both variants, and to evaluate them through a survey. We then classified the annotations using a framework whose validity was evaluated with an Intercoder Agreement and Fleiss' Kappa. We found that the Combi variant yielded consistently better results, both in terms of the number of multidimensional annotations produced and in terms of participant appreciation. Yet regardless of the variant, our solution allows users to highlight potential correlations between time-series data and temporal OD data.

1 INTRODUCTION

For several decades, the domain of Origin-

Destination (OD) Data has seen the rise of various

visualization paradigms that allow one to process

and understand large ﬂows of data across time.

Researchers using these visualizations would usually

formulate hypotheses that require conﬁrmations

through external sources of knowledge: for in-

stance, one could correlate a reduction in outbound

migration from a given country with the political

response of its government after a given climatic

disaster. This study aimed to enhance the afore-

mentioned visualization paradigms, so that they

would display such information directly. But what would be the impact of this additional information on user engagement? Would it be considered too cumbersome and cluttered? And to what extent would this "augmented" visualization actually foster multi-dimensional observations?

https://orcid.org/0000-0001-5176-8579

https://orcid.org/0000-0002-5276-2459

https://orcid.org/0000-0001-7834-0417

We aim to explore these questions with this arti-

cle. The present paper comprises a literature review,

ﬁrst describing some of the many existing methods for

visualizing ﬂow data, then compiling a list of eval-

uation systems that can be used to assess, as objec-

tively as possible, the relevance of a new visualiza-

tion paradigm. The second section is dedicated to our

proposal, Flowstrates++ (Fuchs, 2022). It comprises

a presentation of its design rationale followed by the presentation of the program's interaction capabilities. The third section describes the user study that was carried out in order to validate our hypotheses. We break down this extensive section into three parts: first, we detail the environment and settings of the experiment. Then, we describe our evaluation method based on Fleiss' Kappa (Fleiss, 1971). Finally, we present our results. The last section of this paper opens up a discussion based on our results, highlighting new avenues for further improvements of Flowstrates++.

Fuchs, N., Vanhulst, P., Tuor, R. and Lalanne, D.

Flowstrates++: An Approach to Visualize Multi-Dimensional OD Data.

DOI: 10.5220/0012252300003660

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2024) - Volume 1: GRAPP, HUCAPP

and IVAPP, pages 625-636

ISBN: 978-989-758-679-8; ISSN: 2184-4321


2 LITERATURE REVIEW

2.1 Visualization Methods

Dynamic geospatial network representation is a sub-

category of network representation. Such a network

consists of nodes with a position in a given space, and

links between them. Because the network is dynamic, links and nodes evolve over time: they may or may not exist at a given point in time. Links themselves can be binary or weighted.

The visualization of dynamic geospatial network

data poses several challenges, and literature reports

that no visualization method is satisfactory to repre-

sent both the spatial and temporal dimensions (An-

drienko et al., 2017). Three main techniques al-

low one to represent dynamic geospatial network

data (Kjellin et al., 2008): 2D map projections, an-

imations, and space-time cubes. In this section, we

give an overview of related work in the area of vi-

sualization methods for dynamic geospatial network

visualization. Based on the authors’ tool descriptions

and our practical evaluation of the tools (when possible), we highlight their key features using four design space dimensions defined by Schöttler et al. (Schöttler et al., 2021):

• GEO: at which level is the geographic information

displayed explicitly?

• NET: at which level is the network information

(nodes and links) displayed explicitly?

• COMP: how are geographic and network informa-

tion visually laid out on the screen?

• INTERACT: to what extent does the visualization method require the user to interact with it in order to extract information?

On top of these key features, we also assessed the

ability of each tool to integrate time-series with dy-

namic geospatial network data, in order to evaluate

their suitability for our study.

2.1.1 2D Map Projection

Flowstrates (Boyandin et al., 2011) (Figure 4) imple-

ments the OD ﬂow map technique: it displays the

same geographical information twice, by placing ori-

gin nodes on the left map, and destination nodes on

the right map, with no distortion or loss of geographic

information (GEO: mapped, NET: explicit, COMP:

superimposed and juxtaposed, INTERACT: not re-

quired). It uses juxtaposition: a central heat map

connects the OD links and displays the evolution of each flow on the horizontal axis. This central heat map allows one to visually assess the way the links evolve over time, and avoids clutter by displaying the links in a vertical list. Interaction allows the user

to ﬁlter the data and visualize only the relevant nodes

and ﬂows.

More ﬂow types can be added in the central heat

map by splitting it into two categories, one for each flow. The advantage of this method is that the readability of each flow at a given time step remains high even with such a complex dataset.

FlowMapper.org (Koylu et al., 2022; Tobler,

1987) (Figure 1) is another example of implemen-

tation of the ﬂow map technique (GEO: mapped,

NET: explicit, COMP: superimposed, INTERACT:

required). FlowMapper.org allows users to upload

their own data as CSV ﬁles to create customized ﬂow

maps. It also supports the customization of the ﬂow

symbols, such as curved ﬂow lines, which allows

users to optimize map readability. Despite its abil-

ity to add supplementary layers to help bring context

to the ﬂow patterns (node symbols, choropleth maps,

base maps), it does not allow the user to add more than

one external dataset. Overall, this technique focuses

on providing users with elegant static and interactive

maps, but does not allow exploration of the tempo-

ral dimension as it only displays one time frame at

a time. Without edge bundling, FlowMapper is sub-

ject to visual clutter problems when displaying sev-

eral origins and destinations at the same time, and this

problem would be exacerbated when coupling addi-

tional ﬂows.

EvoFlows (Cuenca et al., 2019) (see Figure 2) jux-

taposes two complementary views: the MultiStream

view shows the evolution of inﬂows and outﬂows over

time, and a spatial view using Flow Maps displays ge-

ographic locations and directions of ﬂows for a given

time interval (GEO: mapped and abstract, NET: ex-

plicit, COMP: superimposed and juxtaposed, INTER-

ACT: required). This method is well adapted to add

more ﬂows to the temporal view. It is however sub-

ject to scalability issues (Cuenca et al., 2019), linked

to the height of the screen: each added ﬂow will result

in a reduced ability to assess its evolution over time,

as its value at each time step is encoded on a vertical

axis. Another drawback of this tool is that it does not

allow the user to add any external time-series dataset.

MapTrix (Yang et al., 2016) has been proposed as

a way to visualize many-to-many ﬂows by connect-

ing origin and destination maps with an OD matrix

(GEO: mapped and abstract, NET: explicit, COMP:

juxtaposed, INTERACT: not required). Their user

study showed the advantage of both MapTrix and OD

Maps compared to the Bundled Flow Map in lookup,

comparison and ﬂow distribution analysis tasks. The

main design difference between MapTrix and Flowstrates is that the former does not allow one to visualize the temporal evolution of flows: each matrix cell in MapTrix corresponds to a single flow, in a single time frame.

IVAPP 2024 - 15th International Conference on Information Visualization Theory and Applications

MobilityGraphs (Von Landesberger et al., 2015)

provides a way to explore time-varying ﬂow data with

a large count of time steps and OD ﬂows (GEO:

mapped, NET: explicit, COMP: superimposed, IN-

TERACT: not required). By performing spatial and

temporal clustering, it reveals movement patterns that

would be occluded in ﬂow maps. This approach ap-

pears to be well-suited for the identiﬁcation of spatial

patterns that deﬁne the underlying spatial structure of

mobility.

DOSA (van den Elzen and van Wijk, 2014) (GEO: mapped and abstract, NET: explicit, COMP: superimposed and juxtaposed, INTERACT: not required) is a system for analyzing the structure and multiple variables of a network (DOSA stands for "from Detail to Overview via Selections and Aggregations"). The user can create a selection of interest using manual interactions or an automatic filter, and DOSA produces a high-level map intended mainly for non-expert users. To support their study, the authors presented several real-world datasets to demonstrate the effectiveness of their method. Evaluations with actual users were not reported. Their solution demonstrates how multiple variables can be integrated along with flow data. These variables can be expressed in the form of external datasets. The authors report that this system is subject to clutter when performing several selections.

Figure 1: FlowMapper, an implementation of the ﬂow map

technique by Tobler (Tobler, 1987). The display of several

OD ﬂows generates heavy clutter.

Figure 2: Screenshot of EvoFlows, a combination of a tem-

poral and a spatial view to display refugee ﬂows (Cuenca

et al., 2019).

2.1.2 Space-Time Cube

This method (Kapler and Wright, 2005), which can

be seen in Figure 3, makes use of the third dimen-

sion to represent the evolution of a given attribute

over time (GEO: mapped, NET: explicit, COMP: su-

perimposed, INTERACT: required). The ﬁrst issue

with this method lies in its composition and the re-

quirement for the user to interact: the superimposition

of both time and geographic information implies that

both nodes and OD links are visually overlaid, gen-

erating visual clutter. This forces the user to interact

in order to get a clearer overview of the full dataset.

The second issue lies in its three-dimensionality: projecting three dimensions on a 2D display creates ambiguity in the perception of distances and slopes, thus altering or preventing the gathering of insights. Adding an extra flow to this type of visualization is likely to cause even more clutter, as the flows are superimposed on the geography.

Figure 3: Implementation by Eccles et al. (Eccles et al.,

2007) of the GeoTime space-time cube designed by Kapler

and Wright (Kapler and Wright, 2005). Multiple paths are

displayed to show the movements of individuals over time.

2.1.3 Map Animation

This type of visualization method involves the an-

imation of a map over time to reﬂect changes

(GEO: mapped, NET: explicit, COMP: composi-

tion, INTERACT: required). Animated choropleth

maps (Fish et al., 2011) are an example of its implementation. Ease of use heavily depends on the users' susceptibility to change blindness, which limits their ability to get an overview of the evolution of OD links at multiple locations over time. Peña-Araya et al. (Peña-Araya et al., 2020) focused on propagation visualization and compared the effectiveness of an animated map, small multiple maps, and a single map with glyphs for different types of tasks. They found that animated maps do not perform better than the two alternatives for the comparison of consecutive time steps, but outperform them on propagation direction tasks. For searches in large time intervals and the detection of peaks over the whole time interval, animated maps performed worst. This finding is further supported by Boyandin et al. (Boyandin et al., 2012).

In summary, our literature review has revealed that a variety of tools allow one to visualize dynamic network and multivariate data. However, none of these systems allows the analysis of multidimensional time-evolving networks coupled with additional geospatial temporal data, while avoiding visual clutter and overlap. These features are the foundation of the design of Flowstrates++. We decided to use a 2D projection, and to display geographic and network information explicitly (GEO and NET: explicit) for the OD data to decrease the user's cognitive load. We also chose to juxtapose external temporal datasets in a dedicated heat map at the center of the display to avoid clutter problems (COMP: superimposed and juxtaposed). Reviewed work (Boyandin et al., 2012; Peña-Araya et al., 2020) indicates that animated maps are not optimal for analysis tasks in large time intervals, and require the user to interact with the tool in order to explore the data. Since we wanted to foster the creation of time-related insights involving multiple datasets, we chose to use a static map, and to display the temporal axis on a horizontal scale in the central heat map (INTERACT: not required).
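The design-space tags quoted throughout this section can be gathered into a small lookup structure, which makes it easier to compare the reviewed tools. The following is a minimal sketch only: the class name `DesignSpace` and its field names are illustrative assumptions, and the values are transcribed from the tags given in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignSpace:
    """One reviewed technique, tagged along the four dimensions (hypothetical structure)."""
    tool: str
    geo: str
    net: str
    comp: str
    interaction_required: bool


# Values transcribed from the tags quoted in the literature review.
REVIEWED = [
    DesignSpace("Flowstrates", "mapped", "explicit", "superimposed and juxtaposed", False),
    DesignSpace("FlowMapper.org", "mapped", "explicit", "superimposed", True),
    DesignSpace("EvoFlows", "mapped and abstract", "explicit", "superimposed and juxtaposed", True),
    DesignSpace("MapTrix", "mapped and abstract", "explicit", "juxtaposed", False),
    DesignSpace("MobilityGraphs", "mapped", "explicit", "superimposed", False),
    DesignSpace("DOSA", "mapped and abstract", "explicit", "superimposed and juxtaposed", False),
    DesignSpace("Space-time cube", "mapped", "explicit", "superimposed", True),
    DesignSpace("Map animation", "mapped", "explicit", "composition", True),
]

# Techniques that can be read without interaction.
no_interaction = [d.tool for d in REVIEWED if not d.interaction_required]

# Techniques whose composition includes juxtaposition.
juxtaposed = [d.tool for d in REVIEWED if "juxtaposed" in d.comp]
```

Such a table only restates the tags; it does not capture the qualitative remarks (clutter, scalability, support for external time series) made for each tool.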

2.2 Evaluation Methods

Scientiﬁc literature provides an abundance of papers

describing how necessary, yet how difﬁcult it is to

assess a visualization system thoroughly (Isenberg

et al., 2013; Lam et al., 2012). For this study, we se-

lected a few papers that drove our initial research. Al-

though it focuses mostly on software visualizations,

the study of Merino et al. (Merino et al., 2018) pro-

vides a state-of-the-art overview of existing evalua-

tion methods, and claims that more than half of the

papers reviewed in the domain lack thorough evaluations. The authors further provide a comprehensive definition of evaluation strategies, be they theoretical (as evidenced by Munzner's work (Munzner, 2014)) or empirical, relying on different strategies to gather, then statistically analyze data. The study of

Merino et al. provided the initial basis of our eval-

uation protocol, described in section 4. Of the two

dependent variables used to assess a visualization sys-

tem, user performance is divided into two categories,

one being the time needed to produce an annotation

and the other being the correctness of the observation.

While interesting, these criteria can hardly be gener-

alized to many cases including ours, as they require

answers whose validity can be objectively demon-

strated. As an exploratory information visualization,

Flowstrates++ does not aim to foster insights that ﬁt

into this category.

To overcome this limitation, we searched for ways

to qualify the observations fostered by Flowstrates++

without relying on their perceived correctness. Two

prior research projects (Boyandin, 2013; Vanhulst

et al., 2019) allowed us to build the core of our evaluation protocol. Boyandin (Boyandin, 2013) qualified 285 annotations produced on the original Flowstrates by 16 users, using 4 dimensions: geospatial scope, temporal scope, validity, and reasoning. The

last two dimensions were binary, limiting as much as

possible the risk of disagreement. As most annota-

tions provided by participants were trivial, the validity

turned out to be easy to assess. Vanhulst et al. (Van-

hulst et al., 2019) qualiﬁed the types of 302 annota-

tions produced by 16 participants on 4 visualizations,

and provided a classiﬁcation framework, validated by

an Intercoder Agreement and a Fleiss’ kappa. This

classiﬁcation framework is meant to be as general as

possible, but the study only proposes toy examples.

In further studies, the authors highlighted how difﬁ-

cult it is to assess annotations with multiple obser-

vations (Vanhulst et al., 2019), and proposed Colvis,

an interface to classify them automatically (Vanhulst

et al., 2021). Our evaluation protocol is thus rooted in

the original Flowstrates’ evaluation protocol, and was

enriched by the research led on Colvis.

3 DESIGN RATIONALE

Figure 4: The original Java program, Flowstrates (Boyandin

et al., 2011).


3.1 The Concept of Flowstrates

Flowstrates tackles the challenge of visualizing temporal origin-destination data. It differs from a standard directional flow map in that its goal is to display flow magnitudes over time. This aim leads to a visualization where origins are located on one map (left-hand side) and destinations on a separate map (right-hand side) (see Figure 4). The flow magnitudes are encoded in a heatmap located between the two maps. Each row represents an origin-destination pair, and each column represents a timestamp. Heatmap rows are then linked to both maps (origin and destination) using straight non-directional colored lines. This allows users to analyse evolution over time without resorting to animation. The geographic positions of origin and destination entities are preserved, but the distance between them is not, because the heatmap separates the two maps.

As datasets can have a huge number of data entries, it is essential for the system to be equipped with interaction capabilities. Each map can be navigated individually via zoom and pan. Geographical entities can be selected separately through direct selection, in combination with an optional key for aggregation mode. A lasso mode is also available to select several geographical entities by drawing a freehand line around areas of interest. It can also be used in combination with aggregation mode. When the selected origins or destinations are updated, the heatmap is automatically synchronized. Heatmap rows can also be sorted and/or aggregated using option buttons. The difference option allows the user to display the relative difference of magnitude between consecutive time values. In this case, it is easier to see increasing (red color) and decreasing (blue color) tendencies.
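To make the heatmap layout and the difference option more concrete, the sketch below builds a matrix with one row per origin-destination pair and one column per time step, then derives the change between consecutive columns. It is an illustration under stated assumptions, not the Flowstrates implementation: the function names, the sample records, the choice to treat missing values as 0, and the formula (current - previous) / previous are all assumptions, since the text does not specify how the relative difference is computed.

```python
from collections import defaultdict


def build_heatmap(flows):
    """Return (OD pairs, sorted years, matrix) from (origin, destination, year, magnitude) records."""
    years = sorted({year for _, _, year, _ in flows})
    column = {year: i for i, year in enumerate(years)}
    rows = defaultdict(lambda: [0.0] * len(years))  # assumption: missing values count as 0
    for origin, destination, year, magnitude in flows:
        rows[(origin, destination)][column[year]] += magnitude
    pairs = sorted(rows)
    return pairs, years, [rows[pair] for pair in pairs]


def relative_difference(row):
    """Relative change between consecutive time steps; None when the previous value is 0."""
    result = []
    for previous, current in zip(row, row[1:]):
        result.append(None if previous == 0 else (current - previous) / previous)
    return result


# Hypothetical records: (origin, destination, year, magnitude).
flows = [
    ("A", "B", 2000, 100.0),
    ("A", "B", 2001, 150.0),
    ("A", "B", 2002, 75.0),
    ("C", "B", 2000, 40.0),
    ("C", "B", 2002, 60.0),
]

pairs, years, matrix = build_heatmap(flows)
diffs = [relative_difference(row) for row in matrix]
```

In a rendering step, positive entries of `diffs` would map to the red scale and negative entries to the blue scale described above.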

3.2 Flowstrates++

Flowstrates++ has essentially the same interaction capabilities as Flowstrates and has been developed with web technologies. However, an extra heatmap is displayed above the existing central flow-magnitude heatmap. It allows one or two other datasets (geographical temporal data) to be displayed, so that users can make multi-dataset observations. A juxtaposition heatmap (see Figure 7) has been inserted between the top and bottom central heatmaps so that two rows from different datasets can be compared directly. Graphs in the central area can also be panned and zoomed independently on the x and y axes. The bottom and top graphs are synchronized. The colored lines linking the spatial entities to their magnitudes are only available for the OD data, not for the external datasets, in order to limit visual clutter.

We ensured that the system would work with any

arbitrary objects whose position is defined by geographical coordinates. These objects are represented by regular geometric marks (e.g. circles or rectangles). For example, Figure 6 shows a fictional use case where the arbitrary objects are Swiss train stations.

3.3 Design Discussion

Several questions arose while analyzing Flowstrates: how could it be enhanced so that it retains the same capabilities to foster insights, while optimizing its space usage so as to add external spatio-temporal data? How could it be modified to become a multi-dimensional data visualization tool? As mentioned in the introduction, the ability to directly compare two or more datasets (one comprising OD data and several others consisting of spatio-temporal data) would offer a clear advantage to analysts. To maximize Flowstrates' space usage, we chose to keep its concept of "data in the middle" and decided to split this middle part into two: the new, upper part displays the external data and the bottom part displays the flow data. To allow comparisons, both parts are synchronized on the temporal axis. While we considered trying alternatives to heatmaps for either part (such as a trellis area chart), we decided against it: distinguishing the impact of a new visualization paradigm for the middle part of Flowstrates is outside the scope of this study.
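Synchronizing the two heatmaps on the temporal axis amounts to giving both the same set of columns. One possible way to do this is sketched below, assuming each heatmap row is stored as a mapping from year to value; the function names and the sample values are hypothetical, and gaps are simply marked as missing rather than interpolated.

```python
def shared_time_axis(*series):
    """Sorted union of all time stamps, so that every heatmap uses the same columns."""
    return sorted(set().union(*(s.keys() for s in series)))


def align(series, axis, missing=None):
    """Project one row onto the shared axis, marking absent time steps as missing."""
    return [series.get(t, missing) for t in axis]


# Hypothetical rows: one OD flow (bottom heatmap) and one external series (top heatmap).
refugee_flow = {1990: 1200, 1991: 1500, 1993: 900}
temperature = {1991: 14.2, 1992: 14.4, 1993: 14.9}

axis = shared_time_axis(refugee_flow, temperature)
top_row = align(temperature, axis)
bottom_row = align(refugee_flow, axis)
```

With both rows indexed by the same `axis`, a given column refers to the same year in the upper and lower heatmaps, which is what makes a vertical visual comparison meaningful.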

Another requirement was to support two external

datasets, rather than just one. This led us to consider

two different interfaces: the Switch version displays

only one external dataset at a time, requiring the user

to manually switch between the available datasets.

Conversely, the Combi version displays two external

datasets simultaneously on the same graph. We chose

to display both datasets intertwined rather than in two

separate heatmaps to keep them as dense as possible.

With these two variants of the interface (see Figure 5),

we formulated the following hypotheses.

• H0: The Combi version fosters more multi-dataset findings, as users can see both spatio-temporal datasets at once.

• H1: The Combi version is harder to apprehend due to visual clutter.

• H2: Both versions foster significant insights that involve two or three datasets.

These hypotheses were tested in a pilot study (Fuchs, 2022) followed by our experiment.


Figure 5: Flowstrates++ versions. (a) Switch version, (b) Combi version.

4 USER STUDY

In relation to H0, we aimed to assess the affordance of both variants (Combi and Switch), that is, in our case, how easily they spur users into analyzing multiple dimensions of the visualization. We evaluated H1 based on the subjective appreciation of the participants through a qualitative questionnaire, as well as the analysis of the annotations that they produced. This analysis also provides statistics on how many annotations refer to several datasets, which allows us to test H2. These aims informed our decision to use a short controlled experiment with a wide range of unguided beginner users, as opposed to a longitudinal study involving a limited set of participants benefiting from a strong learning effect.

4.1 Environment and Settings

Our protocol required participants to make relevant

observations about the data, as if they were data scientists working on the datasets. We purposely gave no example of annotations, so as to avoid any kind of influence. Every user was given ten minutes on each version, before finishing with a qualitative questionnaire unrestricted in time, comprising binary choice questions, Likert-scale questions and a free comment section. A summary of the study setting is presented in Table 1, while the qualitative questionnaire is presented in Table 2. We used a within-group setting and counterbalanced the order of the versions to mitigate any learning effect: half of the users started with the Switch version and ended with the Combi version, whereas the other half completed the study the other way around. The terms "Step 1" and "Step 2" found in the legends of the graphs in subsection 4.4 refer to this order of versions.
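A counterbalanced assignment of this kind can be sketched as follows. This is an illustration only: the text states that half of the users started with each version, but not how participants were allocated, so the seeded random shuffle and the function name `assign_orders` are assumptions.

```python
import random


def assign_orders(participant_ids, seed=0):
    """Split participants into two equal groups with opposite version orders.

    Assumption: allocation is done by a seeded random shuffle.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for index, participant in enumerate(ids):
        # First half: Switch then Combi (Step 1, Step 2); second half: the reverse.
        orders[participant] = ("Switch", "Combi") if index < half else ("Combi", "Switch")
    return orders


orders = assign_orders(range(80))
switch_first = sum(1 for order in orders.values() if order[0] == "Switch")
combi_first = len(orders) - switch_first
```

With eighty participants, this yields forty users per order, so that any learning effect between Step 1 and Step 2 affects both versions equally.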

The protocol was reﬁned through a pilot study

with ten graduate students in computer science in a

remote setting. Eighty paid participants then took

part in our experiment on Proliﬁc, a crowd-sourcing

platform. They were native English speakers with a

level of education of at least High School or techni-

cal/community college.

4.2 Classiﬁcation Framework

As mentioned in subsection 2.2, we built a classification framework on top of the works of Boyandin et al. (Boyandin et al., 2011) and of Vanhulst et al. (Vanhulst et al., 2019). Our aim was to keep it as simple as possible, so as to maximize Intercoder Agreement, while making the richest statements possible about the annotations. We used a four-dimensional classification framework, whose dimensions and possible values are described in Table 3. Examples for each value by dimension can be found in Table 4, Table 5 and Table 6, with the exception of the "datasets" dimension whose values are self-explanatory (R = Refugee, T = Temperature, W = War deaths, and other values are combinations of two or all of these values).

Figure 6: Use case of Flowstrates++ for the representation of train stations flows.

Figure 7: Interface of Flowstrates++ displaying the juxtaposition graph.

Table 1: Study characteristics.

Part     Group 1                     Group 2                     Time limit
Part 1   Version Switch              Version Combi               10 minutes
Part 2   Version Combi               Version Switch              10 minutes
Part 3   Qualitative questionnaire   Qualitative questionnaire   no limit

Table 2: Qualitative questionnaire.

Question

Version Switch or Combi
  Which version is the most intuitive?
  Which version did you find the most interesting findings with?
  Which version is easier to work with?

Version Switch and Combi (1 = very bad, 5 = very good)
  Useful to discover large-grained findings (e.g. general tendencies)
  Useful to discover fine-grained findings (e.g. detailed observations)
  Useful to compare between datasets

Table 3: Dimensions and their values.

Dimension        Values
Interpretation   visual | data | meaning/correlation
Spatial          country | region | country-country | country-region | region-region | global
Temporal         one year | year-year | until/since | interval | all time
Datasets         R | T | W | R+T | R+W | T+W | R+T+W
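The four dimensions of Table 3 can be represented as a small validated record, which is convenient when several coders classify the same annotations. The sketch below is illustrative: the class name `Classification`, its field names and the two sample classifications are assumptions, and only the values listed in Table 3 are accepted.

```python
from dataclasses import dataclass

# Allowed values, transcribed from Table 3.
INTERPRETATION = {"visual", "data", "meaning/correlation"}
SPATIAL = {"country", "region", "country-country", "country-region", "region-region", "global"}
TEMPORAL = {"one year", "year-year", "until/since", "interval", "all time"}
DATASETS = {"R", "T", "W"}  # Refugee, Temperature, War deaths


@dataclass(frozen=True)
class Classification:
    """Classification of one annotation along the four dimensions (hypothetical structure)."""
    interpretation: str
    spatial: str
    temporal: str
    datasets: frozenset

    def __post_init__(self):
        if self.interpretation not in INTERPRETATION:
            raise ValueError(f"unknown interpretation: {self.interpretation}")
        if self.spatial not in SPATIAL:
            raise ValueError(f"unknown spatial value: {self.spatial}")
        if self.temporal not in TEMPORAL:
            raise ValueError(f"unknown temporal value: {self.temporal}")
        if not self.datasets or not self.datasets <= DATASETS:
            raise ValueError(f"unknown dataset combination: {set(self.datasets)}")

    @property
    def is_multi_dataset(self):
        """True when the annotation refers to two or three datasets."""
        return len(self.datasets) > 1


# Two made-up classifications, for illustration only.
single = Classification("data", "country-country", "one year", frozenset({"R"}))
combined = Classification("meaning/correlation", "region", "interval", frozenset({"R", "W"}))
```

Storing the datasets as a set rather than as a label such as "R+W" makes the multi-dataset check a simple size test.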

Table 4: Interpretation dimension.

Value                 Example
Visual                For the flows originating from USA, there is much more red colors towards the last decade.
Data                  Between 1990 and 1998, there is a high peak of migrations originating from Russia going from 117736 and 172724 refugees.
Meaning/correlation   Since 2016, there is a very low number of refugees coming from USA. It may be explained by the presidential election.

Table 5: Spatial dimension.

Value             Example
country           There is a lot of war deaths in Brazil in 1985.
region            Europe is getting hotter each year.
country-country   There is a peak in refugees from USA to France in 2003.
country-region    Refugees coming from Canada migrate mainly to Spain, France and Portugal.
country-global    Switzerland is the preferred destination for refugees.
region-region     North American refugees don't migrate much to Asia.
region-global     The top migration destinations are in western Europe.
global            Temperatures are rising all around the world.

Table 6: Temporal dimension.

Value         Example
one year      In 1968 there is a huge negative peak of refugees coming from China.
year-year     Concerning the flows of refugees from Venezuela to UK, the years 1992 and 2004 are quite similar.
until/since   In the temperatures dataset, we can observe a serious increase during the last decade.
interval      In Africa from 1990 to 2000 there are few registered war deaths.
all time      We can see that as time passes, there are more and more refugees.

4.3 Evaluation of the Classification Framework

Three coders, among the authors of this paper, used the classification framework to qualify the annotations without being influenced by the others. Once the classification process was done, the results of the three coders were compared, and all disagreements were discussed. On top of the calculation of an Intercoder Agreement, we also decided to further reinforce our results by using a Kappa statistic, so as to take chance agreement into account. Cohen's Kappa and Scott's Pi being limited to two coders, we relied on Fleiss' Kappa to this end. Note that while our dimensions can be considered ordinal, as there is a progressive increase in the scale of their values, we did not deem it necessary to use Kendall's tau in our approach: mistaking a "country" for a "region" in the geospatial dimension is not necessarily more erroneous than mistaking it for a "country-country".
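For readers unfamiliar with the two measures, the sketch below computes a percent agreement and Fleiss' Kappa (Fleiss, 1971) for one dimension. It is a minimal illustration on made-up ratings: the function names and the sample data are hypothetical, and defining percent agreement as the share of annotations on which all coders agree is an assumption about how the Intercoder Agreement was computed.

```python
from collections import Counter


def percent_agreement(ratings):
    """Share of items on which all coders chose the same value (assumed definition)."""
    return sum(1 for item in ratings if len(set(item)) == 1) / len(ratings)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each holding one label per coder.

    Every item must be rated by the same number of coders (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    observed = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Proportion of agreeing coder pairs for this item.
        observed += (sum(c * c for c in counts.values()) - n_raters) / (n_raters * (n_raters - 1))
    observed /= n_items
    # Agreement expected by chance, from the overall share of each value.
    expected = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (observed - expected) / (1 - expected)


# Made-up temporal classifications of four annotations by three coders.
ratings = [
    ["one year", "one year", "one year"],
    ["interval", "interval", "interval"],
    ["interval", "interval", "all time"],
    ["all time", "all time", "all time"],
]

agreement = percent_agreement(ratings)
kappa = fleiss_kappa(ratings)
```

On this toy sample, the coders fully agree on three of the four annotations, while the Kappa is lower because part of that agreement would be expected by chance.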

Disagreements were of various kinds. Some turned out to be simple misreadings, in which case they were directly corrected. Others were due to a lack of domain knowledge on the part of the coders: a few dozen annotations mention the start and end dates of the datasets, for instance, and could thus be classified as either "all time" or "interval" depending on the interpretation of the coder. These were also agreed upon and corrected directly, as they do not question the classification framework itself. Some disagreements, however, proved to be more fundamental. In these cases, the coders considered the disagreement as "real" and reported it, although they also agreed on a corrected value to derive the statistics for the study's results presented in subsection 4.4. At the end of the process, we obtained both percentage agreements and kappas for each classification dimension across all findings.

Table 7: Intercoder Agreement.

Dimension   Interpretation   Spatial   Temporal   Datasets   Full
Value       97.22%           94.91%    93.21%     99.23%     77.67%

Table 8: Fleiss' Kappa.

Dimension   Interpretation   Spatial   Temporal   Datasets   Full
Fleiss κ    85.89%           93.22%    90.92%     97.66%     71.10%

When comparing the Intercoder Agreement and Fleiss' Kappa results in Table 7 and Table 8, we observe that all scores are above 70%. For both statistical methods, the most agreed-upon dimension is Datasets. This is an expected result, as the values of this dimension are made of binary components: a finding either mentions a given dataset or it does not. The difference in ranking between the three remaining dimensions is due to the fact that Fleiss' Kappa takes the number of classifiers as well as the number of dimension values into account. The "Full dimension" of the Intercoder Agreement is computed as follows: if a disagreement is found in any of the four dimensions, the "Full dimension" is considered a disagreement. In contrast, the "Full dimension" of Fleiss' Kappa is computed by multiplying the scores of the four dimensions. The most time-consuming dimensions to classify were the Spatial and Temporal ones, although the Interpretation dimension suffers from a larger discrepancy between the percent agreement and the Kappa. The most critical disagreements are discussed in Section 5.
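The two ways of computing a "Full dimension" score can be illustrated with a minimal sketch. It assumes a hypothetical data layout in which each coder's classification of a finding is a tuple of four labels, and it uses toy labels and three hypothetical coders rather than the study data. The function names are illustrative, and `fleiss_kappa` follows the standard definition (Fleiss, 1971). The last lines only multiply the four per-dimension kappas reported in Table 8, which gives about 0.711 and matches the reported Full score up to rounding.

```python
from collections import Counter
from math import prod


def percent_agreement(ratings):
    """Share of items for which every coder chose the same label."""
    return sum(len(set(labels)) == 1 for labels in ratings) / len(ratings)


def fleiss_kappa(ratings):
    """Fleiss' kappa (Fleiss, 1971) for items rated by the same number of coders.

    `ratings` holds one list of labels per item, one label per coder.
    Undefined (division by zero) when every label falls in a single category.
    """
    n_items, n_raters = len(ratings), len(ratings[0])
    counts = [Counter(labels) for labels in ratings]
    # Observed agreement: mean share of agreeing coder pairs per item.
    p_obs = sum(
        (sum(c * c for c in item.values()) - n_raters) / (n_raters * (n_raters - 1))
        for item in counts
    ) / n_items
    # Chance agreement: sum of squared overall category proportions.
    totals = sum(counts, Counter())
    p_exp = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (p_obs - p_exp) / (1 - p_exp)


def dimension(findings, index):
    """Extract the labels of one classification dimension for every finding."""
    return [[coder[index] for coder in coders] for coders in findings]


def full_agreement(findings):
    """A finding agrees on the "Full dimension" only if all four labels match."""
    return sum(len(set(coders)) == 1 for coders in findings) / len(findings)


# Toy data: 4 findings x 3 hypothetical coders, each label tuple being
# (interpretation, spatial, temporal, datasets). Not taken from the study.
findings = [
    [("data", "country", "year", "R")] * 3,
    [("data", "country", "interval", "R"),
     ("data", "country", "all time", "R"),
     ("data", "country", "interval", "R")],
    [("meaning/correlation", "global", "year", "T,W"),
     ("data", "global", "year", "T,W"),
     ("data", "global", "year", "T,W")],
    [("data", "country-country", "year", "R")] * 3,
]

toy_full = full_agreement(findings)                       # 2 of 4 findings agree
toy_temporal = percent_agreement(dimension(findings, 2))  # 3 of 4 findings agree
toy_temporal_kappa = fleiss_kappa(dimension(findings, 2))

# "Full" kappa as the product of the per-dimension kappas reported in Table 8.
table8_kappas = [0.8589, 0.9322, 0.9092, 0.9766]
full_kappa = prod(table8_kappas)
print(f"{toy_full:.2f} {toy_temporal:.2f} {toy_temporal_kappa:.2f} {full_kappa:.4f}")
```

Unlike the kappa product, the "Full" percent agreement cannot be derived from the per-dimension percentages alone: it requires the per-finding labels, as `full_agreement` does in the sketch.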

4.4 Study Results

The participants of the main study produced a total of 647 annotations. Figure 8 shows that the majority of users tend to make observations concerning a single dataset. Among these, the same order of datasets is preserved between the Combi and Switch versions: observations are most often made on refugees, then on war deaths, and finally on temperatures. There are consistently more multi-dataset observations for the Combi version than for the Switch version. The difference is largest for the "T, W" class. This result was expected, as the Combi version displays both datasets on the same graph.

Figure 8: Datasets involved in the findings.

Figure 9: Total number of findings made during the experiment, grouped by temporal dimension.

Figure 9 details the temporal dimension of the findings. It shows that most findings relate to a specific year. The second most used temporal value is "all time". The last three values differ little from one another. The Combi and Switch versions follow a similar pattern.

Figure 10 presents the spatial dimension of the observations. The Switch version seems more likely to draw attention to OD spatial values (country-country, country-global, etc.), whereas the Combi version seems to encourage more findings on single spatial values (country, global, and region). We surmise that the combined dimensions of the upper matrix of the Combi version attract more attention. Since the upper matrix does not display OD data, it seems logical that the users focused on single spatial values instead.

Figure 10: Total number of findings made during the experiment, grouped by spatial dimension.

Figure 11: Total number of findings made during the experiment, grouped by level of interpretation.

Figure 11 represents the level of interpretation. The "Visual" category is almost empty, which is what we expected, since users were asked to make observations about the data. Concerning the two other categories, there is a large difference between "data" and "meaning/correlation", which was also expected. Regarding the latter category, the Combi version seems better suited to more elaborate observations, either with a given context explaining the data or with a correlation between several datasets. The figure also shows that the Combi version fostered over three times more "meaning/correlation" observations: users tended to bridge the datasets and make sense of the data more easily when seeing both at once, despite potential visual clutter.

Figure 12 represents the number of findings that involve several datasets and have the value "meaning/correlation" as interpretation. First, as could be expected, there are far more findings involving two datasets than three. There is also a large difference between the Switch and Combi versions, independently of the step order. The difference is smaller for the findings involving three datasets.

Figure 12: Total number of findings involving more than one dataset made during the experiment.

Figure 13: Results for the first three questions of the qualitative questionnaire (Switch or Combi version):
Q1 - Which version is the most intuitive?
Q2 - Which version did you find the most interesting findings with?
Q3 - Which version is easier to work with?

Figure 13 displays the number of votes for the first three questions, which are binary choices between the Switch and Combi versions. We observe that the Combi version is clearly the preferred choice for all three questions. It should be noted that there is close to no evidence that users simply favored the version they used in the second step of their experiment. In that sense, users did not seem to be affected by a supposed learning effect.

[Figure 14 plot: ratings (1-5) for Q4, Q5 and Q6 per version (Switch, Combi), in two panels: Order Switch->Combi and Order Combi->Switch.]

Figure 14: Results for the last three questions of the qualitative questionnaire (1-5 Likert scales):
Q4 - Useful to discover large-grained findings (e.g. general tendencies).
Q5 - Useful to discover fine-grained findings (e.g. detailed observations).
Q6 - Useful to compare between datasets.

Figure 14 displays the median ratings for the last three questions, which use Likert scales from one to five. We can observe that participants who followed the Combi-to-Switch order rated the whole experiment noticeably higher, including the second part with the Switch version, whereas those who followed the Switch-to-Combi order had a more mixed appreciation. Overall, however, we see that the Combi version is just as popular as the Switch version, which came as a surprise. We expected a much bigger difference for the very last question in favor of the Switch version, thanks to it being less visually cluttered.

Figure 15: Flowstrates++ learning curve

Figure 15 shows the number of findings according to the step order and the version of the program. First, we observe that the total numbers of findings in step 1 and step 2 are very similar. If we sum up the number of findings according to the order of versions, we obtain 329 findings for Combi -> Switch and 318 for Switch -> Combi. The progression, or learning curve, is therefore similar in both orders.

5 DISCUSSION

Flowstrates++ currently features up to two external datasets, mostly because of the Combi version, where both datasets are displayed at the same time. Since our results seem to indicate that the Combi version is markedly more appreciated, further studies should investigate how more than two datasets could be displayed simultaneously without confusing the users. In this regard, edge bundling (Bourqui et al., 2016; Phan et al., 2005) could be a useful addition to Flowstrates++, in order to reduce the edge clutter that appears when a large number of nodes are selected. Using bigger bins (e.g. grouping by decade) for the matrices could also alleviate the visual load on the users, although our results show that the most cluttered version of our interface (Combi) was preferred over Switch.
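The decade-binning idea mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the function name, the `{year: value}` layout for one matrix row, and the default choice of summing values within a bin are hypothetical and not part of Flowstrates++. A sum suits counts such as refugee flows, whereas a mean would be more appropriate for temperatures.

```python
from collections import defaultdict
from statistics import mean


def bin_by_decade(yearly_values, aggregate=sum):
    """Group a {year: value} mapping into {decade_start: aggregated value}."""
    bins = defaultdict(list)
    for year, value in yearly_values.items():
        # 1988 -> 1980, 1995 -> 1990, and so on.
        bins[year // 10 * 10].append(value)
    return {decade: aggregate(values) for decade, values in sorted(bins.items())}


# Toy heatmap row (made-up numbers, not taken from the study's datasets).
row = {1988: 10, 1989: 20, 1990: 5, 1995: 15, 2001: 7}

summed = bin_by_decade(row)                    # counts: add values per decade
averaged = bin_by_decade(row, aggregate=mean)  # measurements: average per decade
print(summed, averaged)
```

With such bins, each row of the matrix would hold one cell per decade instead of one per year, at the cost of hiding year-level findings, which were the most frequent ones in Figure 9.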

With regard to our hypotheses, H0 is clearly supported, as can be seen in Figure 12. Displaying both spatio-temporal datasets at the same time allowed users to reflect on both at once, thus resulting in a fairly high number of "T,W" annotations, as seen in Figure 8. Similarly, "R,T,W" annotations were also notably more numerous with the Combi version, as all three datasets could be analyzed simultaneously. However, the reasons why users also provided more "R,T" and "R,W" annotations are yet to be explored in further studies. H1, in contrast, was contradicted by our results, as participants show no evidence of preferring the Switch version over the Combi version. Figure 13 and Figure 14 instead show a clear preference for the Combi version. Further qualitative evaluations could explain this surprising result.

H2 was the main point of the design of Flowstrates++, and our study shows that approximately 10% of the annotations that our participants provided (63 out of 647) involved two or more datasets. Keeping in mind that the participants were purposely not required to make any multi-dataset observations, this score indicates that the interface still manages to foster insights that leverage more than a single dataset. We also believe that this score might change in further studies involving longer tasks and field experts.
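For reference, the rounded share quoted above follows directly from the two counts reported in this section:

```latex
\frac{63}{647} \approx 0.0974 \approx 9.7\%
```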

During the evaluation of Flowstrates++, we focused on gathering as many annotations as possible, and thus enlisted a much larger group of participants than most similar studies (Merino et al., 2018). These participants were neither data visualization experts nor users of Flowstrates++ or equivalent solutions, and this might have impacted the nature of the annotations they captured. Moreover, the limited amount of time spent on each version of Flowstrates++ could have had a similar impact, although our data shows no significant learning effect. Finally, our study conducted on Prolific yielded a very high return rate (69.69%), signaling that the tasks that we submitted were probably too unusual and time-consuming compared to the average tasks proposed on this platform. One lesson of this study is thus that crowd-sourcing platforms are not particularly well suited to hosting complex analysis tasks like ours. We would therefore need to conduct studies with actual domain experts to assess the benefits of Flowstrates++ more efficiently (Yalçın et al., 2018).

The classification framework that we used also raised several challenges. While it worked well for the original Flowstrates, the four-dimension classification framework lacked the necessary abstraction to handle datasets of different natures. Notably, the values of the geospatial dimension would have different meanings depending on whether the mentioned datasets were only of a spatio-temporal nature, or whether they included the OD flow. A "country-country" value in a spatio-temporal dataset would mean "a comparison between two countries" or, simply put, a comparison between two single "data units". In contrast, the same value on the OD dataset would rather indicate a flow between two countries, and flows are the "single data units" of that dataset. The two patterns are contradictory: in the first case, the user actively compares two units, while in the second the user simply qualifies a single unit. We argue that our evaluation approach would work best with a more abstract classification framework, similar to that of Vanhulst et al. (2019), despite its high level of complexity. Another recurrent limitation of analyzing annotations is the presence of annotations containing multiple observations. Our attempt at keeping the classification framework as simple as possible backfired when we chose not to split annotations into several observations before classifying them. The problem remains to decide objectively when an annotation should be split.

6 CONCLUSION

Our work is based on an existing application called Flowstrates, which presented a novel technique to display temporal OD data. We augmented Flowstrates by adding up to two external spatio-temporal datasets. This enables analysts to find potential correlations between datasets, something that was not possible with the original Flowstrates. We designed and implemented the program with web technologies, making it easier to deploy and to reach a larger target audience. We came up with two versions of the program: one where the user has to manually switch between external datasets (Switch) and one where both datasets are displayed on the same graph (Combi). We first conducted a pilot study with ten students. To reinforce our results, we then extended that study to eighty users, recruited via a crowd-sourcing platform. This latter study asked participants to take unguided annotations, which were recorded and analyzed according to a classification framework built on top of prior studies.

Our results show that the Combi version performed better both in terms of annotation production and in terms of satisfaction, confirming H0 while invalidating H1. Regarding H2, our non-expert users managed to produce annotations of which approximately 10% referred to more than a single dataset. With this study, we designed, implemented and evaluated a novel visualization system to compare complex temporal OD data and arbitrary spatio-temporal datasets.

REFERENCES

Andrienko, G., Andrienko, N., Chen, W., Maciejewski, R.,

and Zhao, Y. (2017). Visual analytics of mobility and

transportation: State of the art and further research

directions. IEEE Transactions on Intelligent Trans-

portation Systems, 18(8):2232–2249.

Bourqui, R., Ienco, D., Sallaberry, A., and Poncelet, P.

(2016). Multilayer graph edge bundling. In 2016

IEEE Paciﬁc Visualization Symposium (PaciﬁcVis),

pages 184–188. IEEE.

Boyandin, I. (2013). Visualization of temporal origin-

destination data.

Boyandin, I., Bertini, E., Bak, P., and Lalanne, D. (2011).

Flowstrates: An approach for visual exploration of

temporal origin-destination data. Comput. Graph. Fo-

rum, 30:971–980.

Boyandin, I., Bertini, E., and Lalanne, D. (2012). A qual-

itative study on the exploration of temporal changes

in ﬂow maps with animation and small-multiples. In

Computer Graphics Forum, volume 31, pages 1005–

1014. Wiley Online Library.

Cuenca, E., UCLouvain, Docquier, F., Nijssen, S., and

Schaus, P. (2019). Evoﬂows: an interactive approach

for visualizing spatial and temporal trends in origin-

destination data.

Eccles, R., Kapler, T., Harper, R., and Wright, W. (2007).

Stories in geotime. In 2007 IEEE Symposium on Vi-

sual Analytics Science and Technology, pages 19–26.

Fish, C., Goldsberry, K. P., and Battersby, S. (2011).

Change blindness in animated choropleth maps: An

empirical study. Cartography and Geographic Infor-

mation Science, 38(4):350–362.

Fleiss, J. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378–382.

Fuchs, N. (2022). Flowstrates++: a visualization tool for

multi-dimensional temporal origin-destination data.

Isenberg, T., Isenberg, P., Chen, J., Sedlmair, M., and Möller, T. (2013). A Systematic Review on the Practice of Evaluating Visualization. IEEE Transactions on Visualization and Computer Graphics, 19(12):2818–2827.

Kapler, T. and Wright, W. (2005). Geo time information vi-

sualization. Information Visualization, 4(2):136–146.

Kjellin, A., Pettersson, L. W., Seipel, S., and Lind, M.

(2008). Evaluating 2d and 3d visualizations of spa-

tiotemporal information. ACM Trans. Appl. Percept.,

7(3).

Koylu, C., Tian, G., and Windsor, M. (2022). Flowmapper.org: a web-based framework for designing origin–destination flow maps. Journal of Maps, pages 1–9.

Lam, H., Bertini, E., Isenberg, P., Plaisant, C., and Carpen-

dale, S. (2012). Empirical studies in information vi-

sualization: Seven scenarios. IEEE Transactions on

Visualization and Computer Graphics, 18(9):1520–

1536.

Merino, L., Ghafari, M., Anslow, C., and Nierstrasz, O.

(2018). A systematic literature review of software vi-

sualization evaluation. Journal of Systems and Soft-

ware, 144:165–180.

Munzner, T. (2014). Visualization analysis and design.

CRC press.

Peña-Araya, V., Bezerianos, A., and Pietriga, E. (2020). A comparison of geographical propagation visualizations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pages 1–14.

Phan, D., Xiao, L., Yeh, R., and Hanrahan, P. (2005). Flow

map layout. In IEEE Symposium on Information Visu-

alization, 2005. INFOVIS 2005., pages 219–224.

Schöttler, S., Yang, Y., Pfister, H., and Bach, B. (2021). Visualizing and interacting with geospatial networks: A survey and design space. In Computer Graphics Forum, volume 40, pages 5–33. Wiley Online Library.

Tobler, W. R. (1987). Experiments in migration mapping by

computer. The American Cartographer, 14(2):155–

163.

van den Elzen, S. and van Wijk, J. J. (2014). Multivari-

ate network exploration and presentation: From de-

tail to overview via selections and aggregations. IEEE

Transactions on Visualization and Computer Graph-

ics, 20(12):2310–2319.

Vanhulst, P., Evequoz, F., Tuor, R., and Lalanne, D. (2019). A descriptive attribute-based framework for annotations in data visualization. In Bechmann, D., Chessa, M., Cláudio, A. P., Imai, F., Kerren, A., Richard, P., Telea, A., and Tremeau, A., editors, Computer Vision, Imaging and Computer Graphics Theory and Applications, pages 143–166, Cham. Springer International Publishing.

Vanhulst, P., Tuor, R., Evéquoz, F., and Lalanne, D. (2021). Colvis - a structured annotation acquisition system for data visualization. Information, 12(4).

Von Landesberger, T., Brodkorb, F., Roskosch, P., An-

drienko, N., Andrienko, G., and Kerren, A. (2015).

Mobilitygraphs: Visual analysis of mass mobility

dynamics via spatio-temporal graphs and clustering.

IEEE transactions on visualization and computer

graphics, 22(1):11–20.

Yalçın, M. A., Elmqvist, N., and Bederson, B. B. (2018). Keshif: Rapid and expressive tabular data exploration for novices. IEEE Transactions on Visualization and Computer Graphics, 24(8):2339–2352.

Yang, Y., Dwyer, T., Goodwin, S., and Marriott, K. (2016).

Many-to-many geographically-embedded ﬂow visual-

isation: An evaluation. IEEE transactions on visual-

ization and computer graphics, 23(1):411–420.
