Exploring Flow Metrics in Dense Geographical Networks

Valentino Di Donato, Maurizio Patrignani and Claudio Squarcella

Department of Engineering, Roma Tre University, 79 Via della Vasca Navale, Rome, Italy

Keywords:

Geographic Visualization, Flow Visualization, Geographical Networks, Time Series Data.

Abstract:

We present FLOWMATRIX, a system for the interactive exploration of time-labeled multivariate ﬂows be-

tween pairs of geographic locations. FLOWMATRIX offers a coordinated visualization based on the interplay

between a geographic map and a matrix that allows users to discover trends tied to specific locations while offering

an overview of metrics of the ﬂows between all pairs of locations. The input data is clustered following a

geographic hierarchy and the user can navigate between different levels of detail. The design of our system

privileges the execution of simple tasks like assessing the volume and features of the ﬂows between pairs of

locations, enumerating destinations with poor performance, and sorting ﬂow streams based on their volume.

1 INTRODUCTION

The research presented in this paper is the result of

a collaboration between a research group focused on

information visualization and a big player in VoIP

telecommunications that needs to assess and moni-

tor the trafﬁc load and the quality of service (QoS)

offered to its customers. In particular, the company

is interested in understanding the relationship among

the volume of trafﬁc, the QoS, and the geolocation of

the components of its infrastructure.

Logs of exchanges of trafﬁc ﬂows are collected in

the order of millions of records per day. Such data

is intended to be used for ex-post analysis, support-

ing different levels of detail, as the company is truly

distributed worldwide. Also, since the data sets are

huge, the user should be able to ﬁlter them at least

with respect to the geography, the speciﬁc time in-

terval of interest, and the different performance met-

rics available. Finally, since communication ﬂows are

inherently directional, the user should be allowed to

perceive the relationship between the quality of the

communication and its direction.

The main challenges lie in the density of the net-

works under examination, in the need of binding the

information to a geographic map, and in the require-

ment of looking at data at different abstraction levels.

On top of that, the input data becomes more interest-

ing when it is put in a historical perspective and en-

riched with many different facets and key indicators.

In this paper we present a system aimed at addressing

the above challenges. It is called FLOWMATRIX

and it is designed for the interactive exploration of

time-labeled multivariate ﬂows between pairs of ge-

ographic locations. FLOWMATRIX offers a coordi-

nated visualization based on the interplay between a

geographic map and a square matrix that allows users to

discover trends tied to specific locations while offering a

general overview of metrics of the ﬂows between all

pairs of locations.

The paper is organized as follows. Section 2 con-

tains a detailed analysis of user requirements. In Sec-

tion 3 we describe our contribution and in Section 4

we provide some details on its implementation. We

then proceed to describe our use cases in Section 5,

followed by the results of an evaluation study we con-

ducted with domain experts (Section 6). Section 7 ex-

plores the state of the art and ﬁnally, our conclusions

are in Section 8.

2 REQUIREMENT ANALYSIS

Our reference data set is composed of a collection of

data cases, each representing a single message trans-

mission from a source to a destination, that are both

labeled with their geographic location. Such an ex-

change is equipped with a timestamp and a set of

quantitative attributes describing the quality or quan-

tity of exchange (e.g. packet loss, bandwidth).

Several data cases can be grouped together, either

by time or by geography of source and destination,

into a ﬂow where the original quantitative attributes

are naturally aggregated (summed up or averaged de-

pending on the context).
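
A minimal sketch of this aggregation step is given below. It groups data cases by day and by the clusters of their source and destination, summing the volume and averaging one quality attribute. The field names (`src`, `dst`, `timestamp`, `delayMs`), the `clusterOf` lookup, and the choice of counting one unit of volume per data case are assumptions made for illustration, not the FLOWMATRIX data model.

```javascript
// Hypothetical data cases: one record per message transmission.
const dataCases = [
  { src: "IT", dst: "RU", timestamp: "2013-09-05T10:00:00Z", delayMs: 320 },
  { src: "IT", dst: "RU", timestamp: "2013-09-05T18:30:00Z", delayMs: 340 },
  { src: "ES", dst: "RU", timestamp: "2013-09-05T09:15:00Z", delayMs: 600 },
  { src: "IT", dst: "RU", timestamp: "2013-09-06T11:00:00Z", delayMs: 300 },
];

// Hypothetical mapping from a location to its cluster at the current level of detail.
const clusterOf = { IT: "Southern Europe", ES: "Southern Europe", RU: "Eastern Europe" };

function aggregateFlows(cases, clusterOf) {
  const flows = new Map();
  for (const c of cases) {
    const day = c.timestamp.slice(0, 10); // group by calendar day of the ISO timestamp
    const source = clusterOf[c.src];
    const target = clusterOf[c.dst];
    const key = [day, source, target].join("|");
    const f = flows.get(key) || { day, source, target, volume: 0, delaySum: 0 };
    f.volume += 1;           // quantity attribute: summed up
    f.delaySum += c.delayMs; // quality attribute: averaged below
    flows.set(key, f);
  }
  // Replace the running sum with the average delay of each flow.
  return [...flows.values()].map(({ delaySum, ...f }) => ({
    ...f,
    avgDelayMs: delaySum / f.volume,
  }));
}

const flows = aggregateFlows(dataCases, clusterOf);
```

Changing `clusterOf` to a coarser or finer mapping yields the same flows at a different level of geographic detail.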

Donato, V., Patrignani, M. and Squarcella, C.

Exploring Flow Metrics in Dense Geographical Networks.

DOI: 10.5220/0006548700520061

In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 3: IVAPP, pages 52-61

ISBN: 978-989-758-289-9

Figure 1: Overview of the interface of FLOWMATRIX.

Informal interviews with the domain experts originated

the following restricted set of requirements, which

were reﬁned through several iterations.

R1: Assessment of quantity and quality of ﬂows.

The interface should convey to the user both the quan-

tity of ﬂows and their quality in terms of performance

indicators. For example, the user may want to immediately

discover whether a major event, such as the deployment

of a software update, impacted the performance of the

infrastructure.

R2: Choice of the aggregation level. Users should

be able to cluster the geographic locations that con-

stitute the domain of the source and destination in

the data cases. Such clustering should be performed

at arbitrary levels of detail, i.e. starting from ﬁne-

grained locations and scaling up to macro-regions of

the world, according to well-known administrative,

political, and geographical boundaries.

R3: Time-oriented exploration. The user should be

allowed to explore the data focusing on the events of

a specific time instant and, hence, to assess the quantity

and quality measures for concurrent flows. For

example, that is the case for ad-hoc analysis triggered

by speciﬁc events like network failures.

R4: Performance-oriented exploration. The system

should allow the user to focus on performance

thresholds with the aim of discovering periods when

such thresholds have been violated by some ﬂow.

Also, it should be possible to discover in what time in-

tervals the ﬂows have different qualities with respect

to the average scenario.

3 DESIGN OF THE INTERFACE

In this section we introduce FLOWMATRIX. We ﬁrst

describe its interface and its interaction primitives

and then we proceed with a discussion of our design

choices.

3.1 Interface and Interaction

The interface of FLOWMATRIX is presented in Fig. 1.

It is split into four main parts: the control panel (up-

per left corner), the timeline (lower right corner), the

dynamic matrix (lower left corner), and the dynamic

map (upper right corner). In the following description

we consider a sample data set with two quantitative at-

tributes, i.e., round-trip delay and packet loss between

pairs of geographic locations, typical of the domain of

computer networks and telecommunications.

The control panel contains basic information on

the current state of the visualization and a small set

of controls to change the selection and representation

of quantitative attributes. From the control panel in

Fig. 1 we can infer that the focus is on a speciﬁc

day (5 Sept 2013) and we are looking at the ﬂow

“within each aggregate”, i.e., having both source and

target locations within any of the geographic clusters

in the current view (e.g. ﬂows within Europe, within

Americas, etc). The selected quantitative attribute is

the round-trip delay measured between pairs of loca-

tions. The three colors used in the other views (green,

yellow, red) identify three different classes of values

for each quantitative attribute, reported in the metric

scale that is visible just below the name of the at-

tribute. In the example in Fig. 1 green means “below

300ms”, yellow means “between 300ms and 550ms”,

and red means “above 550ms”. The right side of

the control panel contains additional buttons. The

user can choose between two different types of repre-

sentation for quantitative attributes: 1. averages, i.e.,

only one average value for each ﬂow between a pair

of geographic clusters; or 2. stacked values, i.e., the

weighted distribution of attribute values in the three

different classes detailed previously, identiﬁed with

the corresponding colors. The two threshold values in

the metric scale that give rise to the different classes

of values can be dynamically adjusted by clicking the

appropriate button in the control panel (requirements

R1, R4). Finally, a button allows the user to bring

the visualization back to the original state (before any

interaction).
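
The two representations described above can be sketched as follows. The thresholds are the ones reported for Fig. 1 (300ms and 550ms); the treatment of values lying exactly on a threshold, the volume-weighted form of the average, and the shape of the `samples` records are assumptions made for illustration.

```javascript
// Thresholds from the Fig. 1 example (round-trip delay, in ms); the user can adjust them.
const thresholds = { low: 300, high: 550 };

// Map a value to one of the three classes. Values lying exactly on a
// threshold are assigned to "yellow" here; that choice is an assumption.
function classify(value, { low, high }) {
  if (value < low) return "green";
  if (value <= high) return "yellow";
  return "red";
}

// "Averages" representation: one volume-weighted average per flow, then its class.
function averageRepresentation(samples, t) {
  const volume = samples.reduce((s, x) => s + x.volume, 0);
  const avg = samples.reduce((s, x) => s + x.value * x.volume, 0) / volume;
  return { avg, cls: classify(avg, t) };
}

// "Stacked values" representation: share of the volume falling in each class.
function stackedRepresentation(samples, t) {
  const shares = { green: 0, yellow: 0, red: 0 };
  let volume = 0;
  for (const x of samples) {
    shares[classify(x.value, t)] += x.volume;
    volume += x.volume;
  }
  for (const k of Object.keys(shares)) shares[k] /= volume;
  return shares;
}

// Hypothetical samples for one flow: a delay value and the volume it applies to.
const samples = [
  { value: 200, volume: 50 },
  { value: 400, volume: 30 },
  { value: 800, volume: 20 },
];
const avgView = averageRepresentation(samples, thresholds);
const stackedView = stackedRepresentation(samples, thresholds);
```

In this example the flow is yellow on average, while the stacked view also reveals that a fifth of its volume falls in the red class, which is the kind of detail the average hides.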

The timeline contains a graph with aggregate information

for each time aggregate in the time scale

chosen by the user. More speciﬁcally, three stacked

ribbons show the volume of ﬂow over time for each

of the three classes of metric values, as deﬁned in the

control panel. When the mouse pointer is placed over

the graph, additional information for the selected time

interval is shown on screen. The user can either drag

the slider or directly click the timeline to change the

time selection. The interface is updated accordingly

to show the corresponding data (requirement R3).
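
As an illustration of how the stacked ribbons could be derived, the sketch below turns per-day volumes for the three classes into stacked intervals. The input records and the bottom-to-top stacking order (green, yellow, red) are assumptions; the paper does not specify them.

```javascript
// Hypothetical per-day volumes for each class of metric values.
const perDay = [
  { day: "2013-09-05", green: 120, yellow: 40, red: 10 },
  { day: "2013-09-06", green: 150, yellow: 60, red: 30 },
];

// Assumed stacking order, bottom to top.
const classes = ["green", "yellow", "red"];

// For each day, turn the class volumes into stacked [y0, y1] intervals.
function stackRibbons(rows) {
  return rows.map((row) => {
    let base = 0;
    const bands = {};
    for (const cls of classes) {
      bands[cls] = [base, base + row[cls]];
      base += row[cls];
    }
    return { day: row.day, bands, total: base };
  });
}

const ribbons = stackRibbons(perDay);
```

Each ribbon is then drawn as the area between `y0` and `y1` across consecutive days, and `total` gives the overall volume of flow for that day.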

The dynamic matrix shows attribute classes for all

possible pairs of geographic clusters in the current

view, based on the selection on the timeline. More

speciﬁcally, it is a square matrix, where rows and

columns respectively represent sources and destina-

tions of ﬂows. Each cell in the matrix represents

the ﬂow between a pair of geographic clusters, with

size logarithmically proportional to the total volume

of ﬂow and colors reﬂecting the values of the quanti-

tative attribute in the current representation (require-

ment R1). The trapezoids on the left and bottom sides

of the matrix represent the set of visible geographic

clusters and all their ancestors. The cluster hierarchy

is explicitly represented by means of side contact be-

tween parents and children; see Fig. 3 for an exam-

ple where the hierarchy is expanded from the conti-

nent level to the country level (requirement R2). Such

technique is a simple variation of common algorithms

for tree representation (see, e.g., (Schulz, 2011) for a

detailed survey). All matrix elements are left inten-

tionally unlabeled, so that the user can focus on dis-

covering patterns and trends in the data by looking

exclusively at the color and size of each cell. Hov-

ering any element with the mouse reveals aggregate

information for the corresponding entity. In particu-

lar, trapezoids yield information about the geographic

clusters they represent, while cells in the matrix yield

information on the ﬂow between the two involved

clusters.
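
The logarithmic sizing of matrix cells can be illustrated with a small function. This is a sketch under stated assumptions: the use of log(1 + v), so that an empty flow maps to size zero, and the `maxSide` parameter are hypothetical choices, not details taken from FLOWMATRIX.

```javascript
// Side length of a matrix cell, growing with the logarithm of the flow volume.
// Using log(1 + v) so that zero volume maps to zero size is an assumption.
function cellSide(volume, maxVolume, maxSide) {
  if (maxVolume <= 0) return 0;
  return (maxSide * Math.log1p(volume)) / Math.log1p(maxVolume);
}

// Hypothetical volumes spanning several orders of magnitude.
const volumes = [0, 10, 1000, 100000];
const sides = volumes.map((v) => cellSide(v, 100000, 20));
```

With a linear scale the cell for volume 10 would be practically invisible next to the one for volume 100000; the logarithm keeps small flows readable at the cost of compressing differences between large ones.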

The dynamic map shows a circle for each geo-

graphic cluster in the current view, positioned at the

centroid of the corresponding locations on the map.

At any time, each of such circles shows the same vol-

ume and metric values as one of the cells in the ma-

trix. The dynamic map, therefore, only shows a linear

subset of the information contained in the dynamic

matrix. Such portion can change based on user in-

teraction. In the initial state circles in the map are in

correspondence with cells on the diagonal of the ma-

trix, i.e., each of them represents the ﬂows for which

source and destination are both within the correspond-

ing geographic cluster. The regions with dashed bor-

ders that enclose groups of circles represent the same

hierarchy of clusters that is pictured with trapezoids in

the dynamic matrix. The information panel at the bot-

tom right corner shows appropriate information de-

pending on user interaction. The user can also hover

individual circles and dashed regions to get the vol-

ume of ﬂow and quantitative information on the se-

lected attribute.

We designed our framework with a focus on in-

teractivity and responsiveness. The main operations

that the user can perform are the following: 1. ge-

ographic view change, i.e., updating the set of visi-

ble geographic clusters, by either exploring the chil-

dren of a geographic cluster or collapsing sibling clus-

ters into their parent cluster (requirement R2); 2. ge-

ographic cluster selection, i.e., narrowing the anal-

ysis to the ﬂow originating from a geographic clus-

ter, targeted at a geographic cluster, or between two

geographic clusters (requirement R2); 3. time selection,

i.e., selecting the appropriate time aggregate on

the timeline to see corresponding values (requirement

R3); and 4. attribute class customization, i.e., tweak-

ing the two threshold values that determine the three

different classes of attributes identiﬁed by three col-

ors: green, yellow, and red (requirement R4). The ﬁrst

two operations can be achieved by interacting indif-

ferently with the dynamic matrix, the dynamic map,

or a combination of both. Upon user interaction, all

the components in the interface are automatically up-

IVAPP 2018 - International Conference on Information Visualization Theory and Applications

dated to reﬂect the new state of the visualization. The

reader can refer to Section 5 for a high-level description

of use cases that show how to exploit our system

in real-world scenarios.
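
The geographic view change can be modeled as an update of the list of visible clusters. The sketch below is one possible formulation, not the FLOWMATRIX implementation: the hierarchy fragment and the function names `expand` and `collapse` are hypothetical.

```javascript
// Hypothetical fragment of the geographic hierarchy (parent -> children).
const children = {
  World: ["Europe", "Asia"],
  Europe: ["Southern Europe", "Eastern Europe"],
  "Southern Europe": ["Italy", "Spain"],
  "Eastern Europe": ["Russia"],
};

// Reverse lookup (child -> parent).
const parentOf = {};
for (const [parent, kids] of Object.entries(children)) {
  for (const kid of kids) parentOf[kid] = parent;
}

// Expand: replace a visible cluster with its children; leaves are left untouched.
function expand(visible, cluster) {
  const kids = children[cluster];
  if (!kids || !visible.includes(cluster)) return visible;
  return visible.flatMap((c) => (c === cluster ? kids : [c]));
}

function isDescendant(cluster, ancestor) {
  for (let a = parentOf[cluster]; a !== undefined; a = parentOf[a]) {
    if (a === ancestor) return true;
  }
  return false;
}

// Collapse: replace every visible descendant of the cluster's parent with that parent.
function collapse(visible, cluster) {
  const parent = parentOf[cluster];
  if (parent === undefined || !visible.includes(cluster)) return visible;
  const result = [];
  let inserted = false;
  for (const c of visible) {
    if (!isDescendant(c, parent)) {
      result.push(c);
    } else if (!inserted) {
      result.push(parent);
      inserted = true;
    }
  }
  return result;
}

let view = ["Europe", "Asia"];
view = expand(view, "Europe");
view = expand(view, "Southern Europe");
const collapsed = collapse(view, "Italy");
```

Since the matrix has one row and one column per visible cluster, every expansion increases the number of cells quadratically in the length of `view`.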

3.2 Discussion of Design Choices

The interface has two main components, i.e., the dy-

namic matrix and the dynamic map. If the user is in-

terested in looking up ﬂows for speciﬁc geographic

locations, the dynamic map offers the quickest alter-

native. The search box can be used if the exact loca-

tion of a country is not known in advance. On the

other hand, the dynamic matrix is entirely focused

on the quantitative attributes, and its regular struc-

ture makes it possible to spot events affecting speciﬁc

sources or destinations of ﬂow.

Regarding requirements listed in Section 2, we ob-

serve that the interface of FLOWMATRIX addresses

them as follows. An overview of the quantitative at-

tributes in the data set (requirement R1) is offered

by the sizes and colors of cells in the matrix. The

geographic clustering (requirement R2) can be lever-

aged both in the matrix and in the map by means of

intuitive user interaction (i.e. double-click with the

mouse). Filtering the ﬂow based on source and des-

tination or switching between incoming and outgo-

ing ﬂow (requirement R1) can be achieved by click-

ing the appropriate geographic clusters (clicking on

an already selected cluster reverses the direction of

the ﬂow). The timeline allows for a quick selection

of time (requirement R3) by means of simple mouse

clicks, and offers an overview of the historical context

with the stacked ribbons. The performance indicators

can be easily customized by changing the thresholds

through the control panel (requirement R4).

Regarding the absolute scaling of the circles on

the map, we are aware of the potential perceptual

problems described in Flannery’s work (Flannery,

1971). However, we observe that the user tasks typ-

ically involve estimating the maximum and the mini-

mum circle area rather than the ratio between speciﬁc

pairs. Also, the simultaneous presence of the corresponding

stacked bars in the matrix tends to compensate

for perception errors.
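
For reference, absolute scaling can be sketched as a radius that makes the circle area proportional to the flow volume, combined with the minimum radius mentioned for very small flows. The parameters `maxRadius` and `minRadius` and the `clamped` flag are hypothetical names introduced for this illustration.

```javascript
// Radius of a map circle whose area is proportional to the flow volume.
// maxRadius and minRadius (in pixels) are hypothetical parameters.
function circleRadius(volume, maxVolume, maxRadius = 40, minRadius = 4) {
  const r = maxRadius * Math.sqrt(volume / maxVolume);
  // Circles too small to be drawn proportionally are clamped and flagged,
  // so that the view can mark them with a distinct fill pattern.
  return r < minRadius
    ? { radius: minRadius, clamped: true }
    : { radius: r, clamped: false };
}

const largest = circleRadius(100, 100);
const quarter = circleRadius(25, 100);
const tiny = circleRadius(0.01, 100);
```

A flow with a quarter of the maximum volume gets half the maximum radius, i.e. a quarter of the area, which is exactly the relationship that readers tend to underestimate according to the perceptual studies cited above.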

FLOWMATRIX makes use of animation in two dif-

ferent occasions: 1. in response to user interaction, the

smooth transition to the new state of the interface is

realized with a simple animation; and 2. to represent

the direction of the ﬂow on the map, concentric cir-

cles are pictured as either emanating from or collaps-

ing on the selected circle. We speciﬁcally refrained

from using animation to convey crucial information

(e.g. historical trends in the data set), following the

guidelines and caveats found in recent literature (see,

e.g., (Robertson et al., 2008; Di Donato et al., 2016)).

4 IMPLEMENTATION

The algorithmic background of FLOWMATRIX is

fairly straightforward. The drawing of circles in

the dynamic map is achieved with an iterative force-

directed graph layout algorithm (Di Battista et al.,

1999). Each circle is represented with two nodes con-

nected by one edge: the ﬁrst node is ﬁxed at the ideal

position for the center of the circle, i.e. the centroid of

the corresponding geographic cluster, while the sec-

ond is subject to forces and represents the actual po-

sition of the visualized circle. The edges connect-

ing such pairs of nodes are modeled as zero-length

springs. Since each node is attracted to its ideal po-

sition there is no need to introduce repulsive forces

between nodes. However, in order to avoid overlap

among visualized circles, a simple collision detection

heuristic is introduced that adjusts the relative posi-

tions of two nodes in case a potential overlap is de-

tected. The areas with dashed borders that enclose

circles in the dynamic map are obtained with a state-

of-the-art algorithm (Rappaport, 1992) for the com-

putation of convex hulls of circles. We use the same

algorithm to compute a clipping path for the represen-

tation of ﬂow “waves” between any pair of geographic

clusters (see, e.g., Fig. 2(b)).
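
The layout described above can be approximated with a few lines of plain JavaScript. This is a simplified sketch and not the code of FLOWMATRIX: the `stiffness` constant, the number of iterations, and the pairwise push-apart rule are assumptions, and with many circles a single pass per iteration does not guarantee that every overlap is removed.

```javascript
// Each circle has a fixed anchor (ax, ay), i.e. the centroid of its cluster,
// and a free position (x, y) that is actually drawn; r is the radius.
function makeCircle(ax, ay, r) {
  return { ax, ay, x: ax, y: ay, r };
}

function layout(circles, iterations = 200, stiffness = 0.1) {
  for (let it = 0; it < iterations; it++) {
    // Zero-length spring: pull each free node toward its fixed anchor.
    for (const c of circles) {
      c.x += stiffness * (c.ax - c.x);
      c.y += stiffness * (c.ay - c.y);
    }
    // Collision heuristic: push overlapping pairs apart along the line joining them.
    for (let i = 0; i < circles.length; i++) {
      for (let j = i + 1; j < circles.length; j++) {
        const a = circles[i];
        const b = circles[j];
        const dx = b.x - a.x;
        const dy = b.y - a.y;
        const d = Math.hypot(dx, dy);
        const minDist = a.r + b.r;
        if (d >= minDist) continue;
        // Unit vector from a to b; an arbitrary direction if the centers coincide.
        const ux = d > 0 ? dx / d : 1;
        const uy = d > 0 ? dy / d : 0;
        const shift = (minDist - d) / 2;
        a.x -= ux * shift;
        a.y -= uy * shift;
        b.x += ux * shift;
        b.y += uy * shift;
      }
    }
  }
  return circles;
}

// Two circles with nearly identical centroids and one isolated circle.
const circles = layout([
  makeCircle(0, 0, 5),
  makeCircle(1, 0, 5),
  makeCircle(100, 100, 5),
]);
```

The isolated circle stays on its centroid, while the two overlapping ones settle side by side as close to their centroids as the collision rule allows. D3's d3-force module offers comparable building blocks (`forceX`, `forceY`, `forceCollide`); whether FLOWMATRIX relies on them is not stated in the paper.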

FLOWMATRIX is a Web application implemented

in JavaScript, using the popular D3.js framework (Bo-

stock et al., 2011) for the development of highly in-

teractive data-driven visualizations. All the compo-

nents of the interface are coordinated following the

Publish-Subscribe pattern, so that any user interaction

is transformed into an event that triggers appropriate

updates in each component. We made a special effort

to improve the performance of the tool, limiting ani-

mations and redraws where possible. That is crucial

especially in the dynamic matrix, where the number

of cells grows quadratically upon expansion of geo-

graphic clusters.
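
The coordination scheme can be illustrated with a minimal Publish-Subscribe hub. The class name `EventBus`, the topic names, and the payloads are hypothetical; the sketch only shows the pattern, not the actual event model of FLOWMATRIX.

```javascript
// Minimal publish-subscribe hub; topic names and payloads are hypothetical.
class EventBus {
  constructor() {
    this.handlers = new Map();
  }

  // Register a handler for a topic and return a function that removes it.
  subscribe(topic, handler) {
    if (!this.handlers.has(topic)) this.handlers.set(topic, new Set());
    this.handlers.get(topic).add(handler);
    return () => this.handlers.get(topic).delete(handler);
  }

  // Deliver the payload to every handler registered for the topic.
  publish(topic, payload) {
    for (const handler of this.handlers.get(topic) || []) handler(payload);
  }
}

const bus = new EventBus();
const log = [];

// Each view subscribes to the events it cares about and redraws itself.
bus.subscribe("time:selected", (day) => log.push(`matrix redraw for ${day}`));
bus.subscribe("time:selected", (day) => log.push(`map redraw for ${day}`));
const stopTimeline = bus.subscribe("cluster:selected", (c) =>
  log.push(`timeline redraw for ${c}`)
);

// A user interaction becomes an event; the publisher does not know its subscribers.
bus.publish("time:selected", "2013-09-05");
bus.publish("cluster:selected", "Italy");
stopTimeline();
bus.publish("cluster:selected", "Russia"); // no subscriber left for this topic
```

Because views never call each other directly, a component can skip a redraw simply by ignoring an event, which fits the goal of limiting animations and redraws.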

5 USE CASES

To show the effectiveness of our approach, this

section describes four use cases in which simple and

complex tasks are accomplished by taking advantage

of the coordinated views of FLOWMATRIX.

Use Case 1: Finding the total volume of ﬂow and

the average round-trip delay from Italy to Russia on

Figure 2: Sequence of user interactions needed to study the

ﬂow from Italy to Russia. (a) The dynamic map is zoomed

on Europe, where two sub-regions (Southern and Eastern

Europe) are expanded to reveal their respective countries.

The size of each circle is proportional to the volume of ﬂow

within the corresponding country. (b) The dynamic map is

focused on the ﬂow from Italy to Russia.

8 Sept 2013. First of all, we select the correct date

on the timeline and choose to visualize only average

values for quantitative attributes using the appropri-

ate button on the control panel. We then locate the

circle that represents Europe on the dynamic map and

double-click it. The interaction has the effect of up-

dating the geographic view by adding all the sub-

regions within Europe to the current representation.

The dynamic map and the dynamic matrix are up-

dated accordingly, making room for the new clusters.

We repeat the same step with the two circles repre-

senting Southern Europe and Eastern Europe, with

the effect of enriching the view with all associated

countries, including Russia and Italy. The state of the

map reached after the initial interaction is in Fig. 2(a).

Note that the circles representing European countries

lack the semi-transparent “glow” effect, meaning that

they cannot be further expanded to reveal more de-

tailed information (i.e., we reached leaves in the clus-

ter hierarchy). Note also that the smaller circles are

ﬁlled with a grid-like pattern, meaning that their ﬂow

volume would be too small for a proportional repre-

sentation on the map, and therefore their radius is ad-

justed to an acceptable minimum length. We click

the circle representing Italy, triggering the following

Figure 3: Dynamic matrix showing country pairs with aver-

age packet loss greater than 10% in red.

updates: 1. the row representing the ﬂow from Italy

in the dynamic matrix is highlighted; 2. the size and

color of each circle in the dynamic map represents

the ﬂow from Italy to the corresponding geographic

cluster, and the ﬂow itself is pictured with animated

concentric waves emanating from the clicked circle;

3. the stacked ribbons in the timeline show aggregate

data for the ﬂow from Italy to all other destinations.

We can achieve the same result by looking up the se-

lected country in the search box positioned at the up-

per right corner of the dynamic map (visible in Fig. 1).

Finally, we hold the Shift key and click the circle rep-

resenting Russia. New updates are triggered: 1. the

square in the dynamic matrix representing the ﬂow

from Italy to Russia is highlighted; 2. the ﬂow in the

dynamic map is represented with waves from Italy

to Russia, as shown in Fig. 2(b); 3. the stacked rib-

bons in the timeline show aggregate data for the ﬂow

from Italy to Russia. The requested information can

be found in the info panel on the dynamic map.

Use Case 2: Counting how many country pairs in

Europe have average packet loss greater than 10%

on 6 Sept 2013. This task shows the potential of

the dynamic matrix when analyzing a data set which

grows quadratically with respect to the number of ge-

ographic clusters. First of all we focus on the control

panel, choosing the right quantitative attribute and up-

dating the range of values such that ﬂows with packet

loss greater than 10% are identiﬁed with the red color.

We select the right date on the timeline and the ap-

propriate metric on the control panel. We then focus

on the dynamic map, ﬁrst double-clicking the circle

Figure 4: Views showing details for the third use case. (a)

The ﬂow from Spain is pictured with concentric blue circles.

The size of each circle is proportional to the volume of ﬂow

from Spain to the corresponding country. (b) The stacked

graph in the timeline reaches its peak on 6 Sept 2013.

representing Europe and then all the circles represent-

ing sub-regions in Europe, until we reach the country

level. We can ﬁnally focus on the dynamic matrix

and simply count all the occurrences of red cells that

fall within the portion of matrix related to European

countries, as visible in Fig. 3. Although small, their

color is enough to get a quick overview and answer

the original question.

Use Case 3: Finding out what European country re-

ceives the highest volume of ﬂow from Spain on 5 Sept

2013 and which day sees the highest volume of ﬂow

between the same pair of countries. We select the

right date on the timeline. Since we focus on the vol-

ume, the visualized quantitative attribute is irrelevant.

We focus on the dynamic map to show circles for all

European countries and click the circle representing

Spain. Apart from Spain itself, the biggest circle in

Europe is the one representing the United Kingdom, as

visible in Fig. 4(a). That is an effective and sufficient

visual cue to understand that the United Kingdom

receives the highest percentage of flow from Spain.

To answer the second part of the question we hold the

Shift key, click the cluster representing the United Kingdom,

and focus on the timeline (see Fig. 4(b)). The

day with the highest volume of flow from Spain to

the United Kingdom is 6 Sept 2013.

Use Case 4: Studying how a ﬁrmware update re-

cently rolled out on the hardware equipment impacted

performance and reliability. Suppose that the de-

ployment involves hundreds of thousands of hardware

components all over the world, and is conducted in-

crementally in one week starting from Asia, contin-

uing with Europe, and ending with the entire globe.

In preparation for the analysis, we compile a data set

based on logs collected by the different components

of the infrastructure. The three numeric attributes

that are included in each data case are the follow-

ing: round-trip delay, packet loss, and the percentage,

called π, of traversed hardware components with the

most recent ﬁrmware version. The data is aggregated

on a daily basis and spans the entire week spent in the

deployment of the new ﬁrmware.

First of all, we select π as the metric to visualize

from the control panel, and tweak the thresholds for

the value classes so that transmissions with π = 100%

are green, those with 50% ≤ π < 100% are yellow,

and the remaining are red.

During the ﬁrst two days in the selected week,

we observe the expected steep increase of green ﬂow

within Asia by clicking and Shift-clicking the related

circle on the map and looking at the colors on the

timeline. We also expect a corresponding increase of

yellow ﬂows between Asia and other continents, in

particular Europe and Oceania, given their geograph-

ical proximity that implies a higher percentage of tra-

versed hardware components residing in Asia. As-

sume that, after clicking the circle representing Asia

on the map, we ﬁnd out that the circle for Europe,

representing ﬂow from Asia to Europe, stays mostly

red during the two-day interval. To inspect the causes

of this, we move to the dynamic map and expand

both Asia and Europe to reveal the corresponding sub-

regions. Suppose the circle representing ﬂow within

Middle East stands out for its red color. This would

immediately prompt us to click it and see whether out-

going and incoming ﬂows are also mostly red. We

would conclude that the deployment in Middle East

was slower than expected, which probably also im-

pacted the ﬂows between Europe and Asia, given the

strategic location. A further expansion of the Mid-

dle East cluster on the map may conﬁrm that most of

those countries have red values for π.

We can proceed to explore data for the follow-

ing two days to conﬁrm that the ﬂows within Europe

and those between Europe and Asia progressively turn

green, while Europe’s incoming and outgoing ﬂows

turn yellow.

As a next step, we could shift our attention to

the measured performance of ﬂows. We ﬁrst change

the visualized attribute to round-trip delay, focus on

the same geographic clusters and ﬂows mentioned

above, to observe whether any suspect variation of

performance values occurred during the deployment

phase. Suppose we observe, when switching to the

packet loss attribute, a noticeable degradation of per-

formance (i.e. an increase of average values) that fol-

lows the same temporal and geographic patterns of

the deployment studied before. We would therefore

be induced to conjecture a suspected correlation and

look for unnoticed bugs in the latest ﬁrmware update.

Finally, we can verify if the deployment is com-

plete at the end of the week by selecting the π at-

tribute, clicking the last date on the timeline and fo-

cusing on the dynamic matrix. Suppose two cells in

the matrix are still partially red and that they corre-

spond to the ﬂows in both directions between Africa

and the Americas. We can expand both clusters on

the map and discover, for example, that the problem

can be narrowed to the ﬂows between Northern Africa

and the United States, concluding that there is a spe-

ciﬁc portion of the infrastructure, involving only the

two areas, for which the deployment is not yet com-

plete.

6 EVALUATION

FLOWMATRIX was conceived with the goal of allow-

ing users to get quick insights on data sets detailing

ﬂows between geographic regions. Since any inter-

action with the tool can be decomposed into recur-

ring tasks, it becomes crucial to verify that these can

be correctly and quickly accomplished by prospective

users. This section presents the results of the evalua-

tion study we conducted after implementing our pro-

totype.

We initially thought of conducting a comparative

study, where participants would need to solve a list

of tasks both with our framework and with standard

tools (e.g. database queries). However, we quickly

discarded this option because even the expert users we

interacted with did not have experience with a stan-

dard, uniﬁed set of tools for the purpose of accessing

and analyzing the same data set. Therefore any com-

parison would have suffered from potential bias, de-

pending on the relative experience of the participant.

We opted instead for a qualitative study, where par-

ticipants were given a set of tasks and feedback was

collected at the end of each task.

6.1 Study Design

In preparation for the study we fed our prototype with

a precomputed data set, structured like the one used

for the ﬁgures in Section 3 and containing data for

four days between 5 Sept 2013 and 8 Sept 2013. The

study was conducted with ten participants (nine male,

one female) all aged between 25 and 35 years old.

They are all domain experts with a background in

computer science, statistics, telecommunications, or

electronics. At the time of the study, they were al-

ready familiar with the data collection from which we

derived the data set used as input for the evaluation.

More than half of the participants had worked with the

same data collection before, accessing its content by

means of database queries or simple time series plots.

Each participant was initially tested for color

blindness. A thorough introduction to the framework

followed, with a focus on each of the views and all the

available user interaction. A couple of example tasks

were illustrated step by step. After that, each partici-

pant was asked to solve 15 tasks. The ﬁrst four were

treated as training tasks, i.e., the participant had the

possibility to ask for help. For each task the examiner

recorded the completion time with a stopwatch, gath-

ered feedback at the end of the execution, and showed

a quicker way to achieve the same result in case the

strategy adopted by the participant was clearly sub-

optimal. General feedback was asked from each par-

ticipant as a ﬁnal step after the last task. The aver-

age time required by each participant to complete the

study was 50 minutes.

6.2 Results and Discussion

All the participants successfully completed the pro-

posed tasks, adopting different strategies based on the

context. The statistics on task completion times are

reported in Table 1. It is evident that users quickly

learned from mistakes made in previous tasks. For example,

60% of participants had an instinctive preference

for the dynamic matrix when solving Task #13,

i.e., they updated the set of expanded geographic clus-

ters and compared the size of different squares with-

out using the dynamic map. After being shown a

faster solution that makes a better use of the dynamic

map, they quickly changed their strategy and per-

formed much better with Task #14 and Task #15. This

is conﬁrmed by the relatively small standard deviation

for the completion time of both tasks, which suggests

that users knew precisely what steps were needed to

complete them. Note also how the median comple-

tion time is smaller than the average time for most of

the tasks, which suggests that the outliers can be in-

terpreted as occasional difﬁculties or distractions ex-

perienced by individual users.
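
The effect of a single outlier on these statistics can be shown with a small example. The completion times below are invented for illustration and are not the raw data of the study; the use of the sample standard deviation (denominator n - 1) is also an assumption, since the paper does not state which form was used.

```javascript
// Average, median and standard deviation of a list of completion times.
function summarize(times) {
  const n = times.length;
  const avg = times.reduce((s, t) => s + t, 0) / n;
  const sorted = [...times].sort((a, b) => a - b);
  const mid = Math.floor(n / 2);
  const med = n % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  // Sample standard deviation (n - 1 denominator); this choice is an assumption.
  const variance = times.reduce((s, t) => s + (t - avg) ** 2, 0) / (n - 1);
  return { avg, med, stdev: Math.sqrt(variance) };
}

// Hypothetical completion times (seconds) for ten participants, with one slow outlier.
const times = [12, 14, 15, 15, 16, 17, 18, 20, 22, 41];
const stats = summarize(times);
```

Here the single slow participant raises the average to 19 seconds while the median stays at 16.5 seconds, the same pattern of median below average observed for most tasks.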

The feedback was overall very positive and enthu-

siastic. All participants were particularly impressed

by the possibility to ﬁnally “see” the data they had

only been able to access with database queries and

simple two-dimensional charts. They also appreciated

the power of exploring data both on the dynamic map

and the dynamic matrix at the same time, depending

on the speciﬁc task. Many important suggestions for

Table 1: List of tasks and results of our qualitative study. For each task the average (avg), median (med) and standard deviation (stdev) values for completion times, in seconds, are listed.

| # | Task | avg (s) | med (s) | stdev (s) |
|---|------|---------|---------|-----------|
| 1 | Find volume of flow from Africa to Europe on 7 Sept 2013 | 17.2 | 15.5 | 9.13 |
| 2 | Enumerate regions receiving flow from Africa with average round-trip delay greater than 550ms on 6 Sept 2013 | 42.9 | 41.5 | 17.06 |
| 3 | Enumerate pairs of regions that have more than 50% of flow with round-trip delay greater than 700ms on 6 Sept 2013 | 90.3 | 96.5 | 37.29 |
| 4 | Enumerate continents that receive flow from Belgium on 5 Sept 2013 with average packet loss smaller than 1.2% | 82.2 | 72.5 | 27.52 |
| 5 | Find volume of flow from Portugal to Spain on 5 Sept 2013 | 34.2 | 30 | 13.25 |
| 6 | Find average round-trip delay from Portugal to Spain on 7 Sept 2013 | 39 | 33.5 | 20.66 |
| 7 | Find day with highest volume of flow from Portugal to Spain | 27 | 21 | 15.23 |
| 8 | Find which region receiving flow from Americas has highest percentage of flow with packet loss higher than 2% on 5 Sept 2013 | 54 | 52 | 12.21 |
| 9 | Find pairs of regions with average round-trip delay greater than 700ms on 5 Sept 2013 | 35.9 | 34 | 13.31 |
| 10 | Find pairs of regions with more than 50% of flow with round-trip delay greater than 700ms on 5 Sept 2013 | 26 | 25.5 | 7.94 |
| 11 | Find days in which the average round-trip delay within Italy is greater than 300ms | 49.7 | 45 | 14.07 |
| 12 | Find days in which the average round-trip delay from Italy to Russia is between 320ms and 330ms | 58.6 | 52.5 | 16.63 |
| 13 | Find the European country receiving the highest volume of flow from Spain on 6 Sept 2013 | 124.5 | 112 | 45.06 |
| 14 | Find the European country receiving the highest volume of flow from Northern Africa on 5 Sept 2013 | 46.7 | 50 | 8.54 |
| 15 | Find how many European country pairs have average round-trip time greater than 1s on 5 Sept 2013 | 59.6 | 57 | 14.13 |

improvement were collected during the study. 80% of participants found the timeline not intuitive enough to compare the volume of flow between different days, and suggested adding an explicit indication of the total volume of flow. 50% would have appreciated the possibility to expand geographic clusters straight to the finest level of aggregation, without intermediate levels, i.e., sub-regions. 50% had trouble coming up with the right sequence of interactions to highlight the flow within a specific geographic cluster on the dynamic map (i.e., click followed by Shift-click on the same circle). 50% initially overlooked smaller squares in the dynamic matrix, while 20% suggested adding a “full-screen” capability to the dynamic map and the dynamic matrix as a solution. 50% spent a non-negligible amount of time wondering where to find the actual answer for some tasks, after completing all the right interactions. 40% suggested adding a smarter search box to programmatically specify a query in the form “flow from A to B”. 40% complained that the size of circles in the dynamic map is not a sufficient clue to estimate the volume, although they succeeded in their tasks after comparing the actual volumes for the bigger circles (in the range of 2 to 4). Further minor observations were related to the specific data set (e.g., 40% were not sure whether Russia was to be found under Europe or Asia) and to the lack of experience with the interface (e.g., 70% of users needed some time before appreciating the distinction between average and stacked metric values).

7 RELATED WORK

The solution we propose to explore flows is a coordinated multiple view featuring both a map and a matrix. Hence, we survey the related research areas and compare our approach with the solutions proposed for similar problems.

7.1 Thematic Maps

Visualization of abstract data on maps is a traditional topic in cartography, where choropleth maps, proportional symbol maps, dotted maps, etc., are used to visualize the distribution of statistical variables. Although the non-geographic data represented in such

thematic maps is usually very simple, in some cases

it may have a more complex structure. Wood et

al. (Wood et al., 2010) divide the geographical space

with a grid and draw in each cell a replica of the orig-

inal map that shows inbound ﬂow from all the other

regions. Their approach is further expanded in (Wood

et al., 2011), where replicas show approximate ﬂow

patterns by means of time series plots. Enriching the-

matic maps with small multiples, however, can lead to

cluttered views when the input data set grows in size.

7.2 Visualizing Flows and Movement

There is a rich scientific literature about flows on maps where the trajectory of moving objects plays a crucial role. Examples of applications include the visualization of vessel traffic (see, e.g., (Willems et al., 2009; Scheepens et al., 2011; Scheepens et al., 2012)) and the visualization of aircraft routes (see, e.g., (Hurter et al., 2009; Bottger et al., 2014)). In our data set, however, information regarding the trajectories is missing.

Buchin et al. (Buchin et al., 2011) build ﬂow maps

using spiral trees to induce a clustering on the tar-

gets and smoothly bundle lines. In a more recent

work (Nocaj and Brandes, 2013) similar maps are

obtained with a new edge bundling technique that

avoids ambiguous connections between pairs of ver-

tices. Both techniques are visually compelling when

describing the ﬂow from a single source to many tar-

gets, but are not adequate for dense graphs.

Andrienko et al. (Andrienko et al., 2008) present

a taxonomy of the possible approaches available for

the geovisualization of dynamics, movement, and

change. They identify three alternatives: 1. direct de-

piction of data, which can easily lead to clutter and

slow rendering; 2. use of summaries like aggrega-

tion, generalization and sampling; and 3. use of sta-

tistical methods to extract patterns before visualizing

them. Our approach follows the second alternative,

hence addressing requirement R2. Guo (Guo, 2009)

proposes an interface to render large spatial interac-

tion data, consisting of multiple views: a geographical

map with arrows representing ﬂow between regions,

a self-organizing map, and a parallel coordinate plot.

The tool relies on a precomputed hierarchical regionalization derived from the volume of flow between pairs of regions. Although reasonable, such regionalization does not support the type of clustering imposed by requirement R2.
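The aggregation alternative mentioned above can be sketched as a roll-up of origin-destination volumes along a fixed geographic hierarchy. In the Python sketch below, the hierarchy fragment, the level names, and the flow volumes are all hypothetical placeholders chosen for illustration; they are not taken from the data set or from the FLOWMATRIX implementation.

```python
from collections import defaultdict

# Hypothetical fragment of a geographic hierarchy:
# country -> (sub-region, continent). Not taken from the actual data set.
HIERARCHY = {
    "Portugal": ("Southern Europe", "Europe"),
    "Spain": ("Southern Europe", "Europe"),
    "Belgium": ("Western Europe", "Europe"),
    "Egypt": ("Northern Africa", "Africa"),
}

# Index into the hierarchy tuple for each level; None means "no roll-up".
LEVELS = {"country": None, "subregion": 0, "continent": 1}


def cluster_of(country, level):
    """Return the cluster that contains `country` at the given level."""
    index = LEVELS[level]
    return country if index is None else HIERARCHY[country][index]


def aggregate(flows, level):
    """Sum flow volumes between clusters at the requested level of detail.

    `flows` is an iterable of (source country, destination country, volume).
    """
    totals = defaultdict(float)
    for source, destination, volume in flows:
        key = (cluster_of(source, level), cluster_of(destination, level))
        totals[key] += volume
    return dict(totals)


# Made-up volumes, for illustration only.
flows = [
    ("Portugal", "Spain", 10.0),
    ("Spain", "Belgium", 5.0),
    ("Egypt", "Spain", 2.0),
]

by_continent = aggregate(flows, "continent")
by_subregion = aggregate(flows, "subregion")
print(by_continent)
print(by_subregion)
```

Because the hierarchy is fixed in advance, switching the level of detail only changes the key used for summation; this differs from a regionalization computed from the flow volumes themselves.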

7.3 Matrix-based Coordinated Views

Elmqvist et al. (Elmqvist et al., 2008) present a ma-

trix visualization technique that features fast reorder-

ing of rows and columns, data aggregation with ex-

plicit representation, and GPU acceleration to opti-

mize the rendering on screen. Their approach inspired

part of our work while constructing a dynamic matrix

optimized for pattern recognition. In particular, while

the matrix representation has been effectively coupled

with a node-link representation in (Henry and Fekete,

2006; Henry et al., 2007), we couple it with a geo-

graphic visualization to represent ﬂows among geolo-

cated sites, where the matrix provides an aggregatable

view of all-pairs relationships while the geographic

map shows the sources and destinations of the ﬂows

the user is interested in.

A visualization problem very similar to the one

described in this paper is addressed by Boyandin

et al. (Boyandin et al., 2011), who propose FLOW-

STRATES, a visualization framework in which the ori-

gins and the destinations of the ﬂows are displayed

on two separate maps, and the changes of ﬂow mag-

nitudes over time are represented in a matrix-like

heatmap in the middle. Hence, although FLOW-

STRATES also uses a matrix-based coordinated view,

it devotes the expressiveness of the matrix to the tem-

poral evolution of the ﬂows, using one column for

each period of time. Each row of the FLOWSTRATES matrix represents the evolution over time of a specific flow, with the color of each cell encoding the amount of flow. Flows that are (on average) bigger

than others are placed on the top rows. In order to

identify the source and destination of each ﬂow, two

leaders, one exiting the leftmost cell and the other ex-

iting the rightmost one, point to the source and the

destination locations on the maps, respectively.
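The matrix layout described above (one row per flow, one column per time period, rows ordered by average magnitude) can be sketched as follows. This is a minimal Python illustration of the layout only, not the FLOWSTRATES implementation; the function name, the record format, and the sample values are assumptions.

```python
from statistics import mean


def flow_time_matrix(records, periods):
    """Build a matrix with one row per flow and one column per time period.

    `records` is an iterable of (source, destination, period, magnitude);
    every period in `records` is assumed to appear in `periods`.
    Rows are returned as ((source, destination), [magnitude per period]),
    ordered so that flows with the largest average magnitude come first.
    """
    series = {}
    for source, destination, period, magnitude in records:
        row = series.setdefault((source, destination), {p: 0.0 for p in periods})
        row[period] += magnitude

    rows = [(pair, [values[p] for p in periods]) for pair, values in series.items()]
    rows.sort(key=lambda row: mean(row[1]), reverse=True)
    return rows


# Hypothetical sample records, for illustration only.
records = [
    ("A", "B", 2011, 1.0),
    ("A", "B", 2012, 3.0),
    ("C", "D", 2011, 10.0),
    ("C", "D", 2012, 0.0),
]

matrix = flow_time_matrix(records, periods=[2011, 2012])
for (source, destination), magnitudes in matrix:
    print(source, "->", destination, magnitudes)
```

In this layout the source and destination of a row are carried only by its label, which is why the original design needs leaders pointing from the matrix to the two maps.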

Compared with our visualization tool, FLOWSTRATES addresses a simpler visualization problem where flows do not have performance metrics associated with them. Hence, requirements R1 and R4 cannot be met by the proposed techniques. Additionally, the FLOWSTRATES matrix does not support aggregation or clustering of rows and an overall view of the whole data set is not possible (requirement R2). Instead, FLOWSTRATES allows the user to aggregate sources and destinations on the maps with a lasso selection. Although this proves to be a flexible tool, well-known geographic aggregations are not straightforward to select and at most one aggregation at a time is allowed in each map.

8 CONCLUSION

We have presented a framework for the interactive

exploration of the ﬂow between pairs of geographic

locations. It allows researchers, engineers and man-

agers to quickly assess the nature and evolution of

ﬂows between pairs of geographical locations at vari-

ous levels of detail, while keeping an eye on the gen-

eral picture.

In the future we will extend the set of features

of our prototype, overcoming its current limitations.

First of all we will follow the suggestions that came

out of the qualitative study presented in Section 6.

Further, we will extend the representation of met-

rics, including the display of non-quantitative metrics,

the explicit rendering of the distribution of values for

each metric, and the possibility to ﬁlter speciﬁc value

ranges for a cleaner visualization. The user will have

the possibility to pick pairs of dates on the timeline,

in order to compare related metric values looking for

potential drops or improvements in performance.

REFERENCES

Andrienko, G., Andrienko, N., Dykes, J., Fabrikant, S. I.,

and Wachowicz, M. (2008). Geovisualization of dy-

namics, movement and change: Key issues and devel-

oping approaches in visualization research. Informa-

tion Visualization, 7(3):173–180.

Bostock, M., Ogievetsky, V., and Heer, J. (2011). D3:

Data-driven documents. IEEE Trans. Visualization &

Comp. Graphics (Proc. InfoVis).

Bottger, J., Schafer, A., Lohmann, G., Villringer, A., and Margulies, D. S. (2014). Three-dimensional mean-shift edge bundling for the visualization of functional connectivity in the brain. IEEE Transactions on Visualization and Computer Graphics, 20(3):471–480.

Boyandin, I., Bertini, E., Bak, P., and Lalanne, D. (2011).

Flowstrates: An approach for visual exploration of

temporal origin-destination data. In Proc. of the 13th

Eurographics / IEEE - VGTC Conference on Visu-

alization, EuroVis’11, pages 971–980. Eurographics

Association.

Buchin, K., Speckmann, B., and Verbeek, K. (2011). Flow

map layout via spiral trees. IEEE Transactions on

Visualization and Computer Graphics, 17(12):2536–

2544.

Di Battista, G., Eades, P., Tamassia, R., and Tollis, I. G.

(1999). Graph Drawing. Prentice Hall, Upper Saddle

River, NJ.

Di Donato, V., Patrignani, M., and Squarcella, C. (2016).

Netfork: Mapping time to space in network visualiza-

tion. In Buono, P., Lanzilotti, R., and Matera, M., ed-

itors, International Working Conference on Advanced

User Interfaces (AVI 2016), pages 92–99.

Elmqvist, N., Do, T.-N., Goodell, H., Henry, N., and Fekete,

J. (2008). Zame: Interactive large-scale graph visual-

ization. In Visualization Symposium, 2008. PaciﬁcVIS

’08. IEEE Paciﬁc, pages 215–222.

Flannery, J. J. (1971). The relative effectiveness of some

common graduated point symbols in the presentation

of quantitative data. The Canadian Cartographer,

8:96–109.

Guo, D. (2009). Flow mapping and multivariate visu-

alization of large spatial interaction data. IEEE

Transactions on Visualization and Computer Graph-

ics, 15(6):1041–1048.

Henry, N. and Fekete, J. (2006). MatrixExplorer: a dual-representation system to explore social networks. IEEE Transactions on Visualization and Computer Graphics, 12(5):677–684.

Henry, N., Fekete, J.-D., and McGufﬁn, M. J. (2007). Node-

Trix: A hybrid visualization of social networks. IEEE

Transactions on Visualization and Computer Graph-

ics, 13(6):1302–1309.

Hurter, C., Tissoires, B., and Conversy, S. (2009). From-

dady: Spreading aircraft trajectories across views to

support iterative queries. IEEE Transactions on Visu-

alization and Computer Graphics, 15(6):1017–1024.

Nocaj, A. and Brandes, U. (2013). Stub bundling and con-

ﬂuent spirals for geographic networks. In Wismath, S.

and Wolff, A., editors, Graph Drawing, volume 8242

of Lecture Notes in Computer Science, pages 388–

399. Springer International Publishing.

Rappaport, D. (1992). A convex hull algorithm for discs, and applications. Computational Geometry, 1(3):171–187.

Robertson, G., Fernandez, R., Fisher, D., Lee, B., and Stasko, J. (2008). Effectiveness of animation in trend visualization. IEEE Transactions on Visualization and Computer Graphics, 14(6):1325–1332.

Scheepens, R., Willems, N., van de Wetering, H., and van

Wijk, J. (2011). Interactive visualization of multivari-

ate trajectory data with density maps. In Paciﬁc Visu-

alization Symposium (PaciﬁcVis), 2011 IEEE, pages

147–154.

Scheepens, R., Willems, N., van de Wetering, H., and van

Wijk, J. (2012). Visualization of vessel trafﬁc. In Po-

seidon: Situational Awareness with Systems of Sys-

tems.

Schulz, H. (2011). Treevis.net: A tree visualization ref-

erence. Computer Graphics and Applications, IEEE,

31(6):11–15.

Willems, N., Van De Wetering, H., and Van Wijk, J. J.

(2009). Visualization of vessel movements. Computer

Graphics Forum, 28(3):959–966.

Wood, J., Dykes, J., and Slingsby, A. (2010). Visualisation of origins, destinations and flows with OD maps. The Cartographic Journal, 47(2):117–129.

Wood, J., Slingsby, A., and Dykes, J. (2011). Visualizing the dynamics of London’s bicycle-hire scheme. Cartographica, 46(4):239–251.
