Visual-Interactive Similarity Search for Complex Objects
by Example of Soccer Player Analysis
Jürgen Bernard¹, Christian Ritter¹, David Sessler¹, Matthias Zeppelzauer², Jörn Kohlhammer¹,³ and Dieter Fellner¹,³
¹Technische Universität Darmstadt, Darmstadt, Germany
²St. Pölten University of Applied Sciences, St. Pölten, Austria
³Fraunhofer Institute for Computer Graphics Research, IGD, Darmstadt, Germany
Keywords:
Information Visualization, Visual Analytics, Active Learning, Similarity Search, Similarity Learning,
Distance Measures, Feature Selection, Complex Data Objects, Soccer Player Analysis, Information Retrieval.
Abstract:
The definition of similarity is a key prerequisite when analyzing complex data types in data mining, informa-
tion retrieval, or machine learning. However, a meaningful definition is often hampered by the complexity of data objects and particularly by different notions of subjective similarity latent in targeted user groups. Tak-
ing the example of soccer players, we present a visual-interactive system that learns users’ mental models of
similarity. In a visual-interactive interface, users are able to label pairs of soccer players with respect to their
subjective notion of similarity. Our proposed similarity model automatically learns the respective concept of
similarity using an active learning strategy. A visual-interactive retrieval technique is provided to validate the
model and to execute downstream retrieval tasks for soccer player analysis. The applicability of the approach
is demonstrated in different evaluation strategies, including usage scenarios and cross-validation tests.
1 INTRODUCTION
The way similarity of data objects is defined and represented in an analytical system has a decisive influence on the results of the algorithmic workflow for downstream data analysis. From an algorithmic perspective, the notion of object similarity is often implemented with distance measures, which bear an inverse relation to similarity.
Many data mining approaches necessarily require
the definition of distance measures, e.g., for conduct-
ing clustering or dimension reduction. In the same
way, most information retrieval algorithms carry out
indexing and retrieval tasks based on distance mea-
sures. Finally, the performance of many supervised
and unsupervised machine learning methods depends
on meaningful definitions of object similarity. The
classical approach for the definition of object simi-
larity includes the identification, extraction, and se-
lection of relevant attributes (features), as well as the
definition of a distance measure and optionally a map-
ping from distance to similarity. Furthermore, many
real-world examples require additional steps in the al-
gorithmic pipeline such as data cleansing or normal-
ization. In practice, quality measures such as preci-
sion and recall are used to assess the quality of the
similarity models and the classifiers built upon them.
In this work, we strengthen the connection be-
tween the notion of similarity of individual users and
its adoption to the algorithmic definition of object
similarity. Taking the example of soccer players from
European soccer leagues, a manager may want to
identify previously unknown soccer players match-
ing a reference player, e.g., to occupy an important
position in the team lineup. This contrasts with a national coach who is also interested in selecting a good team. However, the national coach is independent of transfer fees and salaries, while his choice is limited to players of the respective nationality. The
example sheds light on various remaining problems
that many classical approaches are confronted with.
First, in many approaches designers do not know be-
forehand which definition of object similarity is most
meaningful. Second, many real-world approaches require multiple definitions of similarity to be usable for different users or user groups. Moreover, it is not even guaranteed that the notion of similarity of a single user remains constant. Third, the example
Figure 1: Overview of the visual-interactive tool. Left: users are enabled to label the similarity between two soccer players
(here: Radja Nainggolan and Marouane Fellaini, both from Belgium). The user’s notion of similarity is propagated to the
similarity learning model. Right: a visual search interface shows the model results (query: Dimitri Payet). The example resembles the similarity notion typical of a national trainer: only players from the same country have very high similarity scores. Consequently, the nearest neighbors for Dimitri Payet all come from France and have a similar field position.
of soccer players implicitly indicates that the definition of similarity becomes considerably more difficult for high-dimensional data. Finally, many real-world objects consist of mixed data, i.e., attributes of numerical, categorical, and binary type. However, most current approaches for similarity measurement are limited to numerical attributes.
We hypothesize that it is desirable to shift the def-
inition of object similarity from an offline preprocess-
ing routine to an integral part of future analysis sys-
tems. In this way the individual notions of similarity
of different users will be reflected more comprehen-
sively. The precondition for the effectiveness of such
an approach is a means that enables users to commu-
nicate their notion of similarity to the system. Logi-
cally, such a system requires the functionality to grasp
and adopt the notion of similarity communicated by
the user. Provided with such functionality, users are able to conduct various data analysis tasks relying on object similarity in a more dynamic and individual manner. This requirement shifts the definition of similarity towards active learning approaches. Active learning is a research field in the area of semi-supervised learning where machine learning models are trained with as little user feedback as possible while remaining as generalizable as possible. Beyond classical active learning, the research direction of this approach is towards visual-interactive learning, allowing users to give feedback for those objects they have precise knowledge about.
We present a visual-interactive learning system
that learns the similarity of complex data objects on
the basis of user feedback. The use case of soccer
players will serve as a relevant and intuitive example.
Overall this paper makes three primary contributions.
First, we present a visual-interactive interface that en-
ables users to select two soccer players and to submit
feedback regarding their subjective similarity. The set
of labeled pairs of players is depicted in a history vi-
sualization for lookup and reuse. Second, a machine
learning model accepts the pairwise notions of simi-
larity and learns a similarity model for the entire data
set. An active learning model identifies player ob-
jects where user feedback would be most beneficial
for the generalization of the learned model, and prop-
agates them to a visual-interactive interface. Third,
we present a visual-interactive retrieval interface en-
abling users to directly submit example soccer players
to query for nearest neighbors. The interface com-
bines both validation support as well as a downstream
application of model results. The results of differ-
ent types of evaluation techniques particularly assess
the efficiency of the approach. In many cases it takes
only five labeled pairs of players to learn a robust and
meaningful model.
The remainder of the paper is organized as follows. Section 2 reviews related work. We present our approach
in Section 3. The evaluation results are described in
Section 4, followed by a discussion in Section 5 and
the conclusion in Section 6.
2 RELATED WORK
The contributions of this work are based on two core
building blocks, i.e., visual-interactive interfaces (in-
formation visualization, visual analytics) and algo-
rithmic similarity modeling (metric learning). We
provide a subsection of related work for both fields.
2.1 Visual-Interactive Instance Labeling
We focus on visual-interactive interfaces allowing
users to submit feedback about the underlying data
collection. In the terminology of the related work, a
data element is often referred to as an instance, the
feedback for an instance is called a label. Different
types of labels can be gathered to create some sort
of learning model. Before we survey existing ap-
proaches dealing with similarity in detail, we outline
inspiring techniques supporting other types of labels.
Some techniques for learning similarity metrics
are based on rules. The approach of (Fogarty et al.,
2008) allows users to create rules for ranking im-
ages based on their visual characteristics. The rules
are then used to improve a distance metric for image
retrieval and categorization. Another class of inter-
faces facilitates techniques related to interestingness
or relevance feedback strategies, e.g., to improve re-
trieval performance (Salton and Buckley, 1997). One
popular application field is evaluation, e.g., to ask users which of a set of image candidates is best with respect to a pre-defined quality criterion (Weber
et al., 2016). In the visual analytics domain, relevance
feedback and interestingness-based labeling has been
applied to learn users’ notions of interest, e.g., to
improve the data analysis process. Behrisch et al.
(Behrisch et al., 2014) present a technique for speci-
fying features of interest in multiple views of multidi-
mensional data. With the user distinguishing relevant
from irrelevant views, the system deduces the pre-
ferred views for further exploration. Seebacher et al.
(Seebacher et al., 2016) apply a relevance feedback
technique in the domain of patent retrieval, supporting
user-based labeling of relevance scores. Similar to our
approach, the authors visualize the weight of different
modalities (attributes/features). The weights are sub-
ject to change with respect to the iterative nature of
the learning approach. In the visual-interactive image retrieval domain, the Pixolution Web interface¹ combines tag-based and example-based queries to adopt users’ notions of interestingness. Recently, the notion of interestingness was adopted in prostate cancer research: a visual-interactive user interface enables physicians to give feedback about the well-being status of patients (Bernard et al., 2015b). The underlying active-learning approach calculates the numerical learning function by means of a regression tree.

¹ Pixolution, http://demo.pixolution.de, last accessed on September 22, 2016
Classification tasks require categorical labels for
the available instances. Ware et al. (Ware et al.,
2001) present a visual interface enabling users to
build classifiers in a visual-interactive way. The ap-
proach works well for few and well-known attributes,
but requires labeled data sets for learning classifiers.
Seifert and Granitzer’s (Seifert and Granitzer, 2010)
approach outlines user-based selection and labeling
of instances as meaningful extension of classical ac-
tive learning strategies (Settles, 2009). The authors
point towards the potential of combining active learn-
ing strategies with information visualization which
we adopt for both the representation of instances and
learned model results. Höferlin et al. (Höferlin et al., 2012) define interactive learning as an exten-
sion, which includes the direct manipulation of the
classifier and the selection of instances. The applica-
tion focus is on building ad-hoc classifiers for visual
video analytics. Heimerl et al. (Heimerl et al., 2012)
propose an active learning approach for learning clas-
sifiers for text documents. Their approach includes
three user interfaces: basic active learning, visualiza-
tion of instances along the classifier boundary, and in-
teractive instance selection. Similar to our approach
the classification-based visual analytics tool by Janet-
zko et al. (Janetzko et al., 2014) also applies to the
soccer application domain. In contrast to our appli-
cation goal, the approach supports building classifiers
for interesting events in soccer games by taking user-
defined training data into consideration.
User-defined labels for relevance feedback, inter-
estingness, or class assignment share the idea to bind
a single label to an instance, reflecting the classical machine learning approach f(i) = y. However, functions for learning the concept of similarity require a label representing the relation of pairs or groups of instances, e.g., in our case f(i1, i2) = y, where y represents a similarity score. Visual-
interactive user interfaces supporting such learning
functions have to deal with this additional complex-
ity. A workaround strategy often applied for the vali-
dation of information retrieval results shows multiple
candidates and asks the user for the most similar in-
stances with respect to a given query. We neglect this
approach since our users do not necessarily have the
knowledge to give feedback for any query instance
suggested by the system. Rather, we follow a user-
centered strategy where users themselves have an in-
fluence on the selection of pairs of instances.
Another way to avoid complex learning functions
is allowing users to explicitly assign weights to the
attributes of the data set (Ware et al., 2001; Jeong
et al., 2009). The drawback of this strategy is the ne-
cessity of users knowing the attribute space in its en-
tirety. Especially when sophisticated descriptors are
applied for the extraction of features (e.g., Fourier co-
efficients) or deep learned features, explicit weighting
of individual features is inconceivable. Rather, our
approach applies an implicit attribute learning strat-
egy. While the similarity model indeed uses weighted
attributes for calculating distances between instances
(see Section 2.2), an algorithmic model derives at-
tribute weights based on the user feedback at object-
level. We conclude with a visual-interactive feedback
interface where users are enabled to align small sets of
instances on a two-dimensional arrangement (Bernard
et al., 2014). The relative pairwise distances between
the instances are then used by the similarity model.
We neglect strategies for arranging small sets of more
than two instances in 2D since we explicitly want to
include categorical and boolean attributes. It has been
shown that the interpretation of relative distances for
categorical data is non-trivial (Sessler et al., 2014).
2.2 Similarity Modeling
Aside from methods that employ visual interactive in-
terfaces for learning the similarity between objects
from user input as presented in the previous sec-
tion, methods for the autonomous learning of sim-
ilarity relations have been introduced (Kulis, 2012;
Bellet et al., 2013). Human similarity perception is
a psychologically complex process which is difficult
to formalize and model mathematically. It has been
shown previously that the human perception of simi-
larity does not follow the rules of mathematical met-
rics, such as identity, symmetry and transitivity (Tver-
sky, 1977). Nevertheless, today most approaches em-
ploy distance metrics to approximate similarity esti-
mates between two items (e.g., objects, images, etc.).
Common distance metrics are Euclidean distance (L2
distance) and Manhattan distance (L1 distance) (Yu
et al., 2008), as well as warping or edit distance met-
rics. The edit distance was, e.g., applied to the soccer
domain in a search system where users can sketch tra-
jectories of player movement (Shao et al., 2016).
To better take human perception into account and to better adapt the distance metric to the underlying data and task, an increasingly popular approach is to learn similarity or distance measures from data (metric learning). For this purpose, different strategies have been developed.
In linear metric learning the general idea is to
adapt a given distance function (e.g., a Mahalanobis-
like distance function) to the given task by esti-
mating its parameters from underlying training data. The learning is usually performed in a supervised or
weakly-supervised fashion by providing ground truth
in the form of (i) examples of similar and dissimi-
lar items (positive and negative examples), (ii) con-
tinuous similarity assessments for pairs of items (e.g.,
provided by a human), and (iii) triplets of items with relative constraints, such as "A is more similar to B than to C" (Bellet et al., 2013; Xing et al., 2003). During training, the goal is to find parameters of the selected metric that maximize the agreement between the dis-
tance estimates and the ground truth, i.e., by minimiz-
ing a loss function that measures the differences to the
ground truth. The learned metric can then be used to
better cluster the data or to improve the classification
performance in supervised learning.
Similar to these approaches, we also apply a lin-
ear model. Instead of learning the distance metric di-
rectly, we estimate the Pearson correlation between
the attributes and the provided similarity assessments.
In this way, the approach is applicable even to small
sets of labeled pairs of instances. The weights ex-
plicitly model the importance of each attribute and,
as a by-product, enable the selection of the most im-
portant features for downstream approaches. To exploit the full potential, we apply weighted distance
measures for internal similarity calculations, includ-
ing measures for categorical (Boriah et al., 2008) and
boolean (Cha et al., 2005) attributes.
In non-linear metric learning, one approach is to
learn similarity (kernels) directly without explicitly
selecting a distance metric. The advantage of kernel-
based approaches is that non-linear distance relation-
ships can be modeled more easily. For this purpose
the data is first transformed by a non-linear kernel
function. Subsequently, non-linear distance estimates
can be realized by applying linear distance measure-
ments in the transformed non-linear space (Abbas-
nejad et al., 2012; Torresani and Lee, 2006). Other
authors propose multiple kernel learning, which is a
parametric approach that tries to learn a combination
of predefined base kernels for a given task (Gönen and Alpaydın, 2011). Another group of non-linear ap-
proaches employs neural networks to learn a similar-
ity function (Norouzi et al., 2012; Salakhutdinov and
Hinton, 2007). This approach has gained increasing
importance due to the recent success of deep learn-
ing architectures (Chopra et al., 2005; Zagoruyko and
Komodakis, 2015; Bell and Bala, 2015). The major
drawback of these methods is that they require huge amounts of labeled instances for training, which are not available in our case.
The above methods have in common that the
Table 1: Overview of primary data attributes about soccer players retrieved from DBpedia with SPARQL.
Attribute | Description | Variable Type | Value Domain | Quality Issues
Name | Name of the soccer player, unique identifier | Nominal (String) | Alphabet of names | perfect
Description | Abstract of a player - for tooltips | Nominal (String) | Full text | good
Nationality | Nation of the player | Nominal (String) | 103 countries | perfect
National Team | National team, if applicable. Can be a youth team. | Nominal (String) | 155 nat. teams | sparse
Birthday | Day of birth (dd.mm.yyyy) | Date | | perfect
Birthplace | Place of birth | Nominal (String) | Alphabet of cities | good
Size | Size of the player in meters | Numerical | [1.59, 2.03] | good
Current Team | Team of the player (end of last season) | Nominal (String) | Alphabet of teams | perfect
Main Position | Main position on the field | Nominal (String) | 13 positions | perfect
Other Positions | Other positions on the field | Nominal (String) | 13 positions | sparse, list
League Games | No. of games played in the current soccer league | Numerical | [0, 591] | sparse
League Goals | No. of goals scored in the current soccer league | Numerical | [0, 289] | sparse
Nat. Games | No. of games played for the current nat. team | Numerical | [0, 150] | sparse
Nat. Goals | No. of goals scored for the current nat. team | Numerical | [0, 71] | sparse
learned metric is applied globally to all instances in
the dataset. An alternative approach is local metric
learning that learns specific similarity measures for
subsets of the data or even separate measures for each
data item (Frome et al., 2007; Weinberger and Saul,
2009; Noh et al., 2010). Such approaches have advan-
tages especially when the underlying data has hetero-
geneous characteristics. A related approach is per-exemplar classifiers, which even allow selecting different features (attributes) and distance measures for each item. Per-exemplar classification has been ap-
plied successfully for different tasks in computer vi-
sion (Malisiewicz et al., 2011). While our proposed
approach to similarity modeling operates in a global
manner, our active learning approach exploits local
characteristics of the feature space by analyzing the
density of labeled instances in different regions for
making suggestions to the user.
The approaches above mostly require large
amounts of data as well as ground truth in terms of
pairs or triplets of labeled instances. Furthermore,
they rely on numerical data (or at least non-mixed
data) as input. We propose an approach for met-
ric learning for unlabeled data (without any ground
truth) with mixed data types (categorical, binary, and
numerical), which is also applicable to small datasets
and data sets with initially no labeled instances. For
this purpose, we combine metric learning with active
learning (Yang et al., 2007) and embed it in an inter-
active visualization system for immediate feedback.
Our approach allows the generation of useful distance
metrics from a small number of user inputs.
3 APPROACH
An overview of the visual-interactive system is shown
in Figure 1. Figure 2 illustrates the interplay of the
technical components assembled to a workflow. In
Sections 3.2, 3.3, and 3.4, we describe the three core
components in detail, after we discuss data character-
istics and abstractions in Section 3.1.
3.1 Data Characterization
3.1.1 Data Source
Various web references provide data about soccer
players with information differing in its scope and
depth. For example some websites offer information
about market price values or sports betting statistics,
while other sources provide statistics about pass accu-
racy in every detail. Our primary requirement for the data is public availability, to guarantee the reproducibility of our experiments. In addition, the information
about players should be comprehensible for broad au-
diences and demonstrate the applicability. Finally,
the attributes should be of mixed types (numerical,
categorical, boolean). This is why Wikipedia², the
Table 2: Overview of secondary data attributes about soccer players deduced from the primary data.
Attribute | Description | Variable Type | Value Domain | Quality
Age | Age of the player (end of last season) | Numerical (int) | [16, 43] | perfect
National Player | Whether the player has played as a national player | Boolean | [false, true] | sparse
Nat. games p.a. | Average number of national games per year | Numerical | [0.0, 24.0] | sparse
Nat. goals per game | Average number of goals per national game | Numerical | < 3.0 | sparse
League games p.a. | Average number of league games per year | Numerical | [0.0, 73.5] | sparse
League goals p. game | Average number of goals per league game | Numerical | < 3.0 | sparse
Position Vertical | The aggregated positions of the player as y-coordinate (from "Keeper" to "Striker") | Numerical | [0.0, 1.0] | perfect
Position Horizontal | The aggregated positions of the player as x-coordinate (from left to right) | Numerical | [0.0, 1.0] | perfect
Main Position LR | The horizontal position of the player as String | Nominal | {left, center, right} | perfect
free encyclopedia, serves as our primary data source. The structured information about players presented at Wikipedia is retrieved from DBpedia³. We access DBpedia with SPARQL (Prud’hommeaux and Seaborne, 2008), the query language for RDF⁴ recommended by the W3C⁵. We focused on Europe’s five top leagues (the Premier League in England, Serie A in Italy, Ligue 1 in France, the Bundesliga in Germany, and LaLiga in Spain). Overall, we gathered 2,613 players engaged by the teams of the respective leagues.
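To illustrate the retrieval step, the following sketch shows how such a query could be issued from Python using the SPARQLWrapper package. The selected properties (dbo:height, dbo:position) are simplified examples and would need to be checked against the current DBpedia ontology; they do not cover all attributes listed in Table 1.

```python
# Sketch: querying DBpedia for soccer player attributes via SPARQL.
# Assumes the SPARQLWrapper package; property names are illustrative only.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
    PREFIX dbo:  <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?player ?name ?height ?position WHERE {
        ?player a dbo:SoccerPlayer ;
                rdfs:label ?name ;
                dbo:height ?height ;
                dbo:position ?position .
        FILTER (lang(?name) = "en")
    }
    LIMIT 100
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()

for row in results["results"]["bindings"]:
    print(row["name"]["value"], row["height"]["value"])
```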
3.1.2 Data Abstraction
Table 1 provides an overview of the available infor-
mation about soccer players. Important attributes for
the player (re-)identification are the unique name in
combination with the nationality and the current team.
Moreover, various numerical and categorical attributes are provided for similarity modeling.
Table 2 depicts the secondary data (i.e., attributes
deduced from primary data). Our strategy for the extraction of additional information is to obtain as many meaningful attributes as possible. One benefit of our approach will be a weighting of all involved attributes,
making the selection of relevant features for down-
stream analyses an easy task. This strategy is inspired
by user-centered design approaches in different appli-
cation domains where we asked domain experts about
the importance of attributes (features) for the similar-
ity definition process (Bernard et al., 2013; Bernard
et al., 2015a). One of the common responses was “ev-
erything can be important!” In the usage scenarios,
we demonstrate how the similarity model will weight the importance of primary and secondary attributes with respect to the learned pairs of labeled players.

² Wikipedia, https://en.wikipedia.org/wiki/Main_Page, last accessed on September 22, 2016
³ DBpedia, http://wiki.dbpedia.org/about, last accessed on September 22, 2016
⁴ RDF, https://www.w3.org/RDF, last accessed on September 22, 2016
⁵ W3C, https://www.w3.org, last accessed on September 22, 2016
3.1.3 Preprocessing
One of the data-centered challenges was the sparsity
of some data attributes. This phenomenon can of-
ten be observed when querying less popular instances
of concepts from DBpedia. To tackle this challenge
we removed attributes and instances from the data set
containing only little information. Remaining miss-
ing values were marked with missing value indica-
tors, with respect to the type of attribute. For illus-
tration purposes, we also removed players without an
image in Wikipedia. The final data set consists of
1,172 players. An important step in the preprocessing
pipeline is normalization. By default, we rescaled ev-
ery numerical attribute into the relative value domain
to foster metrical comparability.
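A minimal sketch of this rescaling step, assuming the player attributes are held in a pandas DataFrame and that the relative value domain corresponds to min-max scaling to [0, 1]; the column names are placeholders:

```python
# Sketch: min-max rescaling of numerical attributes to [0, 1].
# Missing values are left as NaN here; pandas skips them in min()/max().
import pandas as pd

def rescale_numerical(df: pd.DataFrame, numerical_cols: list) -> pd.DataFrame:
    out = df.copy()
    for col in numerical_cols:
        lo, hi = out[col].min(), out[col].max()
        if hi > lo:  # avoid division by zero for constant attributes
            out[col] = (out[col] - lo) / (hi - lo)
    return out

# players = rescale_numerical(players, ["Size", "League Games", "League Goals"])
```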
3.2 Visual-Interactive Learning
Interface
One of the primary views of the approach enables
users to give feedback about individual instances. Ex-
amples of the feedback interface can be seen in Figures 1 and 3. The interface for the definition of
similarity between soccer players shows two players
in combination with a slider control in between. The
slider allows the communication of similarity scores
between the two players. We decided on a quasi-continuous slider, in accordance with the continuous numerical function to be learned. However, a possible design alternative would be a feedback control with discrete levels of similarity.
Every player is represented with an image (when
available and permitted), a flag icon showing the
Figure 2: Workflow of the approach. Users assign similarity scores for pairs of players in the feedback interface. In the
backend, the feedback is interpreted (blue) and delegated to the similarity model. Active learning support suggests players to
improve the model (orange). A kNN-search supports the use case of the workflow shown in the result visualization (purple).
player’s nationality, as well as textual labels for the
player name and the current team. These four at-
tributes are also used for compact representations
of players in other views of the tool. The visual
metaphor of a soccer field represents the players’
main positions. In addition, a list-based view provides
the details about the players’ attribute values.
The feedback interface combines three additional
functionalities most relevant for the visual-interactive
learning approach.
First, users need to be able to define and select play-
ers of interest. This supports the idea to grasp detailed
feedback about instances matching the users’ expert
knowledge (Seifert and Granitzer, 2010; Höferlin et al., 2012). For this purpose, a textual query in-
terface is provided in combination with a combobox
showing players matching a user-defined query. In
this way, we combine query-by-sketch and the query-
by-example paradigm for the straightforward lookup
of known players.
The second ingredient for an effective active learn-
ing approach is the propagation of instances to the
user reducing the remaining model uncertainty. One
crucial design decision determined that users should
always be able to label players they actually know.
Thus, we created a solution for the candidate selec-
tion combining automated suggestions by the model
with the preference of users. The feedback interface
provides two sets of candidate players, one located at the left and the other at the right of the interface. Replacing the left feedback instance with one of the suggested players on the left will reduce the remaining entropy regarding the current instance on the right, and vice versa. However, we are aware that other strate-
gies for proposing unlabeled instances exist. Two of
the obvious alternative strategies for labeling players
would be a) providing a global pool of unlabeled play-
ers (e.g., in combination with drag-and-drop) or b) of-
fering pairs of instances with low confidence. While
these two strategies may be implemented in alterna-
tive designs, we recall the design decision that users
need to know the instances to be labeled. In this way,
we combine a classical active learning paradigm with
the user-defined selection of players matching their
expert knowledge.
Finally, the interface provides a history functionality
for labeled pairs of players at the bottom of the feed-
back view (cf. Figure 1). For every pair of players, images are shown and the assigned similarity score is depicted in the center.
3.3 Similarity Modeling
3.3.1 Similarity Learning
The visual-interactive learning interface provides
feedback about the similarity of pairs of instances.
Figure 3: Similarity model learned with stars in the European soccer scene. The history provides an overview of ten labeled
pairs of players. The ranking of weighted attributes assigns high correlations to the vertical position, player size, and national
games. Karim Benzema served as the query player: all retrieved players share quite similar attributes with Benzema.
Thus, the feedback propagated to the system follows the learning function f(i1, i2) = y, where y is a numerical value between 0 (dissimilar) and 1 (very similar). Similarity learning is designed as a
two-step approach. First, every attribute (feature) of
the data set is correlated with the user feedback. Sec-
ond, pairwise distances are calculated between all instances of the data set.
The correlation of attributes is estimated with
Pearson’s correlation coefficient. Pairwise distances between categorical attribute values are transformed into the numerical space with the Kronecker delta function. The correlation for a given attribute is then estimated between the similarity labels of the instance pairs provided by the user and the pairwise distances of those pairs in the attribute’s value domain. In the current state
of the approach every attribute is correlated indepen-
dently to reduce computation time and to maximize
interpretability of the resulting weights. The result of
this first step of the learning model is a weighting of
the attributes that is proportional to the correlation.
In a second step, the learning model calculates dis-
tances between any pair of instances. As the under-
lying data may consist of mixed attribute types, dif-
ferent distance measures are used for different types
of attributes. For numerical data we employ the
(weighted) Euclidean distance. For categorical at-
tributes we choose the Goodall distance (Boriah et al.,
2008) since it is sensitive to the probability of attribute
values and less frequent observations are assigned
higher scores. The weighted Jaccard distance (Cha
et al., 2005) is used for binary attributes. The Jac-
card distance neglects negative matches (both inputs
false), which might be advantageous for many simi-
larity concepts, i.e. the absence of an attribute in two
items does not add to their similarity (Sneath et al.,
1973). After all distance measures have been computed in separate distance matrices, the matrices are condensed into a single distance matrix by a weighted sum, where each weight represents the fraction of the total attribute weight that falls on the respective attribute type.
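The two-step procedure can be summarized in the following sketch. It makes simplifying assumptions: categorical and boolean attributes are compared with the Kronecker delta only (the actual model uses the Goodall and weighted Jaccard measures), numerical distances are absolute differences on the rescaled attributes, and the mapping from Pearson correlation to a non-negative weight is one plausible choice rather than the exact formula used in the tool.

```python
# Sketch of the two-step similarity learning (attribute weighting + weighted
# distance combination); see the assumptions stated above.
import numpy as np

def attribute_distance(values, kind):
    """Pairwise distance matrix for one attribute."""
    if kind == "numerical":
        v = values.astype(float)
        return np.abs(v[:, None] - v[None, :])
    # categorical / boolean: Kronecker delta (0 if equal, 1 otherwise);
    # the paper uses the Goodall and weighted Jaccard measures instead.
    return (values[:, None] != values[None, :]).astype(float)

def learn_weights(attributes, kinds, labeled_pairs):
    """labeled_pairs: list of (i, j, similarity) with similarity in [0, 1]."""
    sims = np.array([s for _, _, s in labeled_pairs])
    weights = []
    for values, kind in zip(attributes, kinds):
        d = attribute_distance(values, kind)
        pair_dists = np.array([d[i, j] for i, j, _ in labeled_pairs])
        if pair_dists.std() == 0 or sims.std() == 0:
            weights.append(0.0)
            continue
        corr = np.corrcoef(pair_dists, sims)[0, 1]
        weights.append(max(0.0, -corr))  # small distance <-> high similarity
    return np.array(weights)

def combined_distance(attributes, kinds, weights):
    """Weighted sum of per-attribute distance matrices, normalized by weight."""
    total = sum(w * attribute_distance(v, k)
                for v, k, w in zip(attributes, kinds, weights))
    return total / max(weights.sum(), 1e-9)
```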
3.3.2 Active Learning Strategy
We follow an interactive learning strategy that allows
for keeping the user in the loop. To support the it-
erative nature, we designed an active learning strat-
egy that fosters user input for instances for which
no or little information is available yet. As a start-
ing point for active learning the user selects a known
instance from the database. Note that this is impor-
tant as the user needs a certain amount of knowledge
about the instance to make similarity assessments in
the following (see Section 3.2). After an instance
has been selected, we identify the attribute with the
highest weight. Next, we estimate the farthest neigh-
bors to the selected item under the given attribute for
which no similarity assessments exist so far. A set
of respective candidates is then presented to the user.
This strategy is useful as it identifies pairs of items
for which the system cannot make assumptions so far.
The user can now select one or more proposed items
and add similarity assessments. By adding assessments, the coverage of the attribute space is successfully improved, especially in sparse areas where little information was available so far.

Figure 4: Nearest neighbor search for Lionel Messi. Even if superstars are difficult to replace, the set of provided nearest neighbors is quite reasonable. Keisuke Honda may be surprising; nevertheless, Honda has similar performance values in the national team of Japan.
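A minimal sketch of this candidate-suggestion step, reusing the attribute_distance helper from the sketch in Section 3.3.1; 'labeled' holds the indices of instances that already occur in labeled pairs, and k (the number of suggestions shown on each side of the interface) is a hypothetical parameter:

```python
# Sketch: suggest unlabeled players that are farthest from the selected player
# under the currently most important attribute.
import numpy as np

def suggest_candidates(selected, attributes, kinds, weights, labeled, k=5):
    top_attr = int(np.argmax(weights))            # attribute with highest weight
    d = attribute_distance(attributes[top_attr], kinds[top_attr])
    order = np.argsort(-d[selected])              # farthest neighbors first
    return [i for i in order if i != selected and i not in labeled][:k]
```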
3.3.3 Model Visualization
Visualizing the output of algorithmic models is cru-
cial, e.g., to execute downstream analysis tasks (see
Section 3.4). In addition, we visualize the current
state of the model itself. In this way, designers and experienced users can keep track of the model’s improvement, its quality, and its determinism. The core black-box information of this two-step
learning approach is the set of attribute weights repre-
senting the correlation between attributes and labeled
pairs of instances. In the middle of the tool, between the feedback interface and the model result visualization, we make the attribute weights explicit, as can be seen in the title figure (Figure 1). An enlarged version of the
model visualization is shown at the left of Figure 4
where the model is used to execute a kNN search for
Lionel Messi. From top to bottom the list of attributes
is ranked in the order of their weights. It is a reason-
able point of discussion whether the attribute weights
should be visualized to the final group of users of
such a system. A positive argument (especially in this
scientific context) is the transparency of the system
which raises trust and allows the visual validation.
However, a counter argument is biasing users with in-
formation about the attribute/feature space. Recalling
that especially in complex feature spaces users do not
necessarily know any attribute in detail, it may be a
valid design decision to exclude the model visualiza-
tion from the visual-interactive system.
3.4 Result Visualization –
Visual-Interactive NN Search
We provide a visual-interactive interface for the visu-
alization of the model output (see Figure 4). A pop-
ular use case regarding soccer players is the identifi-
cation of similar players for a reference player, e.g.,
when a player is replaced in a team due to an upcom-
ing transfer event. Thus, the interface of the result vi-
sualization will provide a means to query for similar
soccer players. We combine a query interface (query-
by-sketch, query-by-example) with a list-based visu-
alization of retrieved players. The retrieval algorithm
is based on a standard k-NN search (k nearest neigh-
bors) using the model output. For every list element of
the result set a reduced visual representation of a soc-
cer player is depicted, including the player’s image,
name, nationality, position, and team. Moreover, we
show information about three attributes contributing
to the current similarity model significantly. Finally,
we depict rank information as well as the distance to the query for every element. The result visualization rounds out the functionality. Users can train individ-
ual similarity models of soccer player similarity and
subsequently perform retrieval tasks. From a more
technical perspective, the result visualization closes
the feedback loop of the visual-interactive learning
approach. In this connection, users can analyze re-
trieved sets of players and give additional feedback
for weakly learned instances. An example can be seen
in Figure 4 showing a retrieved result set for Lionel
Messi used as an example query.
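The retrieval itself reduces to a ranking over the combined distance matrix from Section 3.3.1. A minimal sketch, where names is a parallel list of player names:

```python
# Sketch: k-nearest-neighbor retrieval over the combined distance matrix.
import numpy as np

def k_nearest_neighbors(query_index, distance_matrix, names, k=6):
    order = np.argsort(distance_matrix[query_index])
    hits = [i for i in order if i != query_index][:k]
    return [(rank + 1, names[i], float(distance_matrix[query_index, i]))
            for rank, i in enumerate(hits)]
```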
4 EVALUATION
Providing scientific proof for this research effort is non-trivial since comparable approaches are scarce.
lenge of dealing with data sets which are completely
unlabeled at start, making classical quantitative eval-
uations with ground truth test data impossible.
In the following, we demonstrate and validate the ap-
plicability of the approach with different strategies. In a first proof-of-concept scenario, a similarity model is trained for an explicitly known mental model, answering the question of whether the similarity model is able to capture a human’s notion of similarity. Sec-
ond, we assess the effectiveness of the approach in
two usage scenarios. We demonstrate how the tool
Figure 5: Experiment with a mental model based on teams and player age. It can be seen that with ten labeled pairs of players the system is able to grasp this mental model.
can be used to learn different similarity models, e.g.,
to replace a player in the team by a set of relevant
candidates. Finally, we report on experiments for the
quantification of model efficiency.
4.1 Proof of Concept - Fixed Mental
Model
The first experiment assesses whether the similarity
model of the system is able to grasp the mental sim-
ilarity model of a user. As an additional constraint,
we limit the number of labeled pairs to ten, repre-
senting the requirement of very fast model learning.
As a proof of concept, we predefine a mental model
and express it with ten labels. In particular, we sim-
ulate a fictitious user who is only interested in the
age of players, as well as their current team. In other
words, a numerical and a categorical attribute defines
the mental similarity model of the experiment. If
two players are likely identical with respect to these
two attributes (age +-1) the user assigns the similarity
score 1.0. If only one of the two attributes match, the
user feedback is 0.5 and if both attributes disagree a
pair of players is labeled with the similarity score 0.0.
The ten pairs of players used for the experiment are
shown in Figure 5. In addition to the labeled pairs,
the final attribute weights calculated by the system
are depicted. Three insights can be identified. First,
it becomes apparent that the two attributes with the
highest weights exactly match the pre-defined men-
tal model. Second, the number of national games, the
number of national games per year, and the number
of league games also received weights. Third, the set
of remaining attributes received zero weights. While
the first insight validates the experiment, the second
insight sheds light on attributes correlated with the
mental model. As an example, we hypothesize that
the age of players is correlated with the number of
games. This is a beneficial starting point for down-
stream feature selection tasks, e.g., when the model is
to be implemented as a static similarity function. Fi-
nally, the absence of weights for most other attributes
demonstrates that only few labels are needed to obtain
a precise focus on relevant attributes.
4.2 Usage Scenario 1 - Top Leagues
The following usage scenario demonstrates the effec-
tiveness of the approach. A user with much experi-
ence in Europe’s top leagues (Premier League, Serie A, Ligue 1, Bundesliga, LaLiga) rates ten pairs
of prominent players with similarity values from very
high to very low. The state of the system after ten la-
beling iterations can be seen in Figure 3. The history
view shows that high similarity scores are assigned to
the pairs: Mats Hummels vs. David Luiz, Luca Toni
vs. Claudio Pizarro, Dani Alves vs. Juan Bernat, Toni
Kroos vs. Xabi Alonso, Eden Hazard vs. David Silva,
and Jamie Vardy vs. Marco Reus. Comparatively low similarity values are assigned to the pairs Philipp Lahm vs. Ron-Robert Zieler, Sandro Wagner vs. Bastian Oczipka, Manuel Neuer vs. Marcel Heller, and
Robert Huth vs. Robert Lewandowski. The set of player instances covers a vast spectrum of nationalities, ages, positions, as well as numbers of games
and goals. The resulting learning model assigns high weights to the vertical position on the field, the player size, and the national games per year. Further, being a na-
tional player and the goals for the respective national
teams contribute to the global notion of player simi-
larity. In the result visualization, the user chose Karim
Benzema for the nearest neighbor search. The result
set (Edinson Cavani, Robert Lewandowski, Salomon Kalou, Claudio Pizarro, Klaas-Jan Huntelaar, Pierre-Emerick Aubameyang) represents the learned model
quite well. All these players are strikers, of a similar age, and very successful in their national teams. Conversely, not a single player listed in the result deviates from the described notion of similarity. In summary, this usage scenario demonstrates that
the tool was able to reflect the notion of similarity of
the user with a very low number of training instances.
4.3 Usage Scenario 2 - National Trainer
In this usage scenario, we take the role of a national trainer. Our goal is to build a similarity model which primarily reflects the quality of soccer players, but additionally takes the nationality of players into account. As a result, similar players coming from the same country are classified as similar. The reason is simple: players from foreign countries cannot be positioned in a national team. Figure 1 shows the his-
tory of the ten labeled pairs of players. We assigned high similarity scores to two midfielders from Belgium, two
strikers from France, two defenders from Germany,
two goalkeepers from Germany, and two strikers from
Belgium. In addition, four pairs of players with different nationalities were assigned considerably lower similarity scores, even though the players play at very similar positions. The result visualization shows
the search result for Dimitri Payet who was used as a
query player. In this usage scenario, the result can be
used to investigate alternative players for the French
national team. With only ten labels, the algorithm re-
trieves exclusively players from France, all having ex-
pertise in the national team, and all sharing Dimitri Payet’s position (offensive midfielder).
4.4 Quantification of Efficiency
In the final evaluation strategy, we conduct an exper-
iment to yield quantitative results for the efficiency.
We assess the ‘speed’ of convergence of the attribute weighting for a given mental model, i.e., how many learning iterations the model needs to achieve stable attribute weights. The independent variable of the experiment is the number of learning iterations, i.e., the number of instances already learned by
the similarity model. The dependent variable is the
change of the attribute weights of the similarity model
between two learning iterations, assessed by the quan-
titative variable w. To avoid other degrees of free-
dom, we fix the mental model used in the experiment.
For this purpose, a small group of colleagues all hav-
ing an interest in soccer defined labels of similarity
for 50 pairs of players.
To guarantee robustness and generalizability, we
run the experiment 100 times. Inspired by cross-
validation, the set of training instances is permuted in
every run. The result is depicted in Figure 6. Ob-
viously, the most substantial difference of attribute
weights is in the beginning of the learning process
between the 1st and the 2nd learning iteration (w =
0.36). In the following, the differences significantly
decrease before reaching a saturation point approxi-
mately after the 5th iteration. For the 6th and later learning iterations, w is already below 0.1, and it drops below 0.03 after the 30th iteration. To summarize, the approach
only requires very few labeled instances to produce a
robust learning model. This is particularly beneficial
when users have very limited time, e.g., important ex-
perts in the respective application field.
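Since the exact definition of w is not spelled out above, the following sketch assumes w is the mean absolute change of the attribute-weight vector between two consecutive learning iterations:

```python
# Sketch: convergence measure between consecutive attribute-weight vectors.
import numpy as np

def weight_change(weights_prev, weights_curr):
    return float(np.mean(np.abs(np.asarray(weights_curr) -
                                np.asarray(weights_prev))))
```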
5 DISCUSSION
In the evaluation section, we demonstrated the appli-
cability of the approach from different perspectives.
However, we want to shed light on aspects that allow alternative design decisions or may be beneficial subjects for further investigation.
Similarity vs. Distance. Distance measures are
usually applied to approximate similarity relation-
ships. This is also the case in our work. We are, how-
ever, aware that metric distances can in general not be
mapped directly to similarities, especially when the
dimension of the data becomes high and the points
in the feature space move far away from each other.
Finding suitable mappings between distance and sim-
ilarity is a challenging topic that we will focus on in
future research.
Active Learning Strategy. The active learning sup-
port of this approach builds on the importance
(weights) of attributes to suggest new learning in-
stances to be queried. Thus, we focus on a scalable
solution that takes the current state of the model into
account and binds suggestions to previously-labeled
instances. Alternative strategies may involve other in-
trinsic aspects of the data (attributes or instances) or
the model itself. For example statistical data analy-
sis, distributions of value domains, or correlation tests
could be considered. Other active learning strate-
gies may be inspired by classification approaches, i.e.,
models learning categorical label information. Con-
crete classes of strategies involve uncertainty sam-
pling or query by committee (Settles, 2009).
Numerical vs. Categorical. This research effort
explicitly addressed a complex data object with mixed
data, i.e., objects characterized by numerical, categor-
ical, and boolean attributes. This class of objects is
widespread in the real-world, and we argue that it is
worth to address this additional analytical complexity.
However, coping with mixed data can benefit from a
more in-depth investigation at different steps of the
algorithmic pipeline.
Usability. We presented a technique that actually works but has not been thoroughly evaluated with users. Will users be able to interact with the system? We did cognitive walkthroughs and created the
Figure 6: Quantification of efficiency. The experiment shows differences in the attribute weighting between consecutive
learning iterations. A saturation point can be identified, approximately after the 5th labeled pair of instances.
designs in a highly interactive manner. Still, the question arises whether domain experts will appreciate the tool and be able to work with it in an intuitive way.
6 CONCLUSION
We presented a tool for the visual-interactive similar-
ity search for complex data objects by example of soc-
cer players. The approach combines principles from
active learning, information visualization, visual ana-
lytics, and interactive information retrieval. An algo-
rithmic workflow accepts labels for instances and cre-
ates a model reflecting the similarity expressed by the
user. Complex objects including numerical, categori-
cal, and boolean attribute types can be included in the
algorithmic workflow. Visual-interactive interfaces
ease the labeling process for users, depict the model
state, and represent output of the similarity model.
The latter is implemented by means of an interactive
information retrieval technique. While the strategy to
combine active learning with visual-interactive inter-
faces enabling users to label instances of interest is
special, the application by example of soccer players
is, to the best of our knowledge, unique. Domain ex-
perts are enabled to express expert knowledge about
similar players, and utilize learned models to retrieve
similar soccer players. We demonstrated that only
very few labels are needed to train meaningful and
robust similarity models, even if the data set was un-
labeled at start.
Future work will include additional attributes
about soccer players, e.g., market values or variables
assessing the individual player performance. In ad-
dition, it would be interesting to widen the scope and
the strategy to other domains, e.g., in design study ap-
proaches. Finally, the performance of individual parts
of the algorithmic workflow may be tested against de-
sign alternatives in future experiments.
REFERENCES
Abbasnejad, M. E., Ramachandram, D., and Mandava, R.
(2012). A survey of the state of the art in learn-
ing the kernels. Knowledge and Information Systems,
31(2):193–221.
Behrisch, M., Korkmaz, F., Shao, L., and Schreck, T.
(2014). Feedback-driven interactive exploration of
large multidimensional data supported by visual clas-
sifier. In IEEE Visual Analytics Science and Technol-
ogy (VAST), pages 43–52.
Bell, S. and Bala, K. (2015). Learning visual similarity for
product design with convolutional neural networks.
ACM Transactions on Graphics (TOG), 34(4):98.
Bellet, A., Habrard, A., and Sebban, M. (2013). A survey
on metric learning for feature vectors and structured
data. arXiv preprint arXiv:1306.6709.
Bernard, J., Daberkow, D., Fellner, D. W., Fischer, K., Koe-
pler, O., Kohlhammer, J., Runnwerth, M., Ruppert, T.,
Schreck, T., and Sens, I. (2015a). Visinfo: A digital
library system for time series research data based on
exploratory search – a user-centered design approach.
Int. Journal on Digital Libraries, 16(1):37–59.
Bernard, J., Sessler, D., Bannach, A., May, T., and
Kohlhammer, J. (2015b). A visual active learning
system for the assessment of patient well-being in
prostate cancer research. In IEEE VIS WS on Visual
Analytics in Healthcare (VAHC), pages 1–8. ACM.
Bernard, J., Sessler, D., Ruppert, T., Davey, J., Kuijper,
A., and Kohlhammer, J. (2014). User-based visual-
interactive similarity definition for mixed data objects-
concept and first implementation. In Proceedings of
WSCG, volume 22. Eurographics Association.
Bernard, J., Wilhelm, N., Krüger, B., May, T., Schreck,
T., and Kohlhammer, J. (2013). Motionexplorer:
Exploratory search in human motion capture data
based on hierarchical aggregation. IEEE Transactions
on Visualization and Computer Graphics (TVCG),
19(12):2257–2266.
Boriah, S., Chandola, V., and Kumar, V. (2008). Simi-
larity measures for categorical data: A comparative
evaluation. International Conference on Data Mining
(SIAM), 30(2):3.
Cha, S.-H., Yoon, S., and Tappert, C. C. (2005). Enhancing
binary feature vector similarity measures.
Chopra, S., Hadsell, R., and LeCun, Y. (2005). Learning
a similarity metric discriminatively, with application
to face verification. In Computer Vision and Pattern
Recognition (CVPR), pages 539–546. IEEE.
Fogarty, J., Tan, D., Kapoor, A., and Winder, S. (2008).
Cueflik: Interactive concept learning in image search.
In SIGCHI Conference on Human Factors in Comput-
ing Systems (CHI), pages 29–38. ACM.
Frome, A., Singer, Y., Sha, F., and Malik, J. (2007). Learn-
ing globally-consistent local distance functions for
shape-based image retrieval and classification. In
Conference on Computer Vision, pages 1–8. IEEE.
Gönen, M. and Alpaydın, E. (2011). Multiple kernel learn-
ing algorithms. Journal of Machine Learning Re-
search, 12(Jul):2211–2268.
Heimerl, F., Koch, S., Bosch, H., and Ertl, T. (2012). Visual
classifier training for text document retrieval. IEEE
Transactions on Visualization and Computer Graph-
ics (TVCG), 18(12):2839–2848.
Höferlin, B., Netzel, R., Höferlin, M., Weiskopf, D., and
Heidemann, G. (2012). Inter-active learning of ad-hoc
classifiers for video visual analytics. In IEEE Visual
Analytics Sc. and Technology (VAST), pages 23–32.
Janetzko, H., Sacha, D., Stein, M., Schreck, T., Keim, D. A.,
and Deussen, O. (2014). Feature-driven visual analyt-
ics of soccer data. In IEEE Visual Analytics Science
and Technology (VAST), pages 13–22.
Jeong, D. H., Ziemkiewicz, C., Fisher, B., Ribarsky, W.,
and Chang, R. (2009). iPCA: An Interactive System
for PCA-based Visual Analytics. In Computer Graph-
ics Forum (CGF), volume 28, pages 767–774. Euro-
graphics.
Kulis, B. (2012). Metric learning: A survey. Foundations
and Trends in Machine Learning, 5(4):287–364.
Malisiewicz, T., Gupta, A., and Efros, A. A. (2011). Ensem-
ble of exemplar-svms for object detection and beyond.
In Conf. on Computer Vision, pages 89–96. IEEE.
Noh, Y.-K., Zhang, B.-T., and Lee, D. D. (2010). Genera-
tive local metric learning for nearest neighbor classifi-
cation. In Advances in Neural Information Processing
Systems, pages 1822–1830.
Norouzi, M., Fleet, D. J., and Salakhutdinov, R. R. (2012).
Hamming distance metric learning. In Adv. in neural
information processing systems, pages 1061–1069.
Prud’hommeaux, E. and Seaborne, A. (2008). SPARQL
Query Language for RDF. W3C Recommendation.
Salakhutdinov, R. and Hinton, G. E. (2007). Learning a
nonlinear embedding by preserving class neighbour-
hood structure. In AISTATS, pages 412–419.
Salton, G. and Buckley, C. (1997). Improving retrieval per-
formance by relevance feedback. Readings in Infor-
mation Retrieval, 24:5.
Seebacher, D., Stein, M., Janetzko, H., and Keim, D. A.
(2016). Patent Retrieval: A Multi-Modal Visual Ana-
lytics Approach. In EuroVis Workshop on Visual Ana-
lytics (EuroVA), pages 013–017. Eurographics.
Seifert, C. and Granitzer, M. (2010). User-based active
learning. In IEEE International Conference on Data
Mining Workshops (ICDMW), pages 418–425.
Sessler, D., Bernard, J., Kuijper, A., and Kohlhammer, J.
(2014). Adopting mental similarity notions of categor-
ical data objects to algorithmic similarity functions.
Vision, Modelling and Visualization (VMV), Poster.
Settles, B. (2009). Active learning literature survey. Com-
puter Sciences Technical Report 1648, University of
Wisconsin–Madison.
Shao, L., Sacha, D., Neldner, B., Stein, M., and Schreck, T.
(2016). Visual-Interactive Search for Soccer Trajecto-
ries to Identify Interesting Game Situations. In IS&T
Electronic Imaging Conference on Visualization and
Data Analysis (VDA). SPIE.
Sneath, P. H., Sokal, R. R., et al. (1973). Numerical taxon-
omy. The principles and practice of numerical classi-
fication.
Torresani, L. and Lee, K.-c. (2006). Large margin com-
ponent analysis. In Advances in neural information
processing systems, pages 1385–1392.
Tversky, A. (1977). Features of similarity. Psychological
review, 84(4):327.
Ware, M., Frank, E., Holmes, G., Hall, M., and Witten, I. H.
(2001). Interactive machine learning: letting users
build classifiers. Human-Computer Studies, pages
281–292.
Weber, N., Waechter, M., Amend, S. C., Guthe, S., and
Goesele, M. (2016). Rapid, Detail-Preserving Image
Downscaling. In Proc. of ACM SIGGRAPH Asia.
Weinberger, K. Q. and Saul, L. K. (2009). Distance metric
learning for large margin nearest neighbor classifica-
tion. Machine Learning Research, 10:207–244.
Xing, E. P., Ng, A. Y., Jordan, M. I., and Russell, S. (2003).
Distance metric learning with application to clustering
with side-information. Advances in neural informa-
tion processing systems, 15:505–512.
Yang, L., Jin, R., and Sukthankar, R. (2007). Bayesian ac-
tive distance metric learning. In Conference on Un-
certainty in Artificial Intelligence (UAI).
Yu, J., Amores, J., Sebe, N., Radeva, P., and Tian, Q.
(2008). Distance learning for similarity estimation.
IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI), 30(3):451–462.
Zagoruyko, S. and Komodakis, N. (2015). Learning to com-
pare image patches via convolutional neural networks.
In Comp. Vis. and Pattern Recognition (CVPR). IEEE.