A CHARACTERIZATION METHODOLOGY OF EVOLUTIONARY

BEHAVIOR IN RECOMMENDER SYSTEMS

Alan Cardoso, Daniel Rocha, Rafael Sachetto, Leonardo Rocha

Computer Science Department, Federal University of São João Del Rei (UFSJ)

Pc. Dr. Augusto Chagas Viegas 17 - DCOMP - 36300-088, São João Del Rei, Brazil

Fernando Mourão, Wagner Meira Jr.

Computer Science Department, Federal University of Minas Gerais (UFMG)

Av. Antônio Carlos 6627 - ICEx - 31270-010, Belo Horizonte, Brazil

Keywords:

Recommender systems, Evolutive, Characterization.

Abstract:

Recommender Systems (RSs) have become increasingly important tools for various commercial applications

on the Web. Despite numerous efforts, RSs still require improvements to make recommendation more effective

and applicable to many real scenarios. Recent studies point to temporal evolution as a key means of improving RSs without, however, explaining in detail how this evolution emerges. Thus, we propose a methodology for the evolutionary characterization of users and applications in order to provide a better understanding of this temporal dynamic in RSs. Applying our methodology to a real scenario proved useful, even helping to choose the RSs best suited to each scenario.

1 INTRODUCTION

The large volume of data available on the Web has in recent years created a challenging scenario for various applications. Users have more options than they can effectively handle (Adomavicius and Tuzhilin, 2005). Several commercial applications, such as Amazon and Last.Fm, among others, provide collections of millions of distinct products. Although the availability of a wide range of options was a desired scenario in the past, nowadays it represents a major challenge. In fact, we can state that this

large amount of options is “choking” the users, mak-

ing the simple choice of products of user interest a

difﬁcult task. In this context, the Recommender Sys-

tems (RSs), which allow ﬁltering this amount of in-

formation, showing only what can be useful to user

interest, are becoming increasingly important.

Several strategies to recommend products, infor-

mation and services to users have been proposed

recently (Adomavicius and Tuzhilin, 2001a; Burke,

2002; Abbasse and Mirrokni, 2007). The main idea

of the RSs is to estimate potentially interesting items

to users, based on a prior knowledge of their behavior,

as well as relevant characteristics of these items. Al-

though the idea is simple, its implementation presents

many computational challenges, ranging from how to model the users' behavior to how to use this model to provide the recommendation itself. For instance, user behavior can be represented

by any subset of items he has consumed, or even by

items not yet consumed but that may be relevant to

the system or to the user, given a metric of interest.

While there are numerous proposals, current RSs still

need improvements to address these challenges and

make the recommendation more effective and appli-

cable to a wider range of scenarios, such as trip ad-

vice, ﬁnancial services, among others (Adomavicius

and Tuzhilin, 2001b).

Such challenges are being addressed by incorpo-

rating some dimensions of analysis into RSs. Cur-

rently, particular attention is being given to the tem-

poral dimension (Koren, 2009), due to the dynamic

aspect of user behavior. User taste is not a static characteristic; it changes over time. The same user may have distinct opinions about the same object at different moments. Several studies agree on the impact of taste shifts on recommendation (Adomavicius et al., 2005; Adomavicius and Tuzhilin, 2001b; Koren, 2009). As a consequence, the user model should be continuously updated to reflect such natural changes. In this sense, a major activity is to understand and measure the variability associated with user behaviors and with the applications over time, as well as the interaction between them. Despite the relevance of this understanding, we did not find any studies that analyze how this temporal evolution, which we call evolutionary behavior, emerges in RSs.

Cardoso A., Rocha D., Sachetto R., Rocha L., Mourão F. and Meira Jr. W.
A CHARACTERIZATION METHODOLOGY OF EVOLUTIONARY BEHAVIOR IN RECOMMENDER SYSTEMS.
DOI: 10.5220/0003479306960706
In Proceedings of the 7th International Conference on Web Information Systems and Technologies (WTM-2011), pages 696-706
ISBN: 978-989-8425-51-5
© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

In this work we present a methodology for evo-

lutionary characterization of users and applications,

which is divided into three main steps that represent

a hierarchical view of the RS domains. The objective is to measure an open-ended set of characteristics that vary over time and that would affect the quality of the RSs. Such information will support the proposal of new RS techniques, as well as appropriate changes to traditional ones. For instance, through our methodology we can assess practical issues about RSs that are usually disregarded in the literature, such as: How often do users tend to consume the same item? How diverse is user consumption in a given period of time? What is the time interval between consecutive accesses of users to the system?

In order to validate our methodology, we chose Last.Fm, one of the largest online music communities in the world. The results showed that Last.Fm is composed mainly of activity by new users, who present a decreasing consumption trend. Further, user behavior is concentrated in a few distinct items, exhibiting a high repetition rate in consumption, and is very dynamic, with the set of favorite songs varying quickly. Such observations helped to identify the most appropriate recommendation techniques for Last.Fm and to properly redefine some traditional RS assumptions. In summary, the main contributions of this work are the proposal of a new methodology of evolutionary characterization and a deeper and more useful understanding of an actual recommendation domain.

The remainder of this paper is organized as fol-

lows. Section 2 discusses the main related work. Sec-

tion 3 presents our methodology of characterization.

Then, in Section 4, we apply the proposed methodology to data derived from Last.Fm. Finally, in

Section 5, we conclude and discuss future work.

2 RELATED WORK

Recommender Systems (RSs) currently play an important role in e-commerce systems, assisting users in finding their favorite items and services. Accordingly, several studies propose new strategies to recommend products, information and services to users in various domains (Burke, 2002). However, several challenges make the effectiveness and applicability of current techniques inadequate for many scenarios (Adomavicius and Tuzhilin, 2005). Some of these challenges have been studied extensively, and metrics for identifying and measuring them in real domains are recurrently investigated.

A ﬁrst challenge consists of modeling the user be-

havior. Since each user can be modeled through a

distinct subset of objects (e.g., only for objects con-

sumed by him, or by objects considered relevant to the

domain), identifying the best model represents a complex task. Nevertheless, most studies of user modeling in RSs take a simplistic approach, without taking into account some relevant characteristics of user behavior, such as the relevance of the items to each user.

For instance, metrics that quantify the consumption

diversity of each user may provide useful information

about the appropriate size of the object sets that model

the users. A second challenge refers to data sparsity,

established by the very nature of commercial appli-

cations. As the number of distinct objects in these

domains is generally huge, users are able to consume

only a small portion of the existing items. Moreover,

there is a high concentration of users around a few dis-

tinct objects followed by a downward concentration

around other objects, a phenomenon known as heavy

tail (Anderson, 2006), accentuating the data sparsity.

In this context, measuring the emergence of new users and items over time in a recommendation domain makes it possible to identify the actual impact of sparsity on RSs.

Some studies even suggest speciﬁc techniques to ad-

dress this problem in RSs (Wu and Li, 2008).

Another common challenge in RSs consists of

providing diversiﬁed recommendations (McSherry,

2002; Lathia et al., 2010). Although the domains where RSs operate have a wide variety of items, the recommendations are generally only somewhat diversified. In (Zhang and Hurley, 2008), for example, the authors model the diversity of the recommendation as an optimization problem. We can also point out the so-called Cold Start problem as a challenge for RSs.

The Cold Start refers to the difﬁculty in making rec-

ommendations on new items or for new users, since

there is little information in the system about such

items and users (Schein et al., 2002). In fact, one ma-

jor challenge is to provide precise recommendations

when little is known about the users (Adomavicius

and Tuzhilin, 2005).

More recently, a new challenge has been analyzed

in RSs: the temporal evolution of the data (Koren,

2009; Cremonesi and Turrin, 2010). Traditionally,


RSs are based on the premise that users' past behavior repeats in the future. However, this assumption is not always true, since data may change over time. For instance, new objects appear and opinions about the same objects vary over time. Thus, the analysis of these data needs to find a balance between discounting time effects that have low impact on future behavior and capturing trends that reflect inherent recurring patterns in the data.

Efforts on temporal evolution in RSs can be clas-

siﬁed into two groups. The ﬁrst one includes studies

which focus on assessing the quality of the recom-

mendations over time. In (Lathia et al., 2009), the im-

pact of temporal dynamics on the recommendations

is evaluated. In (Zhang and Hurley, 2008), the au-

thors assess how the diversity of the recommendation

is affected over time. In the second group we have

works that propose new recommendation models that

take into account the temporal evolution (Cremonesi

and Turrin, 2010). In (Koren, 2009), the authors ar-

gue that proposing recommendation models that take

into consideration the time tends to be more effective

than proposing complex models. Thus, variations on

the proﬁles of the users over time have been incorpo-

rated to RSs (Stern et al., 2009).

Our work differs from others by analyzing how

the temporal dynamics emerges in recommendation

scenarios, as well as by evaluating how some met-

rics related to the aforementioned challenges behave

over time. Despite various efforts, we did not ﬁnd

any studies that aim to characterize and to understand

the temporal evolution in RSs. We believe that this

understanding is relevant not only to propose tech-

niques that address the temporal dynamics properly,

but also to provide a better understanding of the other

recommendation challenges. Quantifying each of the mentioned challenges, and how they evolve over time, allows us to identify which should be prioritized and, consequently, which techniques are best suited to each domain.

3 METHODOLOGY

In this section we present our methodology for characterizing evolutionary behavior in recommendation environments. In order to characterize different dimensions of each domain, we divide this methodology into three main steps, namely System Context Analysis, Interaction Analysis and Users Profile Analysis. Each step has an open-ended set of metrics that capture relevant aspects of the domain that vary over time and may affect the quality of the recommendation. The choice of these metrics is based on their correlation with the main challenges currently studied in RSs, as pointed out by the previous studies described in Section 2. Further, new metrics can also be incorporated into our methodology as other relevant aspects are identified.

It is noteworthy that, although these steps are in-

dependent and can be applied separately, they repre-

sent a hierarchical view of the recommendation en-

vironments. The objective of the System Context

Analysis is to evaluate how the supply of items is de-

ﬁned over time as well as to identify the business rules

established by the recommendation domain. Next, in the Interaction Analysis, we evaluate how users interact with the system during their lifetime. Finally, in the Users Profile Analysis, we characterize

how the users behave, regarding the consumption of

items available in the domain, and how this behavior

changes over time. In the subsequent sections, we de-

scribe in detail the goals and major metrics related to

each of the above steps of our methodology.

3.1 System Context Analysis

This ﬁrst step of our methodology aims to understand

the evaluated environment. Identifying inherent char-

acteristics of the objects and the interaction pattern

between objects and users, deﬁned by the environ-

ment, represents the main direction of this analysis.

For example, the “consumption” of songs may differ essentially from the “consumption” of videos, since users are assumed to listen to the same songs again and again more often than they rewatch the same videos. Another relevant aspect is the distribution of item popularity in each domain. While in some areas, such as songs and videos, popular items are consumed orders of magnitude more than unpopular items, in others, such as restaurant recommendation, this difference is not as prominent. Therefore, we consider that these features define a set of general “parameters” for the recommendation in each domain, which inform recommender systems of aspects that enable them to adapt to each domain. For instance, knowing that a user often consumes the same item repeatedly may suggest to the recommender that it can make the same recommendation to the same user more than once.

To carry out this analysis, Table 1 lists the metrics that we consider relevant. These metrics were chosen through a systematic collection of information that can be directly used by recommender systems. It is important to mention that, as in the other steps, we do not intend to list a closed set of all existing and measurable “parameters”.

WEBIST 2011 - 7th International Conference on Web Information Systems and Technologies


Table 1: Metrics for System Context Analysis (CA).

| Metric | Description |
|---|---|
| Distribution of lifetime in the system (CA-1) | For each user, we determine his lifetime, from his registration moment until the analysis moment. In this way, we are able to verify whether users of a given domain tend to remain in the system for a long time. |
| Distribution of consumed items (CA-2) | For each user, we measure the total number of items consumed throughout his lifetime until the analysis moment. This distribution shows how many items, in general, users consume in the domain. |
| Distribution of items popularity (CA-3) | For each distinct item, we determine the number of different users who have consumed it, at least once, during the period of analysis. Therefore, we evaluate the probability of an item becoming popular in the system. |
| Emergence rate of users and distinct consumed items (CA-4) | For each moment of analysis, we calculate the number of distinct items and users in the system. Thus, it is possible to identify the diversification trend of the environment in terms of users and objects. |
| Repetition rate in items consumption (CA-5) | For each moment of analysis, we divide the total number of consumed items by the number of distinct consumed items at each moment. This information is particularly useful since it measures the recommender's “freedom” to repeatedly offer the same item to the same user. |
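As an illustration only, CA-3 and CA-5 could be computed along the lines of the following minimal Python sketch. It assumes that the consumption log is available as a list of hypothetical (user, item, week) tuples and that CA-5 is aggregated over all users in each week; neither the data layout nor the function names come from the original study.

```python
from collections import defaultdict


def item_popularity(events):
    """CA-3: number of distinct users who consumed each item at least once."""
    users_per_item = defaultdict(set)
    for user, item, _week in events:
        users_per_item[item].add(user)
    return {item: len(users) for item, users in users_per_item.items()}


def repetition_rate(events):
    """CA-5: total consumptions divided by distinct items consumed, per week.

    Assumption: the rate is aggregated over all users in each week.
    """
    total = defaultdict(int)
    distinct = defaultdict(set)
    for _user, item, week in events:
        total[week] += 1
        distinct[week].add(item)
    return {week: total[week] / len(distinct[week]) for week in total}
```

A repetition rate close to 1 would indicate that almost every consumption in that week involved a different item, while larger values indicate more repeated consumption.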

Table 2: Metrics for Interaction Analysis (IA).

| Metric | Description |
|---|---|
| System usage time (IA-1) | It is defined as the number of temporal units of a user's lifetime in which he actually consumed items or accessed the system. |
| Interval between accesses over time (IA-2) | The interval between accesses is given by the average time interval between consecutive accesses of the same user. An analysis of these intervals over time shows whether users are gradually abandoning the system or not. |
| Consumption frequency per lifetime (IA-3) | It represents the average number of items consumed by users in each distinct moment of their lifetimes. This information may be relevant for evaluating whether the system has an increasing trend of consumption. An important issue closely related to this analysis is whether “older” users in the system have a higher consumption profile or not. |
| Distribution of consumption rate per access (IA-4) | It is defined by dividing the number of requests that an item gets by the number of times it was consumed (in scenarios where such distinction is valid). |
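The following Python sketch illustrates one possible reading of IA-1 and IA-2. It assumes a hypothetical log of (user, item, week) tuples and, as a simplification, treats every week with at least one consumption as an access; with real timestamps or access logs the interval could be measured at a finer granularity.

```python
from collections import defaultdict


def active_weeks(events):
    """Map each user to the set of weeks in which he consumed at least one item."""
    weeks = defaultdict(set)
    for user, _item, week in events:
        weeks[user].add(week)
    return weeks


def usage_time(events):
    """IA-1: number of distinct weeks in which each user was active."""
    return {user: len(weeks) for user, weeks in active_weeks(events).items()}


def mean_access_interval(events):
    """IA-2: mean gap, in weeks, between consecutive active weeks of a user.

    Users with a single active week have no interval and are omitted.
    """
    result = {}
    for user, weeks in active_weeks(events).items():
        ordered = sorted(weeks)
        gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
        if gaps:
            result[user] = sum(gaps) / len(gaps)
    return result
```

Comparing `usage_time` with the length of each user's lifetime gives the distance from the scenario in which a user is active in every week.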

3.2 Interaction Analysis

Having identified the characteristics of the recommendation domain, our next step consists of understanding how users use and interact with the system. Aspects such as how often they consume items, the temporal gap between consecutive consumptions, and the regularity of system usage, among others, represent important information about how the systems are used. The usefulness of such information to recommenders can be illustrated by the Interval between accesses over time, as defined in Table 2. Users with a smaller interval between accesses may impose a stronger diversity requirement on their recommendation list, since consuming very similar items in a short period of time may annoy them.

The major metrics deﬁned for this step are de-

scribed in Table 2. As mentioned, the proposed set

of metrics can be expanded to capture other relevant

information to the recommenders. For example, met-

rics based on access log, which determine the user

navigation paths in the system, or access time, among

others, represent potentially relevant information.

3.3 Users Proﬁle Analysis

Finally, our methodology focuses on understanding the behavior of users in the system regarding the consumed objects, defined as the user profile. In fact, this step represents a quantitative analysis of the users' behavior. This analysis is based on two main dimensions: the diversity of items consumed by each user and the temporization of their actions. By temporization we mean the measurement of the time intervals between the actions of a user in the system in order to understand recurrent behavioral patterns over time. Understanding this profile is essential to guide recommendation in an individualized strategy. For example, knowing that users, or even a specific user, have an average consumption diversity of X distinct items per week suggests that the recommender should not provide more than X distinct items per week to each user.
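The value X mentioned in this example could be estimated per user as in the minimal Python sketch below. The (user, item, week) tuple layout is a hypothetical assumption, and the average is taken only over weeks in which the user was active, which is one of several reasonable choices.

```python
from collections import defaultdict


def mean_weekly_diversity(events):
    """Average number of distinct items a user consumes per active week."""
    items_per_user_week = defaultdict(set)
    for user, item, week in events:
        items_per_user_week[(user, week)].add(item)

    weekly_counts = defaultdict(list)
    for (user, _week), items in items_per_user_week.items():
        weekly_counts[user].append(len(items))

    return {user: sum(counts) / len(counts) for user, counts in weekly_counts.items()}
```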


Table 3: Metrics for Diversity Analysis (DA) of Consumption.

| Metric | Description |
|---|---|
| Diversity Distribution (DA-1) | For each user, we determine the number of distinct items consumed throughout his lifetime in the system. Such information describes the consumption profiles of the users regarding item diversification. |
| Average diversity per lifetime (DA-2) | For each user “age” in the system, measured in the temporal unit of analysis, we verify how many distinct items the users consume on average. Thus we can identify the diversification trend of the users' consumption profiles during their lifetime. |
| Items Diversity in an ordered set of size N (DA-3) | For each moment X of analysis, we measure the percentage of overlap between the N most relevant items for a user u at the moment X and the N most relevant items for u at each distinct moment Y after X. |
| Relevance Variability of the Items (DA-4) | It determines the mean value of relevance of the items consumed by each user at each distinct moment. |
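As a hedged illustration of DA-3, the Python sketch below computes the overlap between the top-N item sets of one user at two moments. It assumes hypothetical (user, item, week) tuples, uses consumption frequency within the week as the relevance measure (only one of the options the methodology allows), breaks ties alphabetically, and divides the overlap by N.

```python
from collections import Counter


def top_n_items(events, user, week, n):
    """The n items most consumed by `user` in `week` (ties broken by item name)."""
    counts = Counter(item for u, item, w in events if u == user and w == week)
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _count in ranked[:n]]


def top_n_overlap(events, user, week_x, week_y, n):
    """DA-3: fraction of the top-n items at week_x that are also top-n at week_y."""
    top_x = set(top_n_items(events, user, week_x, n))
    top_y = set(top_n_items(events, user, week_y, n))
    return len(top_x & top_y) / n
```

Averaging `top_n_overlap` over all users and all pairs of weeks with the same temporal distance would give one curve per value of N.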

It is important to point out that the deﬁnition of

some mentioned metrics, both for the Diversity of

Consumption Analysis and for the Actions Tempo-

rization, uses an ordered set of items for each user,

that is based on a relevance measure. We can deﬁne

the relative relevance of the items considering differ-

ent aspects, such as consumption frequency, similar-

ity between items, transition probability in a “naviga-

tion” network between items, among others. Thus,

most relevant items are in top positions in the set,

while less important ones are maintained in the last

positions. Moreover, the proposed analysis can be performed at different temporal granularities (e.g., weeks, months, semesters, among others) and for different sizes of item sets. In this way, it is possible to define a broader evaluation that contrasts the evolutionary behavior of users at different temporal granularities, as well as for distinct sizes of item sets.

Next, we deﬁne the main metrics used for both

dimensions of the Users Proﬁle Analysis:

• Diversity of Consumption: it aims to evaluate how users behave in terms of the diversity of consumed objects. Variations of this diversity over time, as well as its evolutionary trend, are the research focus in this step of the analysis. The proposed metrics for this analysis are presented in Table 3.

• Actions Temporization: it aims to understand recurrent behavioral patterns over time, such as the time period in which users consume the items and the time required for an item to stop being consumed or to be re-consumed. The proposed metrics for this analysis are presented in Table 4.

In the following section we apply the metrics

described in this section on data derived from the

Last.Fm, bringing up the opportunity to further dis-

cuss the main concepts related to each of them. More-

over, we present the conclusions that can be obtained

by applying our methodology, as well as several kinds

of useful information that can be exploited by RSs.

Finally, we discuss how such information can help to

identify the most promising RSs techniques for do-

mains with distinct characteristics.

4 CASE STUDY

4.1 Database Description

In order to set the context, we present the dataset used in our analysis. We use a dataset from the Last.Fm system¹, which is a UK-based Internet radio and music community website, founded in 2002. At the moment of the data acquisition, it claimed over 30 million active users. It is also estimated that Last.Fm had more than 27 million different tracks and 12 million distinct artists in its database². As Last.Fm represents one of the biggest music communities in the world, and since all the data is readily available on the Web, it is a good data source for music recommender systems.

Our analysis was performed on a sample of data from Last.Fm. These data were collected through an API provided by Last.Fm³. This API allows us to obtain information related to several data entities such as artists, albums, tracks and users, among others. We consider as relevant to our analysis only information related to users and tracks. Such information was collected for a set of 146,973 distinct users and 1,515,258 distinct tracks in the time interval from 02/12/2005 to 04/26/2009.

¹ Available at http://www.last.Fm/.

² This information was retrieved from the Last.Fm Radio Announcement, on 03/25/2009, available at http://blog.last.fm/2009/03/24/lastfm-radio-announcement.

³ http://www.last.fm/api.


Table 4: Metrics for Temporization Analysis (TA) of Actions.

| Metric | Description |
|---|---|
| Distribution of Stability Period (TA-1) | It determines the continuous period of time that items remain among the most relevant to each user. A distribution of these values shows whether, in general, items remain relevant for a long period in a domain. |
| Probability of Re-execution (TA-2) | It determines the probability that an item among the most relevant at the moment X is consumed at least once at each moment Y after X. This measure is related to the possibility of an item being relevant again in the future, given that it is relevant in the present. |
| Saturation Time (TA-3) | It determines the average time required for an item to no longer be consumed, from the moment it was first consumed by each user. |
| Probability of Return (TA-4) | It determines the probability that an item that has been relevant for a user in the past, but is no longer consumed by him, comes to be consumed again. |
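One simple way to approximate TA-3 is sketched below in Python. It assumes hypothetical (user, item, week) tuples and measures, for each (user, item) pair, the span between the first and the last observed consumption; this is an approximation, since items still being consumed at the end of the observation window are treated as if they had already saturated.

```python
def saturation_time(events):
    """TA-3 (approximation): mean span, in weeks, between the first and last
    consumption of an item by a user, averaged over all (user, item) pairs.

    Returns None when there are no events.
    """
    first_seen = {}
    last_seen = {}
    for user, item, week in events:
        key = (user, item)
        first_seen[key] = min(week, first_seen.get(key, week))
        last_seen[key] = max(week, last_seen.get(key, week))

    if not first_seen:
        return None
    spans = [last_seen[key] - first_seen[key] for key in first_seen]
    return sum(spans) / len(spans)
```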

4.2 System Context Analysis

[Figure 1 panels: (a) CA-1, Probability of Occurrence vs. Lifetime (Weeks); (b) CA-2; (d) CA-4, Total Number of distinct songs and distinct users vs. Week; (e) CA-5, Repetition Rate vs. Week.]

Figure 1: Context Analysis Metrics.

The plots in Figure 1 present the metrics described for the Context Analysis in the previous section. We start our analysis with the distribution of user lifetime in the system, shown in Figure 1 (a). We can see that most collected users have between 10 and 50 weeks of lifetime in the system. Very short periods (less than 5 weeks) or very long ones (over 150 weeks) have a very low probability of occurrence. This shows that although users explore the system for some period, Last.Fm is not able to keep them active for a long time.

In Figure 1 (b) we present the distribution of the total number of songs listened to by the users. We observe that the probability of a user listening to few songs in the system is very high. Furthermore, it is important to note that this distribution follows a power law, emphasizing the huge difference in probability between listening to few songs and many songs. Thus, in addition to remaining in the system for a short period of time, users tend to consume few items, which stresses the Cold Start problem for recommendation in Last.Fm (Schein et al., 2002).

Our next analysis concerns the distribution of song popularity in the system, as shown in Figure 1 (c). Most of the tracks in our dataset have been listened to by fewer than 100 distinct users, and only a very restricted number of them have been listened to by many users. Thus, Last.Fm represents a scenario in which very few items manage to become popular. Consequently, simple recommendation strategies based on popularity would not be appropriate for most of the items.

The plot in Figure 1 (d) presents the number of distinct users and items that arise in the system at each moment. Since we do not have the whole set of users and tracks that appear in Last.Fm globally, this plot in fact exhibits the number of new users that appear in our dataset each week and the number of distinct songs that were listened to by at least one user from this set each week. As we can see, there is a significant increase in the number of unique users present in the system each week, and also an increase in the number of distinct songs. This growth in the number of users and tracks is itself slightly rising, generating an increasing sparsity of information for RSs, which, to be effective, must be able to deal with this problem.
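The weekly counts discussed above could be obtained as in the following Python sketch, which reflects one reading of CA-4: a user is "new" in the week of his first observed consumption, and a song counts as distinct within each week. The (user, item, week) tuple layout and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def emergence_per_week(events):
    """CA-4 (one reading): for each week, the number of users first seen in
    that week and the number of distinct items consumed in that week."""
    first_week = {}
    items_per_week = defaultdict(set)
    for user, item, week in events:
        first_week[user] = min(week, first_week.get(user, week))
        items_per_week[week].add(item)

    new_users = Counter(first_week.values())
    return {
        week: (new_users.get(week, 0), len(items_per_week[week]))
        for week in sorted(items_per_week)
    }
```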

Finally, we analyze the repetition rate in consumption, as presented in Figure 1 (e). We note that Last.Fm is not only an environment with a high repetition rate, but also one with a growing trend in that rate. That is, although the number of distinct tracks increases over time, users tend to listen to the same tracks more and more often. This observation has two important implications for recommenders. The first is that items already consumed by users could be recommended again several times. The second is that recommenders should be robust to the overspecialization problem, preventing recommendations from always being anchored in items already known to the users.

In summary, the context analysis suggests that

Last.Fm represents a challenging scenario to recom-

menders, since, despite the huge diversity of users

and items, the usage history for most users is small,

concentrated in few items, and is characterized by a

high repetition rate in consumption. Further, simple popularity-based recommendation strategies do not seem to be effective in this scenario.

4.3 Interaction Analysis

[Figure 2 panels: (a) IA-1, Usage Time (Weeks) vs. Lifetime (Weeks), with an “Ideal Scenario” reference curve; (b) IA-2; a further panel plots the Normalized Number of listened tracks vs. Lifetime (Weeks).]

Figure 2: Interaction Analysis Metrics.

We start the interaction analysis of Last.Fm with Figure 2 (a), which shows the Usage Time per Lifetime for all users. Both metrics are calculated in numbers of distinct weeks. The curve called “Ideal Scenario” represents the scenario in which the usage time is equal to the lifetime, that is, the users listen to songs in the system every week during their lifetime. It is interesting to note that users tend to use the system less frequently over time. In Last.Fm, as users “grow old” in the system, the usage time tends to move away from the lifetime. In this case, more decisive recommendation strategies are necessary in order to keep users in the system.

Our next analysis concerns the average interval between system accesses. Figure 2 (b) presents the Complementary Cumulative Distribution Function (CCDF) of these intervals. As we do not have the access log for Last.Fm, we actually measure the interval between consecutive listenings for each user. We can observe that only 0.1% of the users have an interval greater than 10 weeks, that is, the vast majority of users have a small average interval between listenings. Moreover, as the time interval increases, the percentage of users decreases, following two distinct linear regimes: at first, until about the tenth week, the decay is sharp; from this point on, the decay becomes smoother. This result, combined with that presented in Figure 1 (a), shows that in Last.Fm new users use the system frequently, but tend to abandon it quickly instead of gradually reducing their system usage. This information is important for recommenders, which can prioritize users with an interval between listenings slightly longer than 1 week, since they will potentially abandon the system.
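For reference, an empirical CCDF such as the one used in this analysis can be computed as in the minimal Python sketch below, where `values` is a hypothetical list holding one average interval per user. Each returned pair (x, p) states that a fraction p of the users has an interval strictly greater than x.

```python
def ccdf(values):
    """Empirical complementary CDF: list of (x, P(X > x)) for each distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for index, x in enumerate(ordered):
        # Emit a point only at the last occurrence of each distinct value.
        if index + 1 < n and ordered[index + 1] == x:
            continue
        points.append((x, (n - (index + 1)) / n))
    return points
```

Plotting these points on logarithmic axes is a common way to inspect regimes such as the two described above.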

Figure 2 (c) shows the average number of songs

listened to by users over time. This average was cal-

culated using the values normalized by the maximum

number of songs listened to by each user in a single

week. As we can note, there is a noticeable decay in

the number of songs listened to over time. In general,

users tend to use the system less frequently. Again,

RSs may be relevant tools to reduce this decay in the

usage frequency.
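The normalization described above can be sketched in Python as follows. The sketch assumes hypothetical (user, item, week) tuples and, since registration dates are not part of this layout, approximates each user's lifetime by the span between his first and last observed weeks, counting inactive weeks in between as zero.

```python
from collections import defaultdict


def normalized_consumption_by_age(events):
    """Average, per user 'age' in weeks, of the weekly number of consumptions
    divided by that user's maximum weekly number of consumptions."""
    weekly_counts = defaultdict(lambda: defaultdict(int))
    for user, _item, week in events:
        weekly_counts[user][week] += 1

    totals = defaultdict(float)
    users_at_age = defaultdict(int)
    for user, counts in weekly_counts.items():
        start, end = min(counts), max(counts)
        peak = max(counts.values())
        for week in range(start, end + 1):
            age = week - start
            totals[age] += counts.get(week, 0) / peak
            users_at_age[age] += 1

    return {age: totals[age] / users_at_age[age] for age in totals}
```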

From the analysis in this step we can better discern the interaction between the users and Last.Fm, characterizing the latter as a system composed primarily of activities performed by new users, who present a downward consumption trend. Such behavior may be related to some type of system limitation, such as poor-quality recommendations or the lack of some features in the system. Some of the metrics defined in Section 3 could not be evaluated because we do not have information about accesses in our data.

4.4 Users Profile Analysis

As described in Section 3, we divide our user profile analysis into two steps: Diversity Analysis and Temporization Analysis. The following subsections describe the main results found at each step. The understanding of this profile, along with the previously obtained understanding of the environment and the user interactions, comprises a set of extremely rich and useful information for RSs.

4.4.1 Consumption Diversity

In Figure 3 (a) we present the diversity distribution of items consumed by the users. We observe that the vast majority of users (more than 90%) listen to up to 200 distinct songs, while few users have a very high diversity of songs (i.e., above 5,000). This shows that users in Last.Fm have a consumption behavior focused on a small number of distinct items. Thus, it is possible to draw distinct consumer profiles based on how diverse their consumption histories are, and RSs can provide recommendations with diversity levels suited to each consumer profile.

Figure 3: Diversity Analysis Metrics. (a) DA-1. (b) DA-2: Mean Diversity Per User vs. Lifetime (Weeks). (c) DA-3: Intersection Percentage vs. Temporal Distance (Weeks). (d) DA-4: Normalized Mean Relevance vs. Temporal Distance (Weeks).

In addition, Figure 3 (b) presents the mean diversity of consumed items per user over time. We observed that users listen to, on average, 150 distinct songs in their first weeks in the system. However, as the users “grow older” in the system, the diversity of consumed items reduces significantly. This information may be relevant to RSs, since it tells them, at each moment, an upper bound on the diversity of items to be recommended.
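The two diversity measurements above could be computed along the following lines. This is a sketch under stated assumptions: the log is a list of (user, song, week) triples, user lifetime is counted from each user's first active week, and the mean in the second function is taken only over users active in that week. The function names are ours.

```python
from collections import defaultdict


def diversity_per_user(events):
    """Number of distinct songs each user listened to (DA-1 style)."""
    songs = defaultdict(set)
    for user, song, _week in events:
        songs[user].add(song)
    return {user: len(s) for user, s in songs.items()}


def mean_diversity_by_age(events):
    """Mean distinct songs per active user, per lifetime week (DA-2 style)."""
    first_week = {}
    for user, _song, week in events:
        first_week[user] = min(week, first_week.get(user, week))

    weekly = defaultdict(set)  # (user, lifetime week) -> distinct songs
    for user, song, week in events:
        weekly[(user, week - first_week[user])].add(song)

    by_age = defaultdict(list)
    for (_user, age), songs in weekly.items():
        by_age[age].append(len(songs))
    return {age: sum(v) / len(v) for age, v in sorted(by_age.items())}


events = [
    ("u1", "a", 0), ("u1", "b", 0), ("u1", "a", 1),
    ("u2", "a", 4), ("u2", "c", 4), ("u2", "d", 4), ("u2", "e", 4),
    ("u2", "c", 5),
]
totals = diversity_per_user(events)
by_age = mean_diversity_by_age(events)
```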

We also analyzed the intersection between the most relevant items consumed by users at distinct moments over time, as shown in Figure 3 (c). In this case, we defined as relevance metric the frequency with which each distinct track was listened to by each user in each single week. Then, we selected for analysis only the 100 most relevant tracks for each user in each week. Finally, we calculated the intersection percentage between each week X and every later week Y, and defined a mean intersection percentage per temporal distance between X and Y. We observe that the percentage of intersection is very low even between adjacent weeks (i.e., less than 15%). Moreover, this intersection quickly decreases over time, obeying a power law and reaching a value close to zero in 54 weeks. That is, within one year, the users' habits change almost completely. In this way, we can guide the recommenders about the percentage of music currently consumed by each user that should be changed at each different time interval (e.g., every week, month or semester).
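A possible implementation of this intersection measurement is sketched below. Two points are assumptions of the sketch and not statements about our original code: the overlap is normalized by the size of week X's top-N set, and only weeks in which the user was active are compared. Ties in play counts are broken by first appearance, which is how `Counter.most_common` orders equal counts.

```python
from collections import Counter, defaultdict


def top_n_by_week(events, n):
    """Return {user: {week: set of the user's n most played songs}}."""
    plays = defaultdict(lambda: defaultdict(Counter))
    for user, song, week in events:
        plays[user][week][song] += 1
    return {
        user: {w: {s for s, _ in c.most_common(n)} for w, c in weeks.items()}
        for user, weeks in plays.items()
    }


def mean_intersection_by_distance(events, n=100):
    """Mean share of week X's top-n found in week Y's top-n, by Y - X."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for weeks in top_n_by_week(events, n).values():
        ordered = sorted(weeks)
        for i, x in enumerate(ordered):
            for y in ordered[i + 1:]:
                sums[y - x] += len(weeks[x] & weeks[y]) / len(weeks[x])
                counts[y - x] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}


events = (
    [("u1", "a", 0)] * 3 + [("u1", "b", 0)] * 2 + [("u1", "c", 0)]
    + [("u1", "a", 1)] * 2 + [("u1", "d", 1)]
    + [("u1", "e", 2), ("u1", "f", 2)]
)
overlap = mean_intersection_by_distance(events, n=2)
```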

Finally, we show in Figure 3 (d) the analysis of relevance variability of the consumed items over time. Again, we defined the relevance metric as the frequency with which each song was listened to in each distinct week. The frequency of each user/song pair was normalized by the highest frequency with which the user listened to the song in a single week. Starting from the first time a user listened to each song, we calculated the average relevance of all songs at each subsequent moment. Finally, we defined an average value of relevance for all user/song pairs observed in each of these subsequent moments. We observe that the average song relevance tends to be high at moments close to when they were listened to for the first time. In the later moments, this relevance decreases sharply, stabilizing at very low values. This shows that a song remains interesting to the users only for a short period of time, close to the first moment it was listened to. Since Last.Fm is an environment with a high consumption repetition, this relevance decay suggests that repeated recommendations are more prone to succeed at moments near the first time that each song was listened to.
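The relevance normalization and averaging just described might look as follows in code. This is a sketch with assumptions that we state explicitly: weeks in which a pair has no plays count as relevance zero, every pair is followed for the same `horizon` of weeks (so the end of the observation window is ignored), and the input is a list of (user, song, week) triples.

```python
from collections import Counter, defaultdict


def relevance_decay(events, horizon):
    """Mean normalized relevance by weeks since a pair's first listening.

    For each (user, song) pair, weekly play counts are divided by the
    pair's own maximum weekly count. Position d of the result is the
    mean over all pairs, d weeks after the pair's first listening.
    """
    weekly = defaultdict(Counter)  # (user, song) -> {week: plays}
    for user, song, week in events:
        weekly[(user, song)][week] += 1
    if not weekly:
        return []

    sums = [0.0] * (horizon + 1)
    for plays in weekly.values():
        first = min(plays)
        peak = max(plays.values())
        for d in range(horizon + 1):
            sums[d] += plays[first + d] / peak  # 0 when not played
    return [s / len(weekly) for s in sums]


events = (
    [("u1", "a", 0)] * 2 + [("u1", "a", 1)] * 4   # pair (u1, a) peaks in week 1
    + [("u1", "b", 0)]                            # pair (u1, b) played once
)
decay = relevance_decay(events, horizon=2)
```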

4.4.2 Actions Temporization

Figure 4: Actions Temporization Metrics. (a) TA-1: CCDF of Users vs. Temporal Distance (Weeks). (b) TA-2: Probability of Re-execution vs. Temporal Distance (Weeks). (c) TA-3: CCDF of Users vs. Temporal Distance (Weeks). (d) TA-4: Probability of Return vs. Temporal Distance (Weeks).

First, we evaluate the distribution of stability periods in Last.Fm, shown in Figure 4 (a). For this analysis we considered only the five most relevant songs (i.e., favorite songs) of each user at each distinct moment over time. Further, we defined temporal distances for each song relative to the first time that it became a favorite for a user. For instance, temporal distance zero refers to the first moment in which a song figures among the 5 favorite songs of a user. We can observe that, on average, almost all songs figure among the 5 most relevant ones for a user for about two consecutive weeks. However, in the third week (i.e., temporal distance 2) only 2.5% of the songs remain among the favorites. This clearly shows that the users' preferences remain stable for a short period, generating a certain dynamics in the set of favorite songs that should be captured by RSs.
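One way to compute such a stability curve is sketched below. We assume that a song's stability period is the length of its first uninterrupted run in the user's weekly top-k, starting at the week it first enters the top-k, and that the curve reports the fraction of favorites whose run is longer than each temporal distance. The input format and names are hypothetical.

```python
from collections import Counter, defaultdict


def stability_survival(events, k=5, max_distance=3):
    """Fraction of favorites still in the top-k at each temporal distance.

    `events` is a list of (user, song, week) triples. A favorite is a
    (user, song) pair that entered the user's weekly top-k at least once;
    it survives to distance d if it stayed in the top-k for every week
    from its first entry up to d weeks later.
    """
    plays = defaultdict(lambda: defaultdict(Counter))
    for user, song, week in events:
        plays[user][week][song] += 1

    runs = []  # first consecutive run length per favorite
    for weeks in plays.values():
        top = {w: {s for s, _ in c.most_common(k)} for w, c in weeks.items()}
        seen = set()
        for w in sorted(top):
            for song in top[w] - seen:
                seen.add(song)
                length = 1
                while song in top.get(w + length, ()):
                    length += 1
                runs.append(length)
    if not runs:
        return []
    return [sum(r > d for r in runs) / len(runs) for d in range(max_distance + 1)]


events = (
    [("u1", "a", 0)] * 2 + [("u1", "b", 0), ("u1", "a", 1)]
    + [("u1", "b", 2), ("u1", "b", 3), ("u1", "b", 4)]
)
survival = stability_survival(events, k=1, max_distance=3)
```

In this toy log, song "a" stays the single favorite for two weeks and song "b" for three, so half of the favorites survive to temporal distance 2.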

Our next evaluation concerns the probability of re-execution of the favorite songs. We present in Figure 4 (b) the average probability of a favorite song being listened to at each temporal distance, starting from the week in which it became a favorite. We observed that only a third of the songs that have become favorites are listened to again in the next week. Moreover, only 23% of them are listened to again after two weeks. This decay follows a power law until the hundredth week. These observations suggest that recommending favorite songs after a few weeks would not be a good strategy, since the users might be “saturated” quickly. Another interesting point is that, from the hundredth week on, the probability of a user listening to their favorite songs is almost zero, which shows that older songs may become “forgotten” by the users.
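The re-execution probability could be estimated as in the following sketch. The assumptions here are ours: a song counts as re-executed at distance d if the user played it at all in that week (not only if it stayed in the top-k), and weeks beyond the end of the log count as misses.

```python
from collections import Counter, defaultdict


def reexecution_probability(events, k=5, max_distance=3):
    """P(favorite is played d weeks after it first entered the top-k).

    `events` is a list of (user, song, week) triples. Position d of the
    result is the share of favorites played exactly d weeks after the
    week in which they first became one of the user's k favorites.
    """
    plays = defaultdict(lambda: defaultdict(Counter))
    for user, song, week in events:
        plays[user][week][song] += 1

    hits = [0] * (max_distance + 1)
    favorites = 0
    for weeks in plays.values():
        seen = set()
        for w in sorted(weeks):
            for song, _count in weeks[w].most_common(k):
                if song in seen:
                    continue
                seen.add(song)
                favorites += 1
                for d in range(max_distance + 1):
                    if w + d in weeks and weeks[w + d][song] > 0:
                        hits[d] += 1
    if favorites == 0:
        return []
    return [h / favorites for h in hits]


events = (
    [("u1", "a", 0)] * 2 + [("u1", "b", 0), ("u1", "a", 1)]
    + [("u1", "b", 2)] * 2 + [("u1", "a", 2)]
)
probs = reexecution_probability(events, k=1, max_distance=2)
```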

In Figure 4 (c) we show the results for the saturation time analysis. For this analysis we measured how long, on average, a user continually listens to songs that have been, at some point, among the 5 favorites. We can observe that the probability of a song being listened to for two (98%) or three (62%) consecutive weeks after becoming one of the five favorites is much higher than for more than four weeks (only 6%). This shows that, in general, users are interested in listening to a song for a short period of time, subsequently showing a significantly lower interest in it.

We also evaluate the average probability of return for the favorite songs of each user. For this analysis, we evaluate the probability that the weekly most relevant song X of each user figures among the user's 100 favorite songs in each distinct week after the first week in which X is no longer listened to, which is defined as temporal distance zero. Figure 4 (d) presents the distribution of the average probability of return for all favorite songs at each moment. We note that just over 15% of these songs are listened to a week after the first week in which they are not observed among the 100 most relevant songs. The probabilities present a decreasing behavior that also follows a power law for temporal distances up to approximately 150 weeks. This shows that, in fact, over time the songs become “forgotten”. Therefore, we have two important implications for RSs. The first one is that once a user stops listening to a favorite song, the user is unlikely to return to it in the near future. The second implication is that in a distant future such songs could be a good way to diversify the recommendations to users.
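A sketch of the return-probability measurement follows. It rests on assumptions that should be read as ours: “no longer listened to” is interpreted as the first week after becoming the weekly top song in which the song is absent from the user's top-n, songs that never leave the top-n within the log are not tracked, and weeks beyond the end of the log count as no return.

```python
from collections import Counter, defaultdict


def return_probability(events, n=100, max_distance=3):
    """P(a former top song is back in the top-n, d weeks after leaving it).

    `events` is a list of (user, song, week) triples. Position d - 1 of
    the result corresponds to temporal distance d >= 1, where distance
    zero is the first week the song is missing from the user's top-n.
    """
    plays = defaultdict(lambda: defaultdict(Counter))
    for user, song, week in events:
        plays[user][week][song] += 1

    returned = [0] * max_distance
    tracked = 0
    for weeks in plays.values():
        last_week = max(weeks)
        top = {w: {s for s, _ in c.most_common(n)} for w, c in weeks.items()}
        seen = set()
        for w in sorted(weeks):
            song = weeks[w].most_common(1)[0][0]  # weekly most played song
            if song in seen:
                continue
            seen.add(song)
            gone = w + 1
            while song in top.get(gone, ()):
                gone += 1
            if gone > last_week:
                continue  # never observed leaving within the log
            tracked += 1
            for d in range(1, max_distance + 1):
                if song in top.get(gone + d, ()):
                    returned[d - 1] += 1
    if tracked == 0:
        return []
    return [r / tracked for r in returned]


events = (
    [("u1", "a", 0)] * 3 + [("u1", "b", 0)]
    + [("u1", "b", 1)] * 2 + [("u1", "c", 1)]
    + [("u1", "c", 2)] * 2 + [("u1", "a", 2)]
)
back = return_probability(events, n=2, max_distance=2)
```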

The analysis of this step allows us to characterize Last.Fm users as having a consumption that is not very diversified and a very dynamic behavior, in which the set of favorite songs varies quickly. In addition, each song is most often consumed at moments close to the first time users listened to it. We also note that such users listen to the same song during a short and continuous period of time, and once they stop listening to it, they will hardly listen to it again.

Finally, it is important to note that several of the metrics analyzed in this section may be reevaluated by varying both the temporal granularity and the size of the ordered sets. Thus, it is possible to contrast the evolutionary behavior under different perspectives of analysis. In the plots of Figure 5, for example, we present an analysis of the distribution of stability periods for various temporal granularities, aggregated by different sizes of ordered sets. An interesting aspect to note is that as we increase the temporal granularity (i.e., use longer periods), the decay of probabilities is more pronounced, showing that in longer periods of time there is a larger diversity of items consumed by users. We also note that, as we increase the size of the analyzed set, the differences in probabilities between longer and shorter temporal granularities tend to be higher. This shows that larger sets of favorite items are more stable at shorter temporal granularities. Thus, it would be possible to define distinct recommendation strategies by combining the user behavior at distinct temporal granularities, as well as the sizes of ordered sets, in order to provide more precise recommendations.
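Varying the granularity and the set size amounts to re-bucketing the log before extracting the ordered sets. A minimal sketch, assuming a (user, song, week) log and treating a biweek as two weeks (helper names are hypothetical):

```python
from collections import Counter, defaultdict


def coarsen(events, weeks_per_bucket):
    """Re-index (user, song, week) events into buckets of several weeks."""
    return [(u, s, w // weeks_per_bucket) for u, s, w in events]


def top_n_sets(events, n):
    """Return {(user, bucket): set of the n most played songs}."""
    counts = defaultdict(Counter)
    for user, song, bucket in events:
        counts[(user, bucket)][song] += 1
    return {key: {s for s, _ in c.most_common(n)} for key, c in counts.items()}


events = [
    ("u1", "a", 0), ("u1", "a", 1), ("u1", "a", 1), ("u1", "b", 1),
    ("u1", "b", 2), ("u1", "b", 3),
]
weekly = top_n_sets(coarsen(events, 1), n=1)
biweekly = top_n_sets(coarsen(events, 2), n=1)
```

Any of the metrics in this section can then be recomputed on the coarsened log with a different n, which is how curves for different granularities and Top-N sizes can be contrasted.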

Figure 5: Analysis of TA-1 Metric for distinct Top-N. Three panels, (a) Top-5, (b) Top-25, and (c), each plotting the CCDF of Users against Temporal Distance (Weeks) for Week, Biweek, and Month granularities.

In general, the information from each step of our methodology can assist in choosing the most appropriate recommendation techniques for each scenario. For Last.Fm, for example, we observed that the Cold Start issue is more prominent than the problem of sparsity. Therefore, techniques such as those proposed in (Schein et al., 2002) are preferable. Moreover, we note that a model based only on the top-N songs listened to by a user may be more promising than one that considers all the songs the user listened to in a given period, since song diversity is low and there is a high repetition rate in the system. Considering the challenge of temporal evolution, we found that smaller temporal granularities are better both to model user behavior and to update the “knowledge” of the RSs, since users in Last.Fm are highly dynamic, changing the set of favorite songs quickly. Finally, it is worth noting that, in this domain, the diversification of recommendations is a bigger issue than accuracy, since users often consume items that they already know. Thus, techniques to diversify recommendations like those presented in (Zhang and Hurley, 2008) are particularly relevant to Last.Fm.

A point to be highlighted is the need to validate these observations and conclusions. For this purpose, we adopted as validation strategy the implementation of a new RS method, which incorporates many of the observations raised, and the subsequent contrast of results between the proposed technique and traditional ones. This strategy is currently in development and its application to Last.Fm represents our next step.

5 CONCLUSIONS AND FUTURE WORK

In this work, we present a methodology for evolutionary characterization of users and applications in order to support the proposal of new recommendation techniques, as well as proper changes in traditional techniques. Despite the relevance of temporal evolution to recommendation, we did not find studies that aim to understand and characterize such evolution. In turn, our methodology is useful for assessing practical issues about RSs that are usually disregarded in the literature.

In order to verify the applicability of our methodology, we evaluated as a case study the music system Last.Fm. Measurements of a non-exhaustive set of characteristics pertinent to recommendation domains allowed us to better understand Last.Fm. This is an environment with a wide diversity of items, as well as a growing number of users and new items over time. However, Last.Fm fails to keep its users for a long time, becoming anchored by new users, who present a low and declining system usage rate. These findings allow us to identify a series of challenges for recommender systems, such as increasing sparsity of the domain, little information about the users, and the Cold Start problem.

The relevance of this type of observation goes beyond the understanding of the environment. It allows us to better identify which recommendation techniques would be more appropriate for each type of environment, and to better adjust RSs in order to provide better recommendations. For instance, for Last.Fm we verified that specific techniques that address the problems of Cold Start and low diversity in recommendations are more relevant.

As future work, we aim first to validate our conclusions. To this end, we are implementing a new RS method, which incorporates several of the observations raised, that will be applied to Last.Fm. Results of this new technique will be contrasted with traditional ones, in order to verify the relevance of this new information. Later, we aim to analyze other metrics, as well as to apply our methodology in other recommendation scenarios.

ACKNOWLEDGEMENTS

This work was partially supported by CNPq, CAPES,

FINEP, Fapemig, and INWEB.

REFERENCES

Abbasse, Z. and Mirrokni, V. (2007). A recommender system based on local random walks and spectral methods. In Proceedings of the 9th WebKDD and 1st SNA-KDD, pages 102–108. Springer.

Adomavicius, G., Sankaranarayanan, R., Sen, S., and Tuzhilin, A. (2005). Incorporating contextual information in recommender systems using a multidimensional approach. ACM Transactions on Information Systems (TOIS), 23(1):103–145.

Adomavicius, G. and Tuzhilin, A. (2001a). Expert-driven validation of rule-based user models in personalization applications. Data Mining and Knowledge Discovery, 5(1):33–58.

Adomavicius, G. and Tuzhilin, A. (2001b). Extending recommender systems: A multidimensional approach. In Proceedings of the Workshop on Intelligent Techniques for Web Personalization (ITWP2001), pages 4–

Adomavicius, G. and Tuzhilin, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749.

Anderson, C. (2006). The long tail: How endless choice is creating unlimited demand. Random House Business Books.

Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331–470.

Cremonesi, P. and Turrin, R. (2010). Controlling Consistency in Top-N Recommender Systems. In 2010 IEEE International Conference on Data Mining Workshops, pages 919–926. IEEE.

Koren, Y. (2009). Collaborative filtering with temporal dynamics. In Proceedings of the 15th ACM SIGKDD, pages 447–456. ACM New York, NY, USA.

Lathia, N., Hailes, S., and Capra, L. (2009). Evaluating collaborative filtering over time. In ACM SIGIR Workshop on the Future of IR Evaluation, Boston, USA.

Lathia, N., Hailes, S., Capra, L., and Amatriain, X. (2010). Temporal diversity in recommender systems. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 210–217, New York, NY, USA. ACM.

McSherry, D. (2002). Diversity-conscious retrieval. Advances in Case-Based Reasoning, pages 27–53.

Schein, A. I., Popescul, A., Ungar, L. H., and Pennock, D. M. (2002). Methods and metrics for cold-start recommendations. In Proceedings of the 25th SIGIR, pages 253–260. ACM Press.

Stern, D. H., Herbrich, R., and Graepel, T. (2009). Matchbox: large scale online bayesian recommendations. In Proceedings of the 18th WWW, pages 111–120, New York, NY, USA. ACM.

Wu, J. and Li, T. (2008). A modified fuzzy c-means algorithm for collaborative filtering. In Proceedings of the 2nd KDD Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition, pages 1–4. ACM.

Zhang, M. and Hurley, N. (2008). Avoiding monotony: improving the diversity of recommendation lists. In Proceedings of the ACM RS, pages 123–130. ACM.

WEBIST 2011 - 7th International Conference on Web Information Systems and Technologies
