Social Utilities and Personality Traits for Group Recommendation: A
Pilot User Study
Silvia Rossi and Francesco Cervone
Dipartimento di Ingegneria Elettrica e Tecnologie dell’Informazione,
Università degli Studi di Napoli “Federico II”, Napoli, Italy
Keywords:
Social Choice, Group Recommendation, Big-Five, Group Decision Making.
Abstract:
Recommendations to a group of users can be provided by the aggregation of individual users’ recommenda-
tions using social choice functions. Standard aggregation techniques do not consider the possibility of evalu-
ating social interactions, roles, and influences among group’s members, as well as their personalities, which
are, indeed, crucial factors in the group’s decision-making process. Instead of defining a specific social choice
function to take into account such features, the proposed solution relies on the definition of a utility function,
for each agent, that takes into account other group members’ preferences. Such function models the level of
a user’s altruistic behavior starting from his/her agreeableness personality trait. Once such utility values are
evaluated, the goal is to recommend items that maximize the social welfare. Performance is evaluated with a
pilot user study and compared with respect to Least Misery. Results showed that while for small groups LM
performs slightly better, in the other cases the two methods are comparable.
1 INTRODUCTION
Recommender Systems are powerful tools that provide suggestions about items for users. The list of recommended items is the result of a decision process that aims to understand, for example, which product a specific user wants to buy, which music to listen to, which movie to see or which news to read. In this context, the goal of Group Recommender Systems (GRSs), unlike Individual Recommender Systems (or simply RSs), is to recommend items to a whole group of users, taking into account all the individual preferences. The study of GRSs is still an open research problem. Preferences and tastes of individuals are collected in the same way as in RSs, but the real problem lies in correctly aggregating these preferences in order to recommend items that maximize global satisfaction or, at least, minimize the dissatisfaction of all the members.
The problem of aggregating individual preferences has been widely studied in Mathematics, Economics and Multi-agent systems (MAS), with the definition of Social Choice functions. In the literature, several solutions have been proposed (Masthoff, 2011). The most common technique is Average Satisfaction (AS): it treats all group members as equal by averaging the preferences (or recommendations) of all the members in order to produce a final list for the entire group. Least Misery (LM), instead, cares about the possible dissatisfaction of some members and chooses items that minimize it. However, none of the standard techniques considers that other factors can also influence the group decision. Groups can be dynamic, and members' behaviors can change depending on the situation. Real group decision making is a complex mechanism that involves relationships among the members, users' personality, as well as their experience about the domain of interest.
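The two standard aggregation strategies mentioned above can be illustrated with a minimal sketch. The data layout (a dictionary of per-user ratings), the function names and the alphabetical tie-breaking rule are assumptions made only for this example.

```python
from statistics import mean


def average_satisfaction(ratings):
    """Average Satisfaction (AS): group score = mean of the members' ratings.

    `ratings` maps each user to a dict {item: rating}; only items rated by
    every member are scored (an assumption of this sketch).
    """
    items = set.intersection(*(set(r) for r in ratings.values()))
    return {j: mean(r[j] for r in ratings.values()) for j in items}


def least_misery(ratings):
    """Least Misery (LM): group score = rating of the least satisfied member."""
    items = set.intersection(*(set(r) for r in ratings.values()))
    return {j: min(r[j] for r in ratings.values()) for j in items}


def top_k(scores, k):
    """Return the k best items; ties are broken alphabetically (assumption)."""
    return sorted(scores, key=lambda j: (-scores[j], j))[:k]


example = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 1, "B": 3, "C": 4},
}
print(top_k(average_satisfaction(example), 2))
print(top_k(least_misery(example), 2))
```

In this toy group, item A is loved by one member and disliked by the other: AS still ranks it second, whereas LM prefers item B, which nobody dislikes.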
In this work, we study the influence of individual users' personality on the group decision-making process. In a realistic scenario, indeed, the personalities of the group members can have an impact on group decisions. For example, some people rarely change their minds because they believe that their own decision is the best for everyone, or simply because they do not want to reduce their utility in favor of others. Other people, instead, care about the satisfaction of all the other members, even at the cost of their own. Thus, the latter are willing to lose some utility in order to reach a valid and suitable agreement for the entire group. In order to consider these elements, it is necessary to study users' personalities through models proposed in the human sciences. One of the most common is certainly the Five-Factor Model (FFM).
Rossi, S. and Cervone, F.
Social Utilities and Personality Traits for Group Recommendation: A Pilot User Study.
DOI: 10.5220/0005709600380046
In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016) - Volume 1, pages 38-46
ISBN: 978-989-758-172-4
Copyright © 2016 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved
According to this model, the behavioral features of a person can be summarized and described through five factors, also called the Big-Five: openness, conscientiousness, extraversion, neuroticism and agreeableness (Costa and McCrae, 1992). In particular, in this work, we focus on the role of the agreeableness factor in the definition of a utility function that models altruistic behavior.
In the literature, some approaches are starting to model GRSs that weight users' preferences in different ways according to specific user-related parameters. For example, in Gartrell et al. (2010) users' preferences are weighted according to their expertise, while in Rossi et al. (2015) users' dominance and influence are taken into consideration. In the proposed approach, we do not consider weights in the aggregation process; instead, the utility function used to evaluate users' ratings on individual items takes into account the preferences of the whole group and the user's agreeableness trait. In this sense, our work is related to the approach of Salehi-Abari and Boutilier (2014), where individual empathetic utilities are defined by taking into account local relationships with neighbors in a social network. However, Salehi-Abari and Boutilier (2014) do not specify how to evaluate such numerical relationships, since they focus on the computational aspects of scaling up to large networks of friends. While the role of personality has been addressed before in the literature to improve the performance of RSs (Nunes and Hu, 2012; Hu and Pu, 2011), to the best of our knowledge this is the first attempt to introduce personality factors in group decision making through the use of personality-based utility functions. The only relevant approach is the one of Quijano-Sanchez et al. (2013), where the personality of every individual in the group is evaluated in terms of conflict resolution styles. The user's ranking of an item is modified by considering all the pairs of users and their mutual influence.
This paper is organized as follows. In Section 2,
we introduce the Five-Factor Model used to evaluate
users’ agreeableness personality trait values, while,
in Section 3, we introduce the utility function used
to model altruistic behaviors and the evaluation, by
personality tests, of parameters that characterize such
function. In Section 4, we present the developed ap-
plication and the metrics used to evaluate with real
users the proposed utility function with respect to
Least Misery (LM). Finally, results are discussed in
Section 5.
2 BIG-FIVES AND
PERSONALITY TRAITS
Research has shown that personality is a primary factor influencing human behavior. The Five-Factor Model (FFM) describes human personality using five factors, also known as the Big-Five (McCrae and Costa, 1987). Neuroticism represents emotional instability, characterized by negative emotions like fear, anger, sadness and low self-esteem (McCrae and Costa, 1989). People with a high level of neuroticism rarely control their impulses or cope well with stress (McCrae and Costa, 1985). Extraversion is an indicator of assertiveness and trust. Extraverted people easily create interpersonal relationships and love working and being together with others (Costa and McCrae, 1995). Agreeableness describes the level of sympathy, availability and cooperativeness; it measures how nice and altruistic a person is (Costa and McCrae, 1995). People with a low level of this factor are competitive, skeptical and antagonistic. Openness represents the inclination towards new experiences, an active imagination and a preference for seeking new ideas (Costa and McCrae, 1995). Closed people are less flexible and rarely understand others' points of view. Finally, conscientiousness describes how responsible, disciplined and dutiful an individual is (McCrae and Costa, 1989).
Several studies have also demonstrated that personality traits have an impact on group decision making. Anderson et al. (2001) tried to correlate the Big-Five with the status¹ of college students. A high level of status implies a considerable level of attention from others towards the student. Furthermore, status relates to respect and influence in the social context. The authors measured students' status through an assessment made by the students themselves and according to the positions that they hold in the college organization. Results showed that extraversion has a positive correlation with status; conversely, neuroticism has a negative correlation, but only for men. The latter result can be explained by the observation of Brody (2000) that sadness, depression, fear, shame and embarrassment (neuroticism features) are viewed as “unmanly”. Thus, men who show these emotions are evaluated more negatively than women, for whom these emotions are considered ordinary.
Sager and Gastil (2006) studied the impact of the Big-Five in producing supportive communication. The latter is opposed to defensive communication, in which group members see others as a threat and try to stay one step ahead of them. In supportive communication, instead, all the members allow others to express their own opinions and consider other choices. Extraversion, agreeableness and openness have a positive correlation in this context.
¹ Status describes the role that a person has in a social group.
Ma (2005) studied the influence of personality in conflict situations that arise during negotiations. He correlates conflict resolution styles, which represent the different reactions of people in case of conflict, with the Big-Five by using two factors: assertiveness, which describes how much an individual tries to satisfy his/her own needs, and cooperativeness, which indicates the level of collaboration and the intention of maximizing others' utilities. Results showed that: neuroticism is negatively correlated with compromising; extraversion is positively correlated with competing and collaborating; agreeableness is negatively correlated with competing and positively with compromising; finally, conscientiousness and openness have no significant correlations.
Starting from the previous considerations, in this work we decided to take a first step by considering a single personality trait at a time, starting from the agreeableness trait and leaving the combination with the others as future work. We decided not to rely on the evaluation of neuroticism, which has negative correlations with status (a neurotic person could be excluded during a decision process). Conversely, extraverts could be viewed as leaders, but at the same time they are more competitive than collaborative: all of these are qualities that are relevant to our case (the choice of a movie), but they are already considered in related works. A conscientious person could have a positive impact when making important decisions, but we do not believe this is significant for our specific goal. The same holds for open individuals, who could agree to see a movie that does not match their preferences, but whose trait could be more difficult to model than the agreeableness factor.
We believe that, in choosing a movie in a group of close friends, agreeableness (which is related to altruistic behavior (Costa and McCrae, 1995)) plays an important role. Unless all the members have the same movie tastes, someone will have to give up the desire to see a movie that the others do not like. Agreeable people will therefore make compromises and, for this reason, we decided to analyze the impact of this factor in a GRS. People with a high value of agreeableness are, by the very definition of the trait, more altruistic: they care about the satisfaction of the entire group, or are more willing to compromise in order to obtain a solution that works for the whole group.
3 A UTILITY FUNCTION FOR
AGREEABLENESS
Generally speaking, the aim of a Recommendation System (RS) is to predict the relevance and the importance of items (for example movies, restaurants and so on) that the user has never evaluated. More formally, given a set of $n$ users ($U = \{1, \dots, n\}$) and a set of $m$ items ($M = \{1, \dots, m\}$; in our application domain, a set of movies), an individual recommendation system aims at building, for each user $i$, a Preference Profile over the complete set $M$, starting from some initial ratings (typically a value from 1 to 5) that each user provides on some elements of $M$. Once each user $i \in U$ has a preference profile $\succ_i$ over $M$ ($\succ_i = \{x_{i,1}, \dots, x_{i,m}\}$), with $x_{i,j} \in \mathbb{R}$ representing the rank given by user $i$ to movie $j$ (as directly expressed by the user or as evaluated by the system), the goal of a GRS is to obtain $\succ_U = \{x_{U,1}, \dots, x_{U,m}\}$, where $x_{U,j}$ is the corresponding ranking for movie $j$, as evaluated for the group. Typically, this is obtained by implementing a social choice function $SC : \succ^n \rightarrow \succ_U$ that aggregates all the preference profiles into $\succ_U = \{x_{U,1}, \dots, x_{U,m}\}$. Note that our goal is not to guess the exact value $x_{U,j}$ that the whole group would assign to movie $j$, but to properly select the $k$ best movies in the group preference profile (the ones with the highest rating) and suggest them to the group.
In this work, we propose to consider the agreeableness factor in the process of building recommendations for groups. Instead of defining a specific social choice function that considers such a factor, the proposed solution relies on the definition of an individual utility function (used to build up the preference profile) that evaluates the rating of items for each user while taking into account the whole group. Such a utility function can be interpreted as “the user satisfaction if the recommender system chooses that item for the group”. For the utility function, we used a model developed by Charness and Rabin (2002).
Let us consider a group of $n$ users, each of whom has a rating for each element $j$ of the domain (e.g., an expected rating $x_{i,j}$ provided by an individual RS). The utility function is composed of two terms. The first concerns a “disinterested” social-welfare criterion, defined as follows:
$$W(x_{1,j}, x_{2,j}, \dots, x_{n,j}) = \delta \cdot \min(x_{1,j}, x_{2,j}, \dots, x_{n,j}) + (1 - \delta) \cdot (x_{1,j} + x_{2,j} + \dots + x_{n,j}) \tag{1}$$
where $\delta \in [0, 1]$ and $x_{1,j}, x_{2,j}, \dots, x_{n,j}$ are called payoffs: they are the individual ratings of the $j$-th item by each of the $n$ users. The aim of the first addend is to reduce inequity (by helping the worst-off person); indeed, this value increases proportionally to the minimum payoff. This term essentially corresponds to the LM technique, because the utility is given by the minimum satisfaction among the users. The second addend is responsible for the maximization of the social welfare and increases proportionally to the sum of the individual payoffs. Setting $\delta = 1$, the function cares only about inequity, just like LM. Setting $\delta = 0$, instead, the function focuses on global satisfaction.
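As a small illustration of Equation 1, the following sketch evaluates the disinterested social-welfare term for a three-member group at the two extremes of $\delta$ and at the midpoint. The function name and the sample payoffs are hypothetical.

```python
def social_welfare(payoffs, delta=0.5):
    """Disinterested social welfare W of Equation 1.

    `payoffs` is the list of the members' ratings for one item and
    `delta` in [0, 1] weights inequity aversion against the sum of payoffs.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * min(payoffs) + (1 - delta) * sum(payoffs)


payoffs = [2, 4, 5]                        # illustrative ratings of one item
print(social_welfare(payoffs, delta=1.0))  # only the worst-off member counts
print(social_welfare(payoffs, delta=0.0))  # only the sum of payoffs counts
print(social_welfare(payoffs, delta=0.5))  # equal weight to both terms
```

Because the second addend is a sum rather than a mean, its magnitude grows with the group size, so for larger groups it tends to dominate the minimum unless $\delta$ is close to 1.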
The utility of user $i$ for item $j$ is a weighted sum of the disinterested social welfare $W$ and the user's own payoff, defined as follows:
$$U_{i,j}(x_{1,j}, x_{2,j}, \dots, x_{n,j}) = (1 - \lambda_i) \cdot x_{i,j} + \lambda_i \cdot W(x_{1,j}, x_{2,j}, \dots, x_{n,j}) \tag{2}$$
where $\lambda_i \in [0, 1]$ expresses how much a person pursues his/her own interest or the social welfare. With $\lambda_i = 0$, the user takes care only of his/her personal interest. $\lambda_i = 1$ represents the classical disinterested behavior of someone who does not take part in the group decision, moving the weight of the function onto the satisfaction of the entire group (including his/her own). Hence, according to Charness and Rabin (2002), the considered function evaluates both the personal and the group satisfaction, depending on an altruistic factor $\lambda_i$. Once the new item utilities are evaluated, the goal is to recommend the movies that maximize the social welfare.
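A minimal sketch of Equation 2 is given below, assuming the payoffs of one item are stored in a list indexed by user position. The function name and the sample values are hypothetical.

```python
def utility(i, payoffs, lam, delta=0.5):
    """Utility U_{i,j} of Equation 2 for the user at position `i`.

    `payoffs` holds the ratings x_{1,j}..x_{n,j} of one item, `lam` is the
    altruism factor lambda_i in [0, 1] and `delta` is the weight of Equation 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    welfare = delta * min(payoffs) + (1 - delta) * sum(payoffs)  # Equation 1
    return (1 - lam) * payoffs[i] + lam * welfare


payoffs = [2, 4, 5]
print(utility(0, payoffs, lam=0.0))  # selfish user: own rating only
print(utility(0, payoffs, lam=1.0))  # fully altruistic user: W only
print(utility(0, payoffs, lam=0.5))  # halfway between the two
```

The intermediate case shows how a user who rated the item poorly (2) can still obtain a higher utility (4.25) when the rest of the group likes it.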
Since the agreeableness factor is positively correlated with altruistic behavior, our goal here is to calibrate the $\lambda_i$ parameter with respect to the individual agreeableness levels, so that the most altruistic users have a high value and vice versa.
3.1 Personality Test
Personality can be acquired in both explicit and implicit ways (Dunn et al., 2009). Explicit approaches measure a user's personality by asking the user to answer a list of purpose-designed personality questions. These personality evaluation questionnaires are well established in the psychology field (Gosling et al., 2003). Implicit approaches acquire user information by observing users' behavioral patterns. Typically, users prefer explicit personality acquisition interfaces (Dunn et al., 2009). However, implicit methods require less effort from users. In our study, we adopted the explicit way to measure users' personality, because the evaluation of a single personality trait requires only a small set of questions, and thus minimal effort from the user.
There are several questionnaires to predict a person's Big-Five factors. Some of these consist of many questions, in certain cases some hundreds. Questionnaires that are too long can reduce people's attention level. Many research groups have tried to cut the number of questions while preserving accuracy. The best-known short questionnaires are the NEO-FFI (NEO Five Factor Inventory), composed of 60 questions (Costa and McCrae, 1992), the 50-item IPIP-FFM (International Personality Item Pool - Five Factor Model) (Goldberg, 1992), the 44-item BFI (Big Five Inventory), the TIPI (Ten-Item Personality Inventory) (Gosling et al., 2003) and, finally, the Mini-IPIP (Mini International Personality Item Pool) (Donnellan et al., 2006). We chose the last one because it is very short, but at the same time effective. It consists of 20 questions, 4 for each personality factor. Since we focus only on the evaluation of the user's agreeableness, we extracted the 4 related questions (see Table 1; questions should be read in the first person). The answer to each question can be a number from 1 to 5, where 1 means “very inaccurate” and 5 “very accurate”.
Table 1: Mini-IPIP questionnaire for agreeableness evaluation.
#  Text
1  Sympathize with others’ feelings.
2  Am not interested in other people’s problems. (R)
3  Feel others’ emotions.
4  Am not really interested in others. (R)
(R) = Reverse Scored Item.
3.2 Agreeableness and Group
Recommendation
Let us assume that each user has completed the agreeableness evaluation test. The generic user $i$ has a personal value $\alpha_i \in [1, 5]$, obtained by averaging the individual answer values. Since the $\lambda_i$ parameter of Equation 2 should belong to the range $[0, 1]$ and should depend directly on $\alpha_i$, we defined it as follows:
$$\lambda_i = \frac{\alpha_i - 1}{4}. \tag{3}$$
This way, when $\alpha_i = 5$ (high level of agreeableness), $\lambda_i = 1$, so the user cares only about the group satisfaction. When $\alpha_i = 1$, instead, $\lambda_i = 0$; consequently, the user's utility depends only on his/her personal satisfaction. As already explained, Equation 1 contains the $\delta$ parameter, which weights inequity aversion against global satisfaction. We set it equal to 0.5 in order to give the same importance to both goals. The utility function of Equation 2 refers to the utility of a particular item for a specific user. Thus, the preferences $x_{1,j}, x_{2,j}, \dots, x_{n,j}$ stand for the rating predictions of a generic item $j$ for the $n$ members of the group.
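The mapping from questionnaire answers to $\lambda_i$ can be sketched as follows. The text only states that $\alpha_i$ is the average of the answer values; treating the items marked (R) in Table 1 as reverse scored (6 minus the answer) before averaging is an assumption of this sketch, as are the function names.

```python
REVERSED_ITEMS = {2, 4}  # items marked (R) in Table 1


def agreeableness(answers, reverse_scoring=True):
    """Average the four answers (each in 1..5) into alpha_i.

    `answers` maps the item number of Table 1 to the user's answer.
    Reverse scoring of the (R) items is an assumption of this sketch.
    """
    scored = []
    for item, value in sorted(answers.items()):
        if not 1 <= value <= 5:
            raise ValueError("answers must lie in 1..5")
        if reverse_scoring and item in REVERSED_ITEMS:
            value = 6 - value
        scored.append(value)
    return sum(scored) / len(scored)


def altruism_weight(alpha):
    """Equation 3: map alpha_i in [1, 5] onto lambda_i in [0, 1]."""
    return (alpha - 1) / 4


alpha = agreeableness({1: 5, 2: 1, 3: 4, 4: 2})
print(alpha, altruism_weight(alpha))
```

With these answers the user scores $\alpha_i = 4.5$ and therefore $\lambda_i = 0.875$, i.e., a strongly altruistic profile.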
To build recommendations for the groups, we chose the merging recommendations technique. First, the system creates, for each user $i$, a list $L_i$ of 10 items evaluated by an RS. Then, it merges all the lists into a single one, which we call $L$:
$$L = \bigcup_{i \in U} L_i \tag{4}$$
where $U$ is the set of members of the group. Our GRS computes, for each user, the rating predictions of all the items in $L$. These are exactly the $x_{1,j}, x_{2,j}, \dots, x_{n,j}$ parameters of Equation 2 (where $n = |U|$ and $j$ is the movie). $U_{i,j}$ is then computed for each user $i$ and each item $j$. Finally, $x_{U,j}$ denotes the utility of the group if the GRS chooses item $j$, defined as follows:
$$x_{U,j} = \sum_{i \in U} U_{i,j}. \tag{5}$$
Since our goal is to maximize the social welfare, the system recommends the 10 items with the highest $x_{U,j}$ value.
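The whole procedure of Equations 2, 4 and 5 can be summarized in a short sketch. The `predict` callable is a hypothetical stand-in for the individual RS, and the data structures, names and tie-breaking rule are assumptions of this example rather than the implemented system.

```python
def recommend_for_group(predict, top_items, lambdas, delta=0.5, k=10):
    """Rank the merged candidate items by group utility x_{U,j}.

    predict   -- callable (user, item) -> predicted rating (hypothetical RS)
    top_items -- dict user -> list L_i of that user's recommended items
    lambdas   -- dict user -> altruism factor lambda_i
    """
    users = list(top_items)
    merged = set().union(*top_items.values())            # Equation 4
    scores = {}
    for j in merged:
        payoffs = {i: predict(i, j) for i in users}
        welfare = (delta * min(payoffs.values())
                   + (1 - delta) * sum(payoffs.values()))  # Equation 1
        scores[j] = sum((1 - lambdas[i]) * payoffs[i] + lambdas[i] * welfare
                        for i in users)                    # Equations 2 and 5
    # Highest group utility first; ties broken by item name (assumption).
    return sorted(scores, key=lambda j: (-scores[j], str(j)))[:k]


table = {("ann", "A"): 5, ("ann", "B"): 3, ("bob", "A"): 1, ("bob", "B"): 2.5}
lists = {"ann": ["A"], "bob": ["B"]}
lookup = lambda user, item: table[(user, item)]
print(recommend_for_group(lookup, lists, {"ann": 0.0, "bob": 0.0}, k=1))
print(recommend_for_group(lookup, lists, {"ann": 1.0, "bob": 1.0}, k=1))
```

In this toy case two selfish users end up with item A, which has the larger sum of ratings, while two fully altruistic users end up with item B, whose worst rating is higher.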
4 EXPERIMENTAL STUDY
To conduct our experiments, we developed a client-server application. The client was an Android² app and the server was developed in Java³ using the Spring framework⁴, hosted by the Tomcat Servlet Engine⁵.
4.1 Movie Recommendation Server
We built a JSON-based RESTful web service in order to communicate with the Android app. We adopted the Apache Mahout library⁶ to predict the user ratings, and chose the MovieTweetings (Dooms et al., 2013) dataset to train the system and to populate the Ratings Repository. MovieTweetings consists of movie ratings contained in well-structured tweets on the Twitter.com social network. This information is contained in three files: users.dat, ratings.dat and movies.dat, which provide, respectively, the user identification numbers, the associated ratings and a list of movies. The dataset is updated every day; therefore, its size is constantly changing. At the last access, it contained about 35,000 users, 360,000 ratings and 20,000 movies.
² https://www.android.com
³ https://www.java.com/
⁴ https://spring.io
⁵ http://tomcat.apache.org
⁶ http://mahout.apache.org/
The recommendation engine provides rating predictions when the recommendation API is invoked. To achieve this goal, we used an item-based approach with the City Block distance, also known as Manhattan distance. In the Mahout implementation, the generic movie $j$ is represented by a boolean vector:
$$j = [x_{1,j}, x_{2,j}, \dots, x_{k,j}], \tag{6}$$
where $k$ is the number of users in the dataset and $x_{i,j} = 1$ if user $i$ rated the movie $j$ (and 0 otherwise). The distance between two movies is the sum, over all users $i$, of the absolute values of the differences between the corresponding components of the two vectors. More formally, the distance between items $j$ and $h$ is:
$$d(j, h) = \sum_{i=1}^{k} |x_{i,j} - x_{i,h}|. \tag{7}$$
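Equation 7 can be illustrated with a plain-Python sketch on two small boolean vectors. This is only an illustration of the formula, not the Mahout code; the function name and the vectors are hypothetical.

```python
def city_block_distance(item_j, item_h):
    """Manhattan distance of Equation 7 between two boolean item vectors.

    Each vector has one component per user: 1 if the user rated the movie,
    0 otherwise.
    """
    if len(item_j) != len(item_h):
        raise ValueError("vectors must have one component per user")
    return sum(abs(a - b) for a, b in zip(item_j, item_h))


movie_j = [1, 0, 1, 1]  # rated by users 1, 3 and 4
movie_h = [1, 1, 0, 1]  # rated by users 1, 2 and 4
print(city_block_distance(movie_j, movie_h))
```

On boolean vectors the distance simply counts the users who rated exactly one of the two movies (here users 2 and 3).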
4.2 Android Application
An Android application was developed in order to gather the information needed by the server to provide recommendations to single users. In order to simplify the operations, the experiment consists of sequential steps, so that each phase unlocks the next one.
The first task for the user, when he/she accesses the application, is to sign up or sign in to the system. The user signs up by entering a username, password, gender, age, and education level. When the interaction starts, users first have to provide a certain number of movie ratings (for at least 20 movies), with a value in the range [1, 10] (see Figure 1-left), in order to define their profile. The user is provided with an interface to get movie lists and to store movie ratings. During this training stage, the user can browse movies ordered by most rated or best rated, or search for a specific movie (filtering by genre or title).
Once the user has rated twenty movies, the app automatically shows the personality questionnaire (see Figure 1-right). It consists of four questions, as previously explained.
After this first stage, a user can get movie recommendations from the server. When the server receives the recommendation request, it calculates the best movies for the user and then retrieves additional details about each film, such as the director, writers, actors and genres, using the OMDb⁷ web service. Fortunately, the MovieTweetings dataset stores, for each movie, its IMDb id, which can be used to query the OMDb service. The Android application shows the recommendations for the user on the screen through textual and graphical descriptions.
⁷ http://www.omdbapi.com - The Open Movie Database is a free web service to obtain movie information.
Figure 1: Movie rating interface (left) and first question of
personality test (right).
4.3 Methodology
The design of this study is a within-subjects, counterbalanced, repeated measures experiment. The goal of our study is to compare the proposed technique, described in the previous sections, with Least Misery. We selected the latter because it achieves good performance, especially for small groups (O’Connor et al., 2001).
Once the questionnaire is completed, a user can start the test by completing the following steps. First, the user creates a group, giving it a name, and adds one or more members to it using their usernames. The system then recommends a list of 10 movies (see Figure 2-left). In order to generate this list, each technique (the proposed utility function and LM) is used by the GRS to recommend 10 movies ordered by rating. To merge them into one list of ten items, we developed an iterative algorithm that, at each step, adds one item from each list (starting from the items with the highest rating) to the output set. If the current item is already in the set, the algorithm skips to the next one, and so on. From this list, the group has to collectively choose three movies that they would like to see together (see Figure 2-left). Finally, the group has to sort the three selected movies (see Figure 2-right) in their joint preference order.
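The iterative merging step described above can be sketched as a round-robin interleaving of the two ranked lists. The function name is hypothetical, and the order in which the two lists are visited at each step is an assumption, since the text does not specify it.

```python
def interleave(list_a, list_b, size=10):
    """Merge two ranked lists by alternately taking the next unseen item.

    Both lists are assumed to be sorted from the highest to the lowest rating.
    Items already in the output are skipped, and the merge stops when `size`
    items are collected or both lists are exhausted.
    """
    merged = []
    iterators = [iter(list_a), iter(list_b)]
    exhausted = [False, False]
    while len(merged) < size and not all(exhausted):
        for idx, it in enumerate(iterators):
            if len(merged) >= size:
                break
            for item in it:
                if item not in merged:
                    merged.append(item)
                    break
            else:
                # The inner loop ended without adding anything: list used up.
                exhausted[idx] = True
    return merged


utility_list = ["m1", "m2", "m3"]
lm_list = ["m2", "m4", "m1"]
print(interleave(utility_list, lm_list, size=4))
```

In the example, the second step skips `m2` in the first list because the other technique already contributed it, so each technique still places two distinct movies in the output.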
4.4 Evaluation Metrics
Since groups select their best 3 movies from a set of 10, we considered the following metrics in order to evaluate and compare our method with LM.
Figure 2: Recommended movies (left) and sorting page (right).
precision@3: is the ratio between the number of movies guessed by the GRS (using a specific method) and the sum of the latter and the remaining movies (only considering the first three movies). If $G$ is the set of the groups that participated in the experiment, and $I_g$ and $P_g$ represent, respectively, the 3 movies selected by group $g$ and the 3 movies with the highest prediction, then:
$$\text{precision@3} = \frac{1}{|G|} \sum_{g \in G} \frac{|I_g \cap P_g|}{|I_g \cap P_g| + (3 - |I_g \cap P_g|)} \tag{8}$$
nDCG@3: evaluates the ranking of predicted movies with respect to the real ranking $j$ chosen by groups.
$$\text{nDCG@3} = \frac{1}{|G|} \sum_{g \in G} \sum_{j=1}^{3} \frac{rel_{g i_j}}{\max(1, \log_2 j)} \tag{9}$$
where
$$rel_{g i_j} = \begin{cases} 1 & \text{if } i_j \in I_g \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
x success@3: for each group, it is 1 if the algorithm guessed at least $x$ movies among the 3 selected by the group, with $1 \le x \le 3$:
$$x\ \text{success@3} = \frac{1}{|G|} \sum_{g \in G} x\ \text{success@3}_g \tag{11}$$
where
$$x\ \text{success@3}_g = \begin{cases} 1 & \text{if } |I_g \cap P_g| \ge x \\ 0 & \text{otherwise} \end{cases} \tag{12}$$
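The three metrics can be sketched for a single group as follows, with the average over $G$ left to a small helper. The sketch follows Equations 8 to 12 as written; interpreting $i_j$ as the movie at position $j$ of the predicted list is an assumption, and all names are hypothetical.

```python
from math import log2


def precision_at_3(selected, predicted):
    """Per-group term of Equation 8: hits among the top-3 predictions."""
    hits = len(set(selected) & set(predicted[:3]))
    return hits / (hits + (3 - hits))


def ndcg_at_3(selected, predicted):
    """Per-group term of Equation 9, following the formula as written."""
    return sum((1 if movie in selected else 0) / max(1, log2(position))
               for position, movie in enumerate(predicted[:3], start=1))


def x_success_at_3(selected, predicted, x):
    """Equation 12: 1 if at least x of the top-3 predictions were selected."""
    return 1 if len(set(selected) & set(predicted[:3])) >= x else 0


def average_over_groups(metric, groups, **kwargs):
    """Average a per-group metric over (selected, predicted) pairs."""
    return sum(metric(s, p, **kwargs) for s, p in groups) / len(groups)


chosen = ["a", "b", "c"]          # movies selected by one group
top_predicted = ["a", "x", "c"]   # top-3 movies of one technique
print(precision_at_3(chosen, top_predicted))
print(ndcg_at_3(chosen, top_predicted))
print(x_success_at_3(chosen, top_predicted, x=2))
```

Since the denominator of Equation 8 always equals 3, precision@3 for a group is simply the number of guessed movies divided by three.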
5 RESULT ANALYSIS
Experiments lasted about two weeks and, as summarized in Table 2, we recruited 68 users (48 groups) with an average age of 27 years, most of whom were students.
Table 2: User stats.
Number of users 68
Average age 26.9
Minimum age 14
Maximum age 55
Males 45
Females 23
Middle school 1
High school 9
Bachelor students - undergraduate 26
Bachelor students - graduate 11
Master students - undergraduate 11
Master students - graduate 10
Group members were directly selected by one of the users, as described above, and the groups are not necessarily disjoint (i.e., some users joined more than one group). A single group was allowed to take the test only once. The number of groups considered was 48, with an average of 2.6 members per group. Table 3 reports the number of groups considered in the experiments for each group size.
Table 3: Group stats.
# members Amount
2 22
3 23
4 2
5 1
Total 48
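As a quick consistency check on Table 3, the average group size can be recomputed from the reported counts:

```latex
\frac{2 \cdot 22 + 3 \cdot 23 + 4 \cdot 2 + 5 \cdot 1}{48}
  = \frac{44 + 69 + 8 + 5}{48}
  = \frac{126}{48}
  = 2.625 \approx 2.6
```

The total of 126 memberships exceeds the 68 recruited users, which is consistent with some users having joined more than one group.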
When comparing the two techniques, we analyzed their results separately for the groups of two members and for the groups of more than two members. The main reason for this choice is that about half of the groups were composed of two members (see Table 3), and it is known that LM shows its best performance in this case.
precision@3. Results of precision@3 are summarized in Figure 3. The charts show that LM performs better on two-member groups. This is not a surprise, because LM excels in cases like this. The ANOVA test indicates that the difference between the two techniques is only marginally significant (F = 3.076, p-value = 0.09).
For groups with more than two members, LM is once again better than our technique, but the difference in this case is too small to be significant according to the ANOVA test (F = 0.11, p-value = 0.74).
Figure 3: precision@3.
nDCG@3. As for the previous metric, LM outperforms the proposed technique (see Figure 4), but not significantly according to ANOVA (F = 1.547, p-value = 0.22 for the two-member groups and F = 0.056, p-value = 0.81 for the others).
Figure 4: nDCG@3.
1 success@3. We recall that 1 success@3 counts the number of times that the algorithm guessed at least one movie among the three selected by the group. LM is always better than our technique (see Figure 5), but the difference is significant only when all the 48 groups are considered together (F = 3.847, p-value = 0.05).
Figure 5: 1 success@3.
2 success@3. In this case, our method was more accurate in groups with more than two members (see Figure 6). Unfortunately, ANOVA shows that these differences are not statistically significant (F = 0.305, p-value = 0.58).
Figure 6: 2 success@3.
3 success@3. It is very rare for a technique to guess exactly the three movies chosen by the group. However, in some cases, it happened. Once again, LM wins, but with no significant differences.
5.1 Discussion
Group recommendation systems lack appropriate datasets on which to be evaluated. User studies, on the other hand, have the disadvantage of being expensive to conduct, both in terms of recruiting the proper users and of engaging them when they are volunteers. In this sense, finding an appropriate number of groups of varying sizes is challenging.
As expected, for two-user groups LM is the best choice. In the other cases, we cannot say the same, since the two methods are comparable and show a similar performance. The Pearson correlation, evaluated on the distributions of the results, shows that the two techniques are not linearly dependent because, in most cases, its value is near zero. This result means that, in some circumstances, the proposed utility function shows a better performance than LM, and in other cases the opposite occurs. Therefore, in certain groups, users do care about minimizing the group dissatisfaction (i.e., the LM goal) regardless of the agreeableness value. Since we fixed the δ value to 0.5, with the aim of giving the same weight to the least misery and social welfare components of Equation 1, a better tuning of this parameter would result in choices that differ more from those of LM.
Moreover, we think these first results are affected
by how we conducted the experiments. As already
said, the 10 movies recommended to the groups are
obtained by a sort of union of two lists independently
generated by the two different techniques. Thus, this
set contains movies chosen by both LM and the
proposed utility function. A between-subject
experiment (assigning the result of a single technique
to each group) might lead to different results.
Moreover, the low number of experiments had a
relevant impact on the significance analysis of the
results. Finally, since most of the groups were
composed of two or three members, we foresee that,
with larger groups, our technique could have obtained
better results.
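The paragraph above describes the candidate set only as "a sort of union" of the two lists, so the exact merging rule is not specified here. The following Python sketch shows one possible way to build such a set, assuming an alternating merge that skips duplicates; the function name, the merge rule, and the movie identifiers are all assumptions.

```python
from itertools import zip_longest
from typing import List, Sequence


def merge_rankings(first: Sequence[str], second: Sequence[str],
                   size: int = 10) -> List[str]:
    """Alternate between two ranked lists, skipping duplicates,
    until `size` distinct items have been collected."""
    merged: List[str] = []
    seen = set()
    for a, b in zip_longest(first, second):
        for item in (a, b):
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
                if len(merged) == size:
                    return merged
    return merged


# Hypothetical top lists from the two techniques.
lm_list = ["a", "b", "c"]
utility_list = ["b", "d", "e"]

candidates = merge_rankings(lm_list, utility_list, size=4)
```

Under this assumed rule a movie ranked by both techniques appears only once, which is why the final set mixes items from both and cannot be attributed to a single technique.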
6 CONCLUSIONS AND FUTURE
WORK
In this work, we introduced a new method to predict
which items are suitable for groups of users, taking
into account users’ personality. In particular, we
evaluated the role of the agreeableness factor (i.e.,
one of the factors of the FFM) in weighing the
importance of a user’s own gain with respect to the
global satisfaction.
To evaluate this approach, we conducted a pilot
user study on movie recommendations, where we
compared the results of the proposed approach with
those of a Least Misery (LM) strategy. Results showed
that for small groups LM performs slightly better.
In particular, for two-person groups LM is the best
choice; in the other cases the two methods are
comparable and show a similar performance. We
foresee that our utility function will become more
effective as the group size grows: the larger the
group, the more altruism will count in the final
decision. Hence, in future work, we should try to
encourage users to create larger groups in order to
better test our hypothesis.
Finally, we could study how to use other person-
ality factors to build another utility function. We saw
that extraversion is correlated with the leadership of
a group, so it could be critical in a decision process.
Furthermore, openness could also be decisive in such
cases, because it could have the same weight as
agreeableness in our function. Open people are glad
to try new experiences, so they could agree to view a
movie for which the recommender system predicts a
low value for them but a high value for the other
members.
REFERENCES
Anderson, C., John, O. P., Keltner, D., and Kring, A. M.
(2001). Who attains social status? effects of personal-
ity and physical attractiveness in social groups. Jour-
nal of personality and social psychology, 81(1):116.
Brody, L. R. (2000). The socialization of gender differences
in emotional expression: Display rules, infant temper-
ament, and differentiation. Gender and emotion: So-
cial psychological perspectives, pages 24–47.
Charness, G. and Rabin, M. (2002). Understanding social
preferences with simple tests. Quarterly journal of
Economics, pages 817–869.
Costa, P. T. and McCrae, R. R. (1992). Revised NEO Per-
sonality Inventory (NEO PI-R) and NEO Five-Factor
Inventory (NEO FFI): Professional Manual. Psycho-
logical Assessment Resources.
Costa, P. T. and McCrae, R. R. (1995). Primary traits
of Eysenck’s PEN system: three- and five-factor solu-
tions. Journal of personality and social psychology,
69(2):308.
Donnellan, M. B., Oswald, F. L., Baird, B. M., and Lucas,
R. E. (2006). The mini-ipip scales: tiny-yet-effective
measures of the big five factors of personality. Psy-
chological assessment, 18(2):192.
Dooms, S., De Pessemier, T., and Martens, L. (2013). Movi-
etweetings: a movie rating dataset collected from
twitter. In Workshop on Crowdsourcing and Human
Computation for Recommender Systems, CrowdRec at
RecSys 2013.
Dunn, G., Wiersema, J., Ham, J., and Aroyo, L. (2009).
Evaluating interface variants on personality acquisi-
tion for recommender systems. In Houben, G.-J.,
McCalla, G., Pianesi, F., and Zancanaro, M., edi-
tors, User Modeling, Adaptation, and Personalization,
volume 5535 of Lecture Notes in Computer Science,
pages 259–270. Springer Berlin Heidelberg.
Gartrell, M., Xing, X., Lv, Q., Beach, A., Han, R., Mishra,
S., and Seada, K. (2010). Enhancing group recom-
mendation by incorporating social relationship inter-
actions. In Proc. of the 16th ACM International Con-
ference on Supporting Group Work, pages 97–106.
ACM.
Goldberg, L. R. (1992). The development of markers for the
big-five factor structure. Psychological assessment,
4(1):26.
Gosling, S. D., Rentfrow, P. J., and Swann, W. B. (2003).
A very brief measure of the big-five personality do-
mains. Journal of Research in personality, 37(6):504–
528.
Hu, R. and Pu, P. (2011). Enhancing collaborative filtering
systems with personality information. In Proceedings
of the Fifth ACM Conference on Recommender Sys-
tems, RecSys ’11, pages 197–204. ACM.
Ma, Z. (2005). Exploring the relationships between the big
five personality factors, conflict styles, and bargaining
behaviors. In IACM 18th Annual Conference.
Masthoff, J. (2011). Group recommender systems: Com-
bining individual models. In Recommender Systems
Handbook, pages 677–702.
McCrae, R. R. and Costa, P. T. (1985). Updating Nor-
man’s “adequacy taxonomy”: Intelligence and per-
sonality dimensions in natural language and in ques-
tionnaires. Journal of personality and social psychol-
ogy, 49(3):710.
McCrae, R. R. and Costa, P. T. (1987). Validation of the
five-factor model of personality across instruments
and observers. Journal of personality and social psy-
chology, 52(1):81.
McCrae, R. R. and Costa, P. T. (1989). The structure of in-
terpersonal traits: Wiggins’s circumplex and the five-
factor model. Journal of personality and social psy-
chology, 56(4):586.
Nunes, M. A. S. and Hu, R. (2012). Personality-based rec-
ommender systems: An overview. In Proceedings of
the Sixth ACM Conference on Recommender Systems,
RecSys ’12, pages 5–6, New York, NY, USA. ACM.
O’Connor, M., Cosley, D., Konstan, J. A., and Riedl, J.
(2001). PolyLens: a recommender system for groups
of users. In ECSCW 2001, pages 199–218. Springer.
Quijano-Sanchez, L., Recio-Garcia, J. A., Diaz-Agudo, B.,
and Jimenez-Diaz, G. (2013). Social factors in group
recommender systems. ACM Trans. Intell. Syst. Tech-
nol., 4(1):1–30.
Rossi, S., Caso, A., and Barile, F. (2015). Combining
users and items rankings for group decision support.
In Trends in Practical Applications of Agents, Multi-
Agent Systems and Sustainability, volume 372 of Ad-
vances in Intelligent Systems and Computing, pages
151–158. Springer International Publishing.
Sager, K. L. and Gastil, J. (2006). The origins and con-
sequences of consensus decision making: A test of
the social consensus model. Southern Communication
Journal, 71(1):1–24.
Salehi-Abari, A. and Boutilier, C. (2014). Empathetic social
choice on social networks. In 13th International Con-
ference on Autonomous Agents and Multiagent Sys-
tems, pages 693–700.
ICAART 2016 - 8th International Conference on Agents and Artificial Intelligence