this limitation in coverage inherited by social
algorithms, benefiting at the same time from the
accuracy of social-based recommendations not
sufficiently supported by collaborative filtering
methods. Similarly, the work introduced in (He and
Chu, 2010) shows that a collaborative
recommendation system benefits from the social
annotations and friendships established among users,
items and tags. Only the approaches presented in (Kang
et al., 2013) and (Sohn et al., 2013) use degree
centrality as an SNA measurement alongside
content-based filtering with the FOAF (Friend of a
Friend) ontology to compute the centrality of each tag
and the degree of importance of a particular user,
respectively, and recommend content on that basis.
 
In (Shokouhi, 2013), a personalized auto-completion
ranker is presented which takes into consideration
demographic-based features, i.e., age, gender and
location, extracted from the Microsoft Live profiles
of users searching via Bing. Results on the
effectiveness of the ranker before and after
personalization (re-ranking) show that demographic
features significantly improve ranking compared to
the baseline without re-ranking. Utilizing
user-specific data for improved query suggestion by
re-ranking the original results obtained by
traditional ranking approaches is not new and has
already been addressed by several studies. The authors
of (Wu et al., 2015) employ user-generated ratings and
comments on books in Amazon as helpful metadata
when suggesting books in social book search.
Further, in (Cheng and Cantú-Paz, 2010), a framework
for the personalization of click models in sponsored
search is presented which is based on user-specific and
demographic-based features that reflect the click
behavior of individuals and groups.
To the best of our knowledge, none of these
existing systems treats users as nodes in a
unimodal graph and analyzes them with SNA
techniques in a collaborative filtering (CF) approach
to recommend queries to a given user.
3  OUR APPROACH 
Our SNA-based approach to query recommendation
takes into account some personal attributes of users,
such as home city and gender, as well as the topics or
categories of their queries (e.g., politics or sports).
Social network analysis (SNA) metrics are applied to
the generated unimodal user-user network in order to
generate the similarity matrix.
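
As an illustration of this idea, the following minimal sketch builds a user-user similarity matrix from profile attributes and computes degree centrality on the resulting unimodal network. The profile fields, the sample users, and the choice of counting shared attributes as the edge weight are assumptions made for the example only; the exact weighting is not specified here, and authority centrality is not shown.

```python
from itertools import combinations

# Hypothetical user profiles; field names and values are illustrative only.
users = {
    "u1": {"gender": "f", "home_city": "Oslo", "category": "sports"},
    "u2": {"gender": "f", "home_city": "Oslo", "category": "politics"},
    "u3": {"gender": "m", "home_city": "Oslo", "category": "sports"},
    "u4": {"gender": "m", "home_city": "Bergen", "category": "travel"},
}


def shared_attributes(a, b):
    """Count the attributes on which two profiles agree (assumed edge weight)."""
    return sum(1 for key in a if a[key] == b.get(key))


def build_similarity_matrix(profiles):
    """Symmetric user-user matrix; entry [u][v] is the number of shared attributes."""
    matrix = {u: {v: 0 for v in profiles} for u in profiles}
    for u, v in combinations(profiles, 2):
        weight = shared_attributes(profiles[u], profiles[v])
        matrix[u][v] = matrix[v][u] = weight
    return matrix


def degree_centrality(matrix):
    """Normalized degree: fraction of the other users linked by a non-zero weight."""
    n = len(matrix)
    if n < 2:
        return {u: 0.0 for u in matrix}
    return {
        u: sum(1 for w in row.values() if w > 0) / (n - 1)
        for u, row in matrix.items()
    }


sim = build_similarity_matrix(users)
centrality = degree_centrality(sim)
# Users ordered from most to least central in the user-user network.
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy network, user u3 shares at least one attribute with every other user and therefore ranks first by degree centrality.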
 
 
3.1  System Architecture 
 
Figure 1: System architecture. 
In Figure 1, the architecture of our proposed
SNA-based query recommendation system is
depicted. As input, the system is supplied with the
following types of data: the user's social profile data
(e.g., gender and home city) and the query posted
by the user. Based on the input data, a similarity matrix
is generated which serves to find the users most similar
to the current user. After this step, if there is more
than one concurrent user, the users are ranked using
an SNA metric, either degree or authority centrality.
The final step is to search the query log for
queries whose keywords are most similar to those
submitted by the concurrent users. With respect to the
query of the current user, these queries are filtered
using the Jaccard similarity coefficient (Phillips, 2013).

Two datasets have been used in our proposed system.
The first dataset contains data from the AOL search
engine collected during three months of 2006. It
consists of the user ID in anonymized form (AnonID,
which is expected to be replaced by a real user ID in
the future), the posted query itself, the query time
field and the rank field. The second dataset comprises
data gathered from the Text Retrieval Conference
(TREC), published during 2001-2014. Web queries
retrieved from the TREC dataset contain the topic of
the query along with the co-clicked query, the actual
query, and the clicked URL. Data from the two datasets
have been merged into a single collection using a
keyword-matching criterion.
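
The Jaccard-based filtering of log queries mentioned in the architecture description can be sketched as follows. For two keyword sets A and B, the coefficient is |A ∩ B| / |A ∪ B|. The whitespace tokenization, the threshold value and the sample queries below are assumptions chosen for illustration, not parameters of the system itself.

```python
def keywords(query):
    """Lowercased set of whitespace-separated terms (a simplifying assumption)."""
    return set(query.lower().split())


def jaccard(a, b):
    """Jaccard similarity coefficient of two sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def filter_queries(current_query, candidate_queries, threshold=0.25):
    """Keep candidates whose similarity to the current query reaches the
    (hypothetical) threshold, ordered from most to least similar."""
    current = keywords(current_query)
    scored = [(jaccard(current, keywords(q)), q) for q in candidate_queries]
    kept = [(score, q) for score, q in scored if score >= threshold]
    return [q for score, q in sorted(kept, key=lambda p: p[0], reverse=True)]


# Illustrative log queries attributed to the concurrent users.
log_queries = [
    "paris hotels",
    "flights to rome",
    "cheap flights paris",
    "deer hunting season",
]
recommended = filter_queries("cheap flights to paris", log_queries)
```

Here "cheap flights paris" shares three of four distinct keywords with the current query and is kept first, while "paris hotels" shares only one of five and falls below the assumed threshold.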
From the AOL dataset, one of the six available
collections of user queries has been used in our
scenario; it contains 3013956 queries, while the TREC
dataset contains 5980 queries belonging to 350 distinct
topics. Topics from the TREC dataset have been further
grouped into 8 categories according to Google
Trend Search, both to obtain a better grouping and
because of the inappropriate grouping of topics in the
AOL dataset. For instance, some of the topics from the
AOL dataset were hunger, Chevrolet Trucks and deer, so
it was necessary to merge these topics (queries) into one
of eight categories (Lifestyle, Travel & Leisure and