PREDICTING USER ACTIVITIES IN THE SEQUENCE OF
MOBILE CONTEXT FOR AMBIENT INTELLIGENCE
ENVIRONMENT USING DYNAMIC BAYESIAN NETWORK
Han-Saem Park and Sung-Bae Cho
Dept. of Computer Science, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul, Korea
Keywords: User activity prediction, Dynamic Bayesian network, Mobile context.
Abstract: Mobile devices have recently become an essential medium for implementing ambient intelligence.
Because people carry these devices at all times, the devices can easily collect diverse user information,
and many research groups have attempted to provide useful services based on this ubiquitous
information. This paper proposes a method to predict user activity from a sequence of mobile context. To
predict activities accurately among varied patterns, we consider user activity, place, time and day of
week as mobile context. We use a dynamic Bayesian network to model user activity patterns with this
context, and learn a model for each individual to obtain a better fit. For the experiments, we collected
the mobile logs of undergraduate students and confirmed that the proposed method performs well.
1 INTRODUCTION
Ubiquitous sensors, devices, networks and
information are essential infrastructure for implementing
ambient intelligence that provides diverse and relevant
services to people (Cai et al., 2009). Recently,
mobile devices such as smart phones and PDAs
have become an essential medium for providing
intelligent services in ambient intelligence
environments because people carry them at all times.
They have also become popular enough that most people
own and use one. Since most of them include
several functions and sensors, such as a camera, GPS
and an MP3 player, they can also provide a great deal of user
information. Accordingly, many research groups
have attempted to store and manage information about users'
daily lives or to provide diverse smart services to users
in the real world (Silva et al., 2005; Gemmell et al.,
2002).
The most basic mobile log information is the user's
location from GPS, and location-based services (LBS)
built on this information have been a
promising research topic (Bellavista et al., 2008). LBS
include commercial services, although these use
very simple methods such as rule-based systems. In
addition to user location, various context
information including time, environment, user and
device has been used to infer higher-level context
(Korpipaa et al., 2003). Some systems analyze
mobile logs and present them to users in an engaging
way; a representative example is AniDiary, which
provides a cartoon-style summary of a user's daily
life (Cho et al., 2007).
This paper attempts to predict user activity using
mobile life logs. If future activities can be predicted
accurately, it becomes possible to provide the
information that a user requires. One of the main
difficulties in predicting user activity is that the
number of possible activities is so large that accurate
prediction is hard. Most previous work has only
recognized current activities (Ermes et al., 2008),
or has predicted trajectories rather than activities,
for example in a campus domain (Han et al.,
2006).
This paper learns sequences of user activity
patterns using a dynamic Bayesian network. To address
the problem mentioned above in the learning
process, we consider information such as
place, time, day of week, call record, MP3, SMS and
photo in addition to activity. We also learn the
prediction model with each individual user's data as
well as with the integrated data of all users, since each
user can have a different activity pattern.
Park H. and Cho S. (2010).
PREDICTING USER ACTIVITIES IN THE SEQUENCE OF MOBILE CONTEXT FOR AMBIENT INTELLIGENCE ENVIRONMENT USING DYNAMIC
BAYESIAN NETWORK.
In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Artificial Intelligence, pages 311-316
DOI: 10.5220/0002727003110316
Copyright © SciTePress
2 BACKGROUND
2.1 Bayesian Networks and Mobile
Context
Context is normally closely related to user
activities. For example, place restricts the range
of possible activities. Dey defined context as any
information that can be used to characterize the
situation of an entity such as a person, place, or
object that is considered relevant to the interaction
between a user and an application, including the user
and the application themselves (Dey, 2001).
Recently, the term "mobile context" has come into
use with the popularization of mobile devices.
Since a mobile environment is subject to uncertainty,
mobile context includes much uncertain information.
Bayesian networks have been used to
model context, and they provide reliable
performance with uncertain and incomplete
information (Kleiter, 1996). A Bayesian network can be
learned from data or designed using expert
knowledge, and these strengths have led to its use for
classifying and inferring mobile context.
Korpipaa et al. at the VTT research center used a naive
Bayes classifier to learn and classify the user's
context in a mobile environment (Korpipaa et al.,
2003). Microsoft Research proposed a system that
infers what the user is attending to at a given time
in an uncertain mobile environment (Horvitz et al.,
2003). Dynamic Bayesian networks extend Bayesian
networks to model problems in sequential domains.
2.2 Classification of Activity and Place
Before predicting activities, it is important to
classify the main variables, such as activity and place,
with proper criteria. In this paper, we
refer to the GSS (General Social Survey on Time Use)
for activity classification (http://www.statcan.ca/)
and to the NHAPS (National Human Activity Pattern
Survey) for place classification (Klepeis et al.,
2001).
The GSS, a survey conducted by Statistics Canada,
gathers data on social trends to monitor
changes in living conditions over time, and its
classification was made from a practical
perspective. It first divides all activities into ten
main categories, and each category is subdivided into
several subcategories according to its characteristics.
It has a three-level hierarchy, and the total number of
activities is 177. The main categories include "Paid work and related
activities," "Household work and related activities,"
"Social support, civic and voluntary activity,"
"Education and related activities," "Socializing,"
"Television, reading and other passive leisure,"
"Sports, movies and other entertainment events,"
"Active leisure," and "Residual."
The NHAPS is a survey conducted from 1992 to 1994.
It records patterns of human activity and place
over 24 hours (Klepeis et al., 2001). We use the
place classification of this survey, which has eight
categories: "Traveling/Near vehicle (Outdoor),"
"School/Church/Hospital/Public building,"
"Residence (Indoor)," "Traveling inside vehicle,"
"Bar or Restaurant," "Mall/Grocery store/Other
store," "Other outdoor," and "Other indoor." The survey
also provides an activity classification, but we do not use
it because it is outdated.
3 PROPOSED METHOD
This paper uses the sequence of context
collected from a mobile device as input to predict
user activity. Fig. 1 illustrates an overview of the
proposed prediction method. There are two phases:
prediction and modeling. The flow on the left side is
for prediction, and the other is for modeling.
Figure 1: An overview of the proposed prediction method.
ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence
For modeling, we first collected the mobile
context of activity, place, time, day of week, call
record, MP3, SMS and photo. Activity and place are
manually annotated by the user, and call record, MP3,
SMS and photo are binary attributes with a
state of "Yes" or "No." In the preprocessing step, we
applied attribute selection, data sampling and
attribute discretization to the collected data.
After that, we learned a prediction
model (dynamic Bayesian network) for each user,
and also learned a model of all users for use with a
new user.
In the prediction phase, after the mobile context (log) of
a user is input, the system clusters the user together
with other users based on their contexts and profiles.
The prediction is then made with the model of the
assigned group. In this paper, we assumed a single group
(cluster) because the data were collected from one group of
college students, not from several groups of users.
3.1 Data Preprocessing
The preprocessing step includes data sampling, attribute
selection and attribute discretization. As explained,
we used eight mobile contexts: activity, place,
time, day of week, call record, MP3, SMS and
photo. We call each of them an attribute of context. They
were collected every minute, and we call the
stored contexts the sequence data of context.
Data stored every minute were sampled
every hour. Since most activities last more
than an hour, activities recorded for only one or a few
minutes were treated as outliers. If more than one
activity occurred in an hour, the one that occupied the
most time was selected as the activity for that hour.
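The hourly sampling rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout `(day, hour, activity)` and the function name `sample_hourly` are assumptions made for the example, and ties are resolved in favor of the activity logged first within the hour, which the paper does not specify.

```python
from collections import defaultdict


def sample_hourly(minute_log):
    """Reduce per-minute records to one activity per hour.

    minute_log: iterable of (day, hour, activity) tuples, one per logged minute
    (a hypothetical layout chosen for this sketch).
    Returns a dict mapping (day, hour) to the activity with the most minutes.
    """
    minutes = defaultdict(lambda: defaultdict(int))
    for day, hour, activity in minute_log:
        minutes[(day, hour)][activity] += 1
    # max() returns the first key with the highest count, so ties go to the
    # activity that was logged first in that hour (an assumption).
    return {slot: max(counts, key=counts.get) for slot, counts in minutes.items()}


# Toy log: a 10-minute coffee break inside an hour of class is discarded.
log = [(1, 9, "Class")] * 50 + [(1, 9, "Coffee break")] * 10 + [(1, 10, "Study")] * 60
hourly = sample_hourly(log)
print(hourly)
```

Short activities such as the coffee break disappear after sampling, which matches the treatment of minute-scale activities as outliers.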
Among the eight attributes, we selected activity,
place, time and day (of week) considering their
influence on the next activity. This influence can be
checked from the data using maximum likelihood
estimation. Fig. 2 shows the influence of each attribute
on the activity to be predicted according to the
number of sequences. A thicker edge means a more
significant influence, and the number of sequences
means the number of time points including the
activity to be predicted. The figure shows
that the four attributes of activity, place, time and
day are more significant than the others, and that recent
attributes are more significant than older ones.
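One way to quantify such influence is sketched below. The paper does not state which score was used, so this example swaps in empirical mutual information between an attribute at the previous time point and the next activity, with all probabilities estimated by maximum likelihood (relative frequencies). The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information (bits) between x and y from (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # c/n, px[x]/n and py[y]/n are the maximum likelihood estimates of
        # P(x, y), P(x) and P(y); the ratio below is P(x, y) / (P(x) P(y)).
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Toy data: previous place determines the next activity, previous photo flag does not.
place_pairs = [("Home", "Sleep"), ("School", "Class")] * 10
photo_pairs = [("No", "Sleep"), ("No", "Class")] * 10
place_score = mutual_information(place_pairs)
photo_score = mutual_information(photo_pairs)
print(place_score, photo_score)
```

An attribute with a higher score would be drawn with a thicker edge in a figure like Fig. 2; attributes with scores near zero are candidates for removal.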
Figure 2: Influence of attributes on the activity to be
predicted according to the number of sequences, shown for
(a) 2, (b) 3, (c) 4 and (d) 5 sequences (the
number following an attribute name is the sequential
order: 0 represents the first).
Since a Bayesian network requires discretized
input, we discretized the context attributes. To
discretize activity, we modified the GSS
hierarchy described in Section 2.2. The original GSS is
intended for the general population, but we adapted it to a
specific user group. Fig. 3 demonstrates this concept. We
used the 10 main categories of the original GSS as shown on
the left in Fig. 3, but modified the second-level
categories and the activities in the last
level for college students, because the
original GSS is in some places classified more finely than
general users need and in other places not finely enough
for a specific user group such as college students.
We modified the right side of the GSS hierarchy in Fig. 3
by analyzing the collected data and a survey of
college students. Fig. 3 illustrates an example. The category
“Socializing” is divided into “Restaurant meals,”
“Socializing at home,” and “Other socializing,” and
“Restaurant meals” is subdivided into five activities.
Figure 3: Modified GSS.
3.2 Learning Activity Prediction Model
with Dynamic Bayesian Network
A Bayesian network (BN) associated with a set of
random variables Z = (Z1, Z2, … , ZN) is a pair B
= (G, θ), where G is a structure and θ represents the
parameters encoding the conditional probabilities. The
structure of a BN is a directed acyclic
graph (DAG) in which the nodes correspond to the
variables Zi in Z and the edges between nodes
correspond to conditional dependencies. When the
nodes are discrete, the parameters of a BN are
represented as conditional probability tables (Kleiter,
1996). The joint probability distribution of the
variables in a BN factorizes as in Eq. (1).
$$P(Z_1, \ldots, Z_N) = \prod_{i=1}^{N} P(Z_i \mid Pa(Z_i)) \qquad (1)$$
where $Pa(Z_i)$ denotes the parents of $Z_i$.
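As a small worked instance of this factorization, the sketch below evaluates the joint probability of a full assignment as the product of each variable's conditional probability given its parents. The two-node network (Place as the parent of Activity) and all probability values are made up for illustration and do not come from the paper's data.

```python
def joint_probability(assignment, parents, cpts):
    """Multiply P(Z_i | Pa(Z_i)) over all variables in a full assignment."""
    p = 1.0
    for var, value in assignment.items():
        parent_values = tuple(assignment[q] for q in parents[var])
        p *= cpts[var][parent_values][value]
    return p


# Hypothetical structure: Place has no parents, Activity has Place as its parent.
parents = {"Place": (), "Activity": ("Place",)}
# Hypothetical conditional probability tables, keyed by the parents' values.
cpts = {
    "Place": {(): {"Home": 0.6, "School": 0.4}},
    "Activity": {
        ("Home",): {"Sleep": 0.7, "Study": 0.3},
        ("School",): {"Sleep": 0.1, "Study": 0.9},
    },
}

# P(Place=School, Activity=Study) = P(School) * P(Study | School) = 0.4 * 0.9
p = joint_probability({"Place": "School", "Activity": "Study"}, parents, cpts)
print(p)
```

Summing this product over every combination of Place and Activity gives 1, as expected of a joint distribution.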
A dynamic Bayesian network is an extension of the
Bayesian network that models probability distributions
over sequences of random variables (Murphy, 2002).
Normally, a dynamic Bayesian network assumes that
the parameters of the conditional probability
distributions are time-invariant; that is, the model is
homogeneous in time. It also assumes the Markov
property: given the present state and all past states,
the conditional probability distribution of future
states depends only on the present state and not on
any earlier state. In an order-N model, the current
variables depend only on the previous N
time steps. For inference, therefore, the network is
“unrolled” by N steps into an ordinary Bayesian
network, whose distribution factorizes as in Eq. (1)
at every time step, which gives Eq. (2).
User status inference infers the user's status
from mobile contextual information with a Bayesian
network probability model. It is similar to the
method in (Korpipaa et al., 2003), which focused on
simplification and modularization of a complex
Bayesian network model. We referred to the context
hierarchy and the activity classification of the GSS (General
Social Survey) for the design of the Bayesian network
probability model. This structure increases the
scalability of the model and permits a tradeoff
between precision and recall by manipulating the
conditional probabilities of virtual nodes.
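$$P(Z_1, \ldots, Z_T) = \prod_{t=1}^{T} \prod_{i=1}^{N} P(Z_t^i \mid Pa(Z_t^i)) \qquad (2)$$
where $Z_t^i$ denotes the $i$-th variable at time step $t$ and $T$ is the number of time steps.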
Figure 4: A structure of dynamic Bayesian network to
predict user activity.
Since a user's activity can depend on his/her
previous activity, place, time and day of week, we
modeled this dependence over time using a
dynamic Bayesian network and attempted to predict
the user activity. The dynamic Bayesian network used in
this paper has a structure like naïve Bayes, in which all
evidence nodes in the past are connected to the query
node. For parameter learning, MLE (maximum
likelihood estimation) was used. Fig. 4 is an
example of the dynamic Bayesian network used in this
paper to predict an activity. To predict the activity at
t=T, we built a dynamic Bayesian network model
with the attributes of activity, place, day and time
from t=T-3 to t=T-1.
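A rough sketch of such a predictor is given below. It is not the authors' implementation: it assumes the naïve-Bayes reading in which the activity at t=T is the class and each past attribute is conditionally independent evidence, and the names `make_examples` and `NaiveDBNPredictor`, the record layout, and the optional smoothing parameter `alpha` are all hypothetical. With `alpha=0` the parameters are plain maximum likelihood estimates (relative frequencies); a positive `alpha` adds Laplace-style smoothing, which the paper does not mention.

```python
from collections import Counter, defaultdict

ATTRS = ("activity", "place", "time", "day")


def make_examples(records, history=3):
    """Turn a chronological list of hourly records into (evidence, next activity) pairs."""
    examples = []
    for t in range(history, len(records)):
        # Evidence nodes are keyed by (attribute, lag), e.g. ("place", 2) is the place at T-2.
        evidence = {
            (attr, lag): records[t - lag][attr]
            for lag in range(1, history + 1)
            for attr in ATTRS
        }
        examples.append((evidence, records[t]["activity"]))
    return examples


class NaiveDBNPredictor:
    def __init__(self, alpha=0.0):
        self.alpha = alpha  # 0.0 gives pure MLE; > 0 adds smoothing (an assumption)
        self.class_counts = Counter()
        self.value_counts = defaultdict(Counter)  # (node, activity) -> value counts
        self.values = defaultdict(set)  # node -> values seen in training

    def fit(self, examples):
        for evidence, target in examples:
            self.class_counts[target] += 1
            for node, value in evidence.items():
                self.value_counts[(node, target)][value] += 1
                self.values[node].add(value)
        return self

    def predict(self, evidence):
        total = sum(self.class_counts.values())
        best, best_score = None, -1.0
        for target, n in self.class_counts.items():
            score = n / total  # estimated P(activity at T)
            for node, value in evidence.items():
                count = self.value_counts[(node, target)][value]
                k = len(self.values[node])
                score *= (count + self.alpha) / (n + self.alpha * k)
            if score > best_score:
                best, best_score = target, score
        return best


# Toy data: a repeating four-hour pattern over six "days".
pattern = [("Sleep", "Home"), ("Class", "School"), ("Meal", "Restaurant"), ("Study", "School")]
records = [
    {"activity": a, "place": p, "time": i % 4, "day": i // 4}
    for i, (a, p) in enumerate(pattern * 6)
]
examples = make_examples(records, history=3)
model = NaiveDBNPredictor(alpha=1.0).fit(examples[:-1])
evidence, actual = examples[-1]
predicted = model.predict(evidence)
print(predicted, actual)
```

Changing `history` corresponds to changing the number of sequences: `history=3` uses the context from T-3 to T-1, that is, a sequence of four time points including the activity to be predicted.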
3.3 Predicting Activities
A prediction model was learned for each
user, and a prediction model of all users (group
model) was also learned. The group model was used
for a new user until enough of his/her data had been
stored to provide reliable performance. Once enough data
were obtained, an individual user model was learned
and used for activity prediction. If
activities can be predicted correctly, useful services
such as information recommendation for the predicted
activity can be provided.
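The switch from the group model to an individual model might be expressed as in the sketch below. The function name `select_model` and the threshold `min_records` are hypothetical; the paper does not say how much data counts as enough.

```python
def select_model(user_id, individual_models, record_counts, group_model, min_records=200):
    """Use the individual model once a user has enough data, otherwise the group model.

    min_records is a made-up threshold for illustration only.
    """
    enough = record_counts.get(user_id, 0) >= min_records
    if enough and user_id in individual_models:
        return individual_models[user_id]
    return group_model


# Toy usage with strings standing in for trained models.
models = {"user_a": "individual model of user_a"}
counts = {"user_a": 720, "user_b": 12}
chosen_a = select_model("user_a", models, counts, "group model")
chosen_b = select_model("user_b", models, counts, "group model")
print(chosen_a, "|", chosen_b)
```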
4 EXPERIMENTS
We evaluated the proposed prediction model with
data collected from twelve users (college
students) in an ambient intelligence environment.
Accuracy, that is, how often the proposed model correctly
predicted the following activity from the sequence of
mobile context, was used as the evaluation measure.
4.1 Experimental Data
(a) Samsung m-4650 (b) BT-335
Figure 5: Smartphone (a) and portable GPS (b) used for
data collection.
Data for the experiments were collected for 30 days from
twelve college students. Fig. 5 shows the mobile
devices used for data collection. A Samsung m-4650 smart
phone was used for context collection, and a
BT-335 portable GPS was used to obtain location
information. The context information includes
automatically collected items (call record, GPS,
MP3, time, day of week, SMS and photo) and manually
annotated user activity and place (from GPS).
4.2 Performance
To evaluate the proposed prediction model,
we measured the prediction accuracy of user
activities in the sequence of mobile context
with the collected data. As explained
in the preprocessing step, we selected the four attributes of
activity, place, time and day of week for modeling,
considering their influence on the next activity. We
ran the experiments while varying the number
of sequences, and compared the accuracy with that of the
same dynamic Bayesian network model using all
collected contexts.
Table 1: Prediction accuracy (%) according to the number of sequences
(#: Number, Acc.: Accuracy).

              Acc. (%) by sequence size
User #        2     3     4     5     6     7     8     9    10
1          45.6  43.1  42.6  41.9  46.2  47.4  41.5  42.8  37.1
2          74.6  72.9  68.6  65.0  61.0  56.7  52.1  49.0  51.1
3          84.0  81.6  79.8  78.8  78.1  78.2  76.9  76.0  74.9
4          60.6  57.3  54.7  51.6  49.6  45.6  41.9  38.0  36.5
5          73.2  72.2  69.1  62.6  63.9  64.8  66.6  76.5  81.7
6          74.8  69.1  69.5  70.3  65.3  63.0  53.8  51.2  53.3
7          69.3  76.6  77.0  76.0  79.7  77.8  80.1  76.5  75.7
8          52.6  50.0  45.3  48.6  44.8  43.4  40.0  26.9  29.4
9          57.0  51.9  50.5  51.2  53.8  57.7  53.8  48.1  42.1
10         85.8  86.4  85.0  84.6  84.4  85.5  85.9  83.4  84.9
11         66.8  66.8  65.7  62.0  56.0  51.9  49.6  47.4  45.3
12         59.7  55.1  51.6  46.3  44.7  40.6  36.6  34.2  33.3
Avg.       69.0  67.2  65.3  63.6  62.6  61.4  58.6  56.2  55.8
All users  69.3  67.2  64.1  61.6  58.2  56.2  53.4  51.2  49.1
Table 1 summarizes the prediction accuracy of
each user according to the number of sequences. The
pattern is not the same for every user. Half of the users
showed the highest accuracy when the number of
sequences is 2 (when the model considered only the
context of 1 hour ago). The other half, however, showed
the highest accuracy at different numbers of
sequences. The accuracies also differ from one
another, ranging from about 40% to 86%. This
means that the patterns of user activity and context
depend on the user, so an individual model for each user is
required. The comparison between the model of all users
and the average of the individual user models also supports
this result. The latter is expected to
provide better accuracy because the former was learned from
the data of all users even though their patterns were
different. Since the number of data was different for each
user, the average was calculated using the
number of data from each user as the weight.
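The weighted average can be written as a short function, as in the sketch below. The per-user accuracies are the sequence-size-2 column of Table 1; the per-user record counts are not reported in the paper, so the weights in the second call are made-up values that only show the mechanics.

```python
def weighted_accuracy(accuracies, weights):
    """Average accuracy with each user's number of data as the weight."""
    return sum(a * w for a, w in zip(accuracies, weights)) / sum(weights)


# Sequence-size-2 column of Table 1, users 1 to 12.
acc_seq2 = [45.6, 74.6, 84.0, 60.6, 73.2, 74.8, 69.3, 52.6, 57.0, 85.8, 66.8, 59.7]

# Equal weights give the plain mean of the column.
unweighted = weighted_accuracy(acc_seq2, [1] * len(acc_seq2))
print(round(unweighted, 1))  # 67.0

# Hypothetical record counts: the user with more data dominates the average.
toy = weighted_accuracy([50.0, 80.0], [1, 3])
print(toy)  # 72.5
```

The plain mean of the column is 67.0, whereas Table 1 reports 69.0 in the Avg. row; the difference is consistent with users who contributed more data also tending to have higher accuracy.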
Fig. 6 compares the result in Table 1 with the
result using all attributes. The model with the
four selected attributes provides better accuracy than
the one with all attributes for most users. This suggests
that the preprocessing step, which excludes
insignificant attributes, is effective.
Figure 6: Prediction accuracy comparison with one using
all attributes.
5 CONCLUDING REMARKS
This paper proposed a method for predicting user
activity from the sequence of mobile context in an
ambient intelligence environment. We collected user
activity, place, time, day of week, call record, MP3,
SMS and photo as mobile context, and modeled the
patterns in the context sequence to predict the user’s
next activity. For better modeling, we used the
activity classification of the GSS, adapted
to the college students who provided the context data in
this paper, and the place classification
of the NHAPS. We selected the four attributes of activity,
place, time and day of week among the eight attributes
considering their significance, and learned a dynamic
Bayesian network model with the collected data. We
also built models for individual users as well as a model
of all users for new users. In the experiments, we evaluated
the proposed prediction method with the collected
data and confirmed that it performs well.
For future work, we plan to address two
more issues: user clustering and
recommendation. To deal with the context and activity
patterns of general users, user clustering is
required before prediction modeling. It is also useful
for recommender services from a marketing
perspective. After prediction, it will be interesting to
provide useful information to each user based on
the predicted activity. For example, the system can
recommend restaurant information if the model
predicts that the following activity is restaurant meals
with friends. Such a service will make ambient
intelligence smarter.
ACKNOWLEDGEMENTS
This research was supported by the Basic Science
Research Program through the National Research
Foundation of Korea (NRF) funded by the Ministry
of Education, Science and Technology
(R01-2008-000-20801-0).
REFERENCES
Cai, H., Hu, X., Lu, Q., Cao, Q., 2009. A novel intelligent
service selection algorithm and application for
ubiquitous web services environment. Expert Systems
with Applications. Elsevier.
Silva, G. C., Yamasaki, T., Aizawa, K., 2005. Evaluation
of video summarization for a large number of cameras
in ubiquitous home. ACM International Conference on
Multimedia.
Gemmell, J., Bell, G., Lueder, R., Drucker, S., Wong, C.,
2002. MyLifeBits: Fulfilling the Memex vision. ACM
International Conference on Multimedia.
Bellavista, P., Kupper, A., Helal, S., 2008. Location-based
services: Back to the future. IEEE Pervasive
Computing. IEEE.
Korpipaa, P., Mantyjarvi, J., Kela, J., Keranen, H., Malm,
E.-J., 2003. Managing context information in mobile
devices. IEEE Pervasive Computing. IEEE.
Cho, S.-B., Kim, K.-J., Hwang, K.-S., Song, I.-J., 2007.
AniDiary: Daily cartoon-style diary exploits Bayesian
networks. IEEE Pervasive Computing. IEEE.
Ermes, M., Parkka, J., Mantyjarvi, J., Korhonen, I., 2008.
Detection of daily activities and sports with wearable
sensors in controlled and uncontrolled conditions.
IEEE Transactions on Information Technology in
Biomedicine. IEEE.
Han, S.-J., Cho, S.-B., 2006. Learning trajectory
information with neural networks and the Markov
model to develop intelligent location-based services.
Journal of Information and Knowledge Management.
World Scientific.
Dey, A. K., 2001. Understanding and using context.
Personal and Ubiquitous Computing. Springer.
Kleiter, G. D., 1996. Propagating imprecise probabilities
in Bayesian networks. Artificial Intelligence. Elsevier.
Korpipaa, P., Koskinen, M., Peltola, J., Makela, S.-M.,
Seppanen, T., 2003. Bayesian approach to sensor-
based context awareness. Personal and Ubiquitous
Computing. Springer.
Horvitz, E., Kadie, C. M., Paek, T., Hovel, D., 2003.
Models of attention in computing and
communications: From principles to applications.
Communications of the ACM. ACM.
Klepeis, N. E., et al., 2001. The national human activity
pattern survey (NHAPS): A resource for assessing
exposure to environmental pollutants. Journal of
Exposure Science and Environmental Epidemiology.
Nature Publishing Group.
Murphy, K., 2002. Dynamic Bayesian networks:
Representation, inference and learning. PhD Thesis.
University of California, Berkeley.