Optimal Sensor Placement for Human Activity Recognition with a
Minimal Smartphone–IMU Setup
Vincent Xeno Rahn (https://orcid.org/0000-0001-5241-6700), Lin Zhou (https://orcid.org/0000-0001-9916-3878), Eric Klieme (https://orcid.org/0000-0001-7032-2230) and Bert Arnrich (https://orcid.org/0000-0001-8380-7667)
Hasso Plattner Institute, University of Potsdam, 14482 Potsdam, Germany
Keywords:
Human Activity Recognition, HAR, Inertial Measurement Unit, IMU, Smartphone, Sensors, Convolutional
Neural Network, CNN, Deep Learning.
Abstract:
Human Activity Recognition (HAR) of everyday activities using smartphones has been intensively researched over the past years. Despite high detection performance, smartphones cannot continuously provide reliable information about the currently conducted activity because their placement on the subject's body is uncertain. In this study, a system is developed that enables real-time collection of data from various Bluetooth inertial measurement units (IMUs) in addition to the smartphone. The contribution of this work is an extensive overview of related work in this field and the identification of unobtrusive, minimal combinations of IMUs with the smartphone that achieve high recognition performance. Eighteen young subjects with unrestricted mobility were recorded conducting seven daily-life activities with a smartphone in the pocket and five IMUs at different body positions. With a Convolutional Neural Network (CNN) for activity recognition, classification accuracy increased by up to 23% with one IMU in addition to the smartphone. An overall prediction rate of 97% was reached with a smartphone in the pocket and an IMU at the ankle. This study demonstrates that an additional IMU can improve the accuracy of smartphone-based HAR on daily-life activities.
1 INTRODUCTION
Human Activity Recognition (HAR) enables retrieval
of high-level knowledge from low-level sensor inputs
(Chen et al., 2019) and is capable of monitoring daily-life activities such as walking, sitting, or running. Important applications lie in the field of healthcare in terms of physical monitoring (Zhang and Sawchuk, 2012). For example, HAR can inform subjects about irregularities as early as possible to support diagnosis and guide treatment.
HAR is commonly performed using inertial mea-
surement units (IMUs). An IMU is a combination of
multiple inertial sensors: an accelerometer (measures
acceleration), a gyroscope (measures angular veloc-
ity), and sometimes a magnetometer (measures mag-
netic field) (Ahmad et al., 2013). An IMU can be used
as a standalone device or integrated into other devices
like smartphones. Recent advancements in hardware
and a growing variety of standalone IMU devices
(Zhou et al., 2020) led to increasing applications us-
ing IMUs (Zhu and Sheng, 2009). The fact that many
of them now also support wireless communication
protocols allows smartphones or computers to receive
sensor data in real-time. Meanwhile, smartphone-
based HAR has been intensively researched over the
past years. Because smartphones require no instal-
lation costs, are user-friendly, and provide an un-
obtrusive way of recording data in daily situations,
they have become a standard tool for HAR (Su et al.,
2014).
To date, a large number of studies exist investigat-
ing HAR. Some of them focus on HAR using smart-
phone sensors (Su et al., 2014; Ghosh and Riccardi,
2014; Bayat et al., 2014) and others on HAR using
body-worn standalone IMU devices (Altun and Bar-
shan, 2010; Huynh, 2008; Janidarmian et al., 2017),
few of them transferring data via Bluetooth (Bulling
et al., 2014; Khan et al., 2010). There have also
been some studies determining the highest accuracy-
achieving sensor placements (Atallah et al., 2011;
Orha and Oniga, 2014; Mannini et al., 2015).
1.1 Motivation
Smartphones as recording devices are a convenient
solution for HAR as they have built-in motion sen-
sors such as accelerometer or gyroscope. However,
the sensor data received from smartphones might also be incorrect or misleading in many situations, considering only the different ways people carry their phones or interact with them during the day. Writing a short text message while holding the phone in the hand, laying it on a table, or putting it in another pocket are just a few examples that make correct activity recognition more difficult.
Another important requirement for HAR is to en-
able the observed subjects to behave as naturally
as possible. This cannot be achieved if the complete body is covered with sensors, because such a setup would only work for short-term applications, such as in a hospital setting, but not for everyday activities. A system that addresses these problems and enables high-accuracy HAR directly from the subject's smartphone is needed.
1.2 Contribution
In this paper, an activity recognition system is presented. Its main contribution is the determination of a minimal smartphone-and-IMU sensor setup that improves HAR results for basic daily-life activities. This work's concrete contributions are as follows:

- A dataset of 18 participants performing seven different everyday activities was collected in an experiment (Section 3). To the best of our knowledge, this is the first study combining Bluetooth sensors with internal smartphone sensors for HAR data collection.

- It was shown that the HAR performance of the smartphone in the subject's pocket could be improved by 23% when combined with a body-worn standalone IMU device. The highest improvements were reached with the ankle and the lower back; F1-scores of up to 97% were reached using a Convolutional Neural Network (Section 4.4).

- It was also shown that some single IMU placements achieve high recognition precision (F1-scores around 87%), making the resulting recognition more independent of the smartphone in case it produces imprecise data (Section 4.4).
This work begins with an overview of daily-life activity recognition in related work, covering similar studies and their applied methods (Section 2), followed by a description of the conducted experiment (Section 3) and its evaluation (Section 4); it concludes by discussing future research (Section 5).
2 RELATED WORK
2.1 Background
The recognition of daily life human activities is a pop-
ular problem. There are several common approaches
and varying factors such as probed activities and cho-
sen sensor setups. This section provides a review of
methods for HAR with a special focus on daily-life
activity recognition.
2.1.1 Human Activity Recognition Process
A basic Activity Recognition Process (ARP) consists
of five steps. Data is collected from sensor signals.
The acquired data might contain artifacts arising from sensor malfunctions, simultaneously occurring physical activities, or electronic fluctuations. Thus, the data is preprocessed. It is segmented into windows of a specific length and labeled with the activity that was conducted in this segment. In the next step, every time window is transformed into a vector of features (Dehghani et al., 2019). In feature extraction for HAR, it is challenging to produce distinguishable features because different activities might share similar characteristics (e.g. walking and running) (Chen et al., 2020). Finally, based on the data and its corresponding labels, a classifier is trained. According to (Dehghani et al., 2019), Decision Trees, Naïve Bayes, Support Vector Machines, k-Nearest Neighbors, Hidden Markov Models, and ensemble classifiers such as Random Forest are common and preferred classifiers in HAR.
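To make these steps concrete, the hedged sketch below implements a minimal version of such a pipeline in Python. The array names (signal, labels), the sampling rate, the simple statistical features, and the choice of a Random Forest (one of the classifiers named above) are illustrative assumptions, not the setup of any cited study.

```python
# Minimal sketch of the five-step ARP: data is assumed to be already collected
# and preprocessed into a NumPy array `signal` of shape (n_samples, n_channels)
# with one integer activity label per sample in `labels`.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment(signal, labels, fs, win_s=0.5):
    """Cut the stream into non-overlapping windows and label each window
    with the activity occurring most often inside it."""
    win = int(win_s * fs)
    n = signal.shape[0] // win
    X = signal[:n * win].reshape(n, win, -1)
    y = np.array([np.bincount(labels[i * win:(i + 1) * win]).argmax()
                  for i in range(n)])
    return X, y

def extract_features(windows):
    """Simple per-channel statistics (mean and standard deviation) as features."""
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)

# Illustrative training call (assumes `signal` and `labels` exist):
# X_win, y_win = segment(signal, labels, fs=80)
# clf = RandomForestClassifier().fit(extract_features(X_win), y_win)
```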
2.1.2 Activities
According to Table 1 and Table 2, activity sets in sim-
ilar studies often include walking, sitting, and stand-
ing, sometimes combined with further activities such as eating or vacuuming. Some studies investigated even more complex daily-life activities, for instance wiping tables (Atallah et al., 2011) or folding laundry (Valarezo et al., 2017).
2.1.3 Inertial Sensors
In (Zhang and Sawchuk, 2012), the accelerometer proved to be the best-performing motion sensor for recognizing sitting, walking, climbing upstairs and downstairs, riding the elevator up and down, and brushing teeth.
Table 1: Related work determined the best sensor positions for specific activities using accelerometers or IMUs.

| Study | Activities | Best Positions |
|---|---|---|
| (Atallah et al., 2011) | lying down | wrist |
| | preparing food, eating and drinking, socializing, reading, getting dressed | waist |
| | walking, treadmill walking, vacuuming, wiping tables | chest, wrist |
| | running, treadmill running, cycling | ear, arm, knee |
| | sitting down and getting up, lying down and getting up | waist, chest, knee |
| (Orha and Oniga, 2014) | standing, sitting, supine, prone, left lateral recumbent, right lateral recumbent, walking, running, forward/left/right bending, squats, settlements and lifting the chair, falls, turn left and right, upstairs, downstairs | right thigh, right hand |
| (Bao and Intille, 2004) | ambulation, posture | thigh, hip, ankle |
| | upper body movements (sitting, reading, watching TV) | wrist, arm |
| | total of 20 everyday activities | thigh, wrist / hip, wrist |
| (Mannini et al., 2015) | walking | ankle, thigh |
| (Bulling et al., 2014) | opening/closing window, watering plant, reading, drinking a bottle, cutting/chopping with a knife, stirring in a bowl, forehand, backhand, smash | wrist |
To detect falling, the rotation angle retrieved from the gyroscope increases the performance. Therefore, the accelerometer and gyroscope improve the reliability of the recognition process by complementing each other. The researchers in (Shoaib et al., 2013) determined that climbing upstairs is recognized with high accuracy by a gyroscope at most positions, while standing, for example, is better recognized by an accelerometer. They also showed that the magnetometer depends strongly on direction and thus causes over-fitting when training classifiers (Shoaib et al., 2013).
2.1.4 Sensor Setup
The results of an ARP heavily depend on the cho-
sen sensor placements (Mohamed et al., 2018). An
overview of related work examining best achieving
sensor positions using accelerometers can be found in
Table 1. For basic daily life activities, placements on
the wrist, the knee, the waist, and the thigh seem to
provide high classification accuracy. Across the referenced studies, a combination of arm and leg sensors covers most of the activities. According to (Bao and Intille, 2004), complex activities require at least one sensor on the upper and one on the lower body.
Typical frequencies for daily activities are for ex-
ample 27 Hz (Orha and Oniga, 2014), 30 Hz (Figueira
et al., 2016), 32 Hz (Bulling et al., 2014), 50 Hz (De-
hghani et al., 2019), or 100 Hz (Gao et al., 2019).
(Mannini et al., 2015) sampled walking data down to
30 Hz and did not see a difference in recognition accu-
racy. In (Ghosh and Riccardi, 2014) the classification
accuracy drops significantly for sampling rates lower
than 10 Hz.
2.1.5 Data Preprocessing
The chosen window can vary in size and can be either
overlapping or non-overlapping. (Dehghani et al., 2019) found that, when subject-independent cross-validation is used, the performance of HAR systems cannot be improved by using overlapping sliding windows instead of non-overlapping windows. The window size has a significant influence on the accuracy of the ARP. To be able to differentiate an activity from others, the window should include at least one instance of the activity's repeating action, such as taking a step when walking. On the other hand, an increased window size does not necessarily improve recognition performance (Janidarmian et al., 2017). According to (Banos et al., 2014), the most accurate detection is achieved with short windows of two seconds or less; very short windows (0.25–0.5 s) already lead to very good recognition performance.
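The segmentation step itself can be illustrated with a short, hedged sketch; the window length and overlap values below are example settings, not those of the cited studies.

```python
# Sliding-window segmentation of a multi-channel signal stored as a NumPy
# array of shape (n_samples, n_channels). overlap=0.0 yields non-overlapping
# windows; overlap=0.5 yields 50% overlapping windows.
import numpy as np

def sliding_windows(signal, fs, win_s=0.5, overlap=0.0):
    win = int(win_s * fs)
    step = max(1, int(win * (1.0 - overlap)))  # step == win -> non-overlapping
    return np.stack([signal[i:i + win]
                     for i in range(0, signal.shape[0] - win + 1, step)])

# non_overlapping = sliding_windows(signal, fs=50, win_s=0.5, overlap=0.0)
# overlapping     = sliding_windows(signal, fs=50, win_s=0.5, overlap=0.5)
```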
2.2 Overview of Related Work
As already mentioned in Section 1, there have been many studies regarding HAR. Table 1 presents an excerpt of studies investigating optimal sensor combinations for accelerometers as well as IMUs. In the following, a short survey of further studies dealing with daily-life activity recognition is given; Table 2 presents a more detailed overview.
A large number of studies aimed to distinguish between different activities by using one or multiple accelerometers.
Table 2: Overview of HAR studies using accelerometers and HAR studies using IMUs.

Accelerometer-based studies:

| Study | Placements (total number) | Activities (total number) | Sampling Rate | Subj. | (Best) Method | Accur. |
|---|---|---|---|---|---|---|
| (Foerster et al., 1999) | sternum, wrist, thigh, lower leg (5) | sitting, standing, lying, sitting and talking, working at keyboard, walking, stairs up, stairs down, cycling (9) | 16 Hz | 24 | template matching | 95.8% |
| (Mantyjarvi et al., 2001) | 3 left hip, 3 right hip (6) | walking, upstairs, downstairs, opening doors (4) | 256 Hz | 6 | independent component analysis | 83%–90% |
| (Bao and Intille, 2004) | hip, wrist, upper arm, ankle, thigh (5) | walking, sitting, standing, watching TV, running, stretching, scrubbing, folding laundry, brushing teeth, riding elevator, riding escalator, climbing stairs, walking carrying items, working on computer, eating or drinking, reading, cycling, strength-training, vacuuming, lying down (20) | 76.25 Hz | 20 | decision tree classifiers | 84% |
| (Tapia et al., 2007) | hip, thigh, ankle, upper arm, wrist (5) | in different variations: lying down, standing, sitting, walking, running, climbing stairs, cycling, carrying weight, moving weight, and more gym activities (30) | 30 Hz | 21 | C4.5 classifier | 94.9% |
| (Krishnan and Panchanathan, 2008) | hip, ankle, thigh (3) | walking, sitting, standing, running, cycling, lying, climbing stairs (7) | 76.25 Hz | 20 | AdaBoost | 93% |
| (Bonomi et al., 2009) | lower back (1) | lying, sitting, working on computer, standing, washing dishes, walking, downstairs, upstairs, walking outside, running, cycling (11) | 20 Hz | 20 | decision tree classifiers | 93% |
| (Mannini and Sabatini, 2010) | hip, wrist, arm, ankle, thigh (5) | walking, sitting, standing, running, cycling, lying, climbing stairs (7) | 76.25 Hz | 20 | NM classifier | 98.5% |
| (Mannini et al., 2015) | ankle, thigh, hip, arm, wrist (5) | lying, sitting, sorting files on paperwork, cycling, natural walking, treadmill walking, carrying a load, stairs or elevator, jumping-jacks, sweeping with a broom, painting with roller/brush (28) | 90 Hz | 33 | cross-validation with SVM | 91.2% |
| (Janidarmian et al., 2017) | waist, lower arms, upper arms, lower legs, upper legs, chest (10) | combined datasets (70) | 8–100 Hz | 228 | principal component analysis | 96.44% ± 1.62% |

IMU-based studies:

| Study | Placements (total number) | Activities (total number) | Sampling Rate | Subj. | (Best) Method | Accur. |
|---|---|---|---|---|---|---|
| (Altun and Barshan, 2010) | chest, arms, legs (5) | sitting, standing, lying down on back/right side, upstairs, downstairs, standing/moving in elevator, walking, walking on treadmill, running on treadmill, exercise on stepper/cross trainer, cycling, rowing, jumping, playing basketball (19) | 25 Hz | 8 | BDM classification | 95% |
| (Zebin et al., 2016) | pelvis, thighs, shanks (5) | walking, upstairs, downstairs, sitting, standing, lying down (6) | 50 Hz | 12 | Convolutional Neural Network | 97.01% |
| (Rivera et al., 2017) | wrist (1) | open door, close door, open fridge, close fridge, clean table, drink from cup (8) | 100 Hz | 12 | Recurrent Neural Network | 80.09% |
| (Valarezo et al., 2017) | wrist (1) | lying, sitting, standing, walking, running, cycling, Nordic walking, ironing, vacuuming, rope jumping, upstairs, downstairs, optional: watching TV, computer work, driving car, folding laundry, cleaning house, playing soccer (12/6) | 100 Hz | 9 | Recurrent Neural Network | 96.95% |
As one of the earliest works on this topic, (Foerster et al., 1999) attached five accelerometers to 24 subjects to recognize postures and motions using a hierarchical classification model, resulting in an accuracy of 95.8%. On only four basic activities performed by six participants, (Mantyjarvi et al., 2001) reached an overall recognition accuracy of 83%–90%. With 93%, (Bonomi et al., 2009) achieved a somewhat higher recognition performance on seven more activities with only one accelerometer placed at the lower back. (Bao and Intille, 2004) equipped the body with five accelerometers at the hip, wrist, upper arm, ankle, and thigh to differentiate between 20 activities performed by 20 subjects, resulting in a similar performance using decision tree classifiers. From the dataset generated by (Bao and Intille, 2004), three accelerometers were selected to recognize seven lower-body activities in (Krishnan and Panchanathan, 2008), reaching an accuracy of 93%. (Mannini and Sabatini, 2010) applied several learning methods to the same dataset excerpt, but with additional wrist and arm sensors, and achieved an accuracy of 98.5%. In (Janidarmian et al., 2017), several datasets were combined to generate a dataset consisting of 70 activities performed by 228 subjects, recorded by 10 accelerometers covering most relevant body positions. The results are impressive, as an accuracy of 96.44% ± 1.62% was achieved using a principal component analysis.
While the accelerometer provides good results, as seen in the previously introduced studies, IMU sensors also come with an additional gyroscope and magnetometer. Instead of single accelerometers, a few studies used IMU sensors and achieved high recognition performance. (Altun and Barshan, 2010) retrieved 95% from five IMUs placed at the chest and at each leg and arm. Eight subjects conducted 19 activities, including standard activities but also rowing, jumping, and playing basketball. A single-IMU setup on the participant's wrist was tested by (Rivera et al., 2017) as well as (Valarezo et al., 2017); the latter retrieved 96.95% accuracy on 18 different activities performed by nine subjects. (Zebin et al., 2016) used a Convolutional Neural Network and reached an accuracy of 97.01% when recognizing walking, upstairs, downstairs, sitting, standing, and lying down, recorded from 12 subjects with five IMUs.
2.3 Learnings from Related Work
To summarize, there have already been studies inves-
tigating HAR using accelerometers and IMUs, mostly
reaching a recognition accuracy between 80% and
97%. Thus, IMUs seem to be sufficient for recogniz-
ing basic daily-life activities. There were also works
comparing different combinations for specific activi-
ties to find out about optimal sensor positions, sam-
pling rates, and window sizes. It appears that the op-
timal combinations heavily depend on the conducted
activities.
Related work aimed to find optimal sensor positions for independent accelerometer or IMU setups, but keeping the smartphone in the pocket as a fixed sensor and testing the improvement achieved by a combination was not considered.
Besides, to provide HAR from IMUs on the body,
a device capable of complex computations and pro-
viding enough storage is required. The data trans-
fer in the above-mentioned studies mostly happened
between the IMUs and a computer to avoid processing and storage restrictions of other devices such as smartphones. Thus, a computer or even a wired connection must be present, which does not allow natural conduction of daily activities.
The contribution of this work is to investigate the
influence of different IMU positions for smartphone-
based HAR.
3 EXPERIMENTAL SETUP
3.1 Recording Sequence
Following related work, the most commonly inves-
tigated daily activities were selected: lay, sit, stand,
walk, climb upstairs, climb downstairs, and run.
To collect comparable data, a recording sequence
was created. Every subject performed this sequence at
a stretch with short pauses between the different activ-
ities. The pauses were needed for (1) a more efficient
data segmentation and (2) giving the subjects enough
time to reach the next ”station” (for example the chair
to sit). See Table 3 for the complete sequence. Man-
ual adjustment of the labeled data was not necessary
due to the strict schedule.
The experiment was carried out in the foyer of a
building and the open area in front of it. Sitting, lying,
standing, walking, and running were easy to realize
using chairs, tables, and a mat. For the stair climb-
ing activities, a long staircase consisting of continu-
ous steps was used. The only drawback was a slightly
longer step in the middle. The participants were asked
to take this step with one move.
The study was conducted in accordance with the latest revision of the Declaration of Helsinki (Rickham, 1964). All subjects gave their consent before participating in the data collection procedures.
Table 3: Recording sequence; pauses of 10–40 seconds between the activities.

| No. | Activity | Time (s) | No. | Activity | Time (s) | No. | Activity | Time (s) |
|---|---|---|---|---|---|---|---|---|
| 0 | walk | 20 | 8 | sit | 20 | 16 | run | 20 |
| 1 | sit | 20 | 9 | walk | 20 | 17 | run | 20 |
| 2 | lay | 20 | 10 | lay | 20 | | | |
| 3 | stand | 20 | 11 | stand | 20 | | | |
| 4 | downstairs | 10 | 12 | downstairs | 10 | | | |
| 5 | upstairs | 10 | 13 | upstairs | 10 | | | |
| 6 | downstairs | 10 | 14 | downstairs | 10 | | | |
| 7 | upstairs | 10 | 15 | upstairs | 10 | | | |

Total: 280 s
The data collection was supervised by the authors to ensure the quality of the data.
3.2 Sensor Setup
A OnePlus 6 smartphone (OnePlus, China) and five QuantiMotion IMUs (Bonsai Systems®, Switzerland) were used for data collection in this study. The smartphone was placed in the front right pocket of the subjects, where it might be worn in daily situations. The IMUs were placed asymmetrically on both body sides, covering the wrist, ankle, upper arm, and upper leg; the fifth IMU was located at the lower back. As seen in Section 2.1.4 regarding the best-performing sensor positions in related work, the wrist and thigh provide high classification accuracy for daily-life activities. Further sensors were added to the ankle, the arm, and the lower back to cover the rest of the body positions mostly used in other studies.
Based on preliminary experiments, accelerometer and gyroscope data were recorded at 100 Hz and downsampled to 80 Hz. Several other sampling rates were tested in addition, in order to further investigate the impact of the sampling rate on activity recognition accuracy. The accelerometer and gyroscope ranges were ±16 g and ±2000 °/s, respectively.
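As a hedged illustration of this resampling step (the study does not describe its implementation), 100 Hz data can be brought to 80 Hz with a polyphase filter; the function and variable names below are illustrative.

```python
# Downsample 100 Hz recordings to 80 Hz with polyphase filtering
# (80/100 = 4/5, i.e. upsample by 4 and downsample by 5).
from scipy.signal import resample_poly

def downsample_100_to_80(x):
    # x: NumPy array of shape (n_samples, 3) for one accelerometer or gyroscope
    return resample_poly(x, up=4, down=5, axis=0)

# acc_80 = downsample_100_to_80(acc_100)
# gyr_80 = downsample_100_to_80(gyr_100)
```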
3.3 Recording Tools
A custom Android app was developed for sensor data
collection. The app is capable of recording data from
the internal sensors in the smartphone as well as mul-
tiple external Bluetooth Low Energy (BLE) IMUs.
For the current experimental setup, the app provides the subject with auditory instructions for the pre-defined recording sequence (e.g., walk, sit, stand) and labels the recorded data with the activities. Using basic text-to-speech components, the subject is told what happens next and can thus prepare and start/stop the activity at the right time. This produces clearly separated data. It was furthermore ensured that if the Bluetooth connection to the IMUs is lost, the experiment is paused and resumed the moment the connection is reestablished.
Figure 1: The five Bonsai IMUs were placed on the left thigh, right ankle, left arm, right wrist, and lower back (front and back views). The smartphone with the recording app (see Section 3.3) was placed in the right pocket.
4 EVALUATION
4.1 Dataset
In this experiment, data was collected from 18 subjects. Seven different activities (lay, sit, stand, walk, climb upstairs, climb downstairs, and run) were performed while data was received from the six used sensors (for placement see Figure 1). One subject was not able to climb stairs and run, so only lying, sitting, standing, and walking were recorded for this subject. Overall, 720 seconds were recorded for lying, sitting, standing, and walking, and 680 seconds for stair climbing.
The dataset was split into train, validation, and test data.
Figure 2: Preprocessing of acceleration data along the x-axis, shown for an example upstairs activity recorded at the left thigh: (a) raw data, (b) resampled data (80 Hz), (c) filtered data.
Different subjects were randomly selected for each of these groups; see Table 4 for the distribution.
Table 4: Subjects' characteristics. Data is presented as mean ± standard deviation.

| | All | Train | Valid. | Test |
|---|---|---|---|---|
| Count | 18 | 9 | 3 | 6 |
| (f/m) | (4/14) | (2/7) | (1/2) | (1/5) |
| Age | 21.89 ± 3.59 | 22.56 ± 4.92 | 20.67 ± 0.47 | 21.5 ± 0.76 |
| Weight (kg) | 69.11 ± 7.64 | 71.22 ± 8.8 | 66.67 ± 2.36 | 67.17 ± 6.54 |
| Height (m) | 1.77 ± 0.06 | 1.77 ± 0.05 | 1.76 ± 0.03 | 1.77 ± 0.08 |
4.2 Preprocessing and Segmentation
The raw acceleration and gyroscope data received
from the devices were downsampled to the defined
sampling rate of 80 Hz and filtered through a Butter-
worth low-pass filter to remove high-frequency noise.
This filter attenuates higher frequency components of
the signal beyond a configurable cut-off frequency
(Butterworth, 1930). For the processing of body sensor network data, (Wang et al., 2011) proposed a 3rd-order Butterworth low-pass filter with a cut-off frequency of 4 Hz; the same filter specification was applied in this experiment.
subject climbing upstairs and the three preprocessing
stages are shown in Figure 2.
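A hedged sketch of this filtering step with SciPy is given below. The 3rd-order Butterworth low-pass with a 4 Hz cut-off at an 80 Hz sampling rate follows the description above; the zero-phase application via filtfilt and the variable names are assumptions.

```python
# 3rd-order Butterworth low-pass filter with 4 Hz cut-off, applied to data
# that has already been resampled to 80 Hz.
from scipy.signal import butter, filtfilt

FS = 80.0       # sampling rate after downsampling (Hz)
CUTOFF = 4.0    # cut-off frequency (Hz)

b, a = butter(N=3, Wn=CUTOFF / (FS / 2), btype="low")  # normalized cut-off
# filtered = filtfilt(b, a, resampled_signal, axis=0)   # zero-phase filtering
```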
In related work, a window length between 0.25
and 0.5 seconds was proposed. Thus, the filtered data
was segmented into non-overlapping windows with a
size of 0.5 s.
4.3 Classification
Previous works successfully used machine learning
methods for HAR (Chen et al., 2020). In two of the
latest studies, Convolutional Neural Networks (CNN)
were selected as the method achieving the highest
recognition precision (Ignatov, 2018; Yang et al.,
2015). (Zebin et al., 2016) conducted a very similar
experiment to the one in this paper but only using a
fixed set of body-worn inertial sensors. They aimed to
distinguish between the same activities as in this work
except for running. By using a CNN, they reached a
recognition accuracy of 97.01%. Thus, a CNN was
composed to recognize the activities in this experi-
ment. CNNs are inspired by the biological visual system; they form a hierarchical feed-forward neural network. In addition to the fully connected layers of classical neural networks, a CNN has convolutional layers that learn filters sliding along the input data (Ignatov, 2018). The convolution and pooling layers work as feature extractors in the ARP (Almaslukh et al., 2018).
The CNNs for this study consist of three convolutional layers, three pooling layers, and one fully connected layer. (Ignatov, 2018), for example, used 1 × 16 filters, while in our study three 1 × 2 convolution layers separated by pooling layers were tested, achieving similar results while increasing training speed. The number of channels depends on the number of IMUs analyzed; each IMU contributes six channels (acceleration and gyroscope along all three axes). For example, if the combination of the smartphone and the IMU at the right wrist is investigated, this results in twelve channels (see Figure 3 for the CNN architecture). Based on the number of sensor combinations tested, twelve CNNs were trained. A batch size of 50 and a learning rate of 0.001 were chosen for training the network over 200 epochs.
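A hedged PyTorch sketch of a network matching this description is given below. The structure (three convolutions with kernels of length 2, each followed by pooling, and one fully connected layer), the twelve input channels for the smartphone plus one IMU, the 0.5 s windows (40 samples at 80 Hz), the seven classes, and the training settings follow the text above; the channel widths, activation function, pooling type, and optimizer are assumptions.

```python
import torch
import torch.nn as nn

class HarCnn(nn.Module):
    """Three conv layers (kernel length 2) with pooling, then one fully
    connected layer; input shape (batch, channels, time)."""
    def __init__(self, in_channels=12, n_classes=7, window_len=40):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(64, 64, kernel_size=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        with torch.no_grad():  # infer the flattened size from a dummy window
            flat = self.features(torch.zeros(1, in_channels, window_len)).numel()
        self.classifier = nn.Linear(flat, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# model = HarCnn()
# optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# loss_fn = nn.CrossEntropyLoss()
# for epoch in range(200): ...  # mini-batches of 50 windows
```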
4.4 Results
The main goal of this work was to determine the impact of different IMU combinations and to find the optimal minimal setup. Thus, single-IMU setups and combinations of the smartphone with one additional IMU were investigated. An overview of all combinations tested and their results is plotted in Figure 4.
These results show that, regarding combinations of the smartphone and an additional IMU, the smartphone in the right pocket combined with the IMU at the right ankle or the lower back provides promising F1-scores of approximately 97% and 96%, respectively.
Figure 3: CNN architecture for classification of data from two IMUs.
As shown in the confusion matrices in Figure 5, for both combinations the applied classifier had most problems recognizing the stairs and walking activities; for the right ankle combination it also struggled with lying, while for the lower back combination the prediction of standing was wrong a few more times. Both classified sitting correctly in all cases, but the classifier for the right ankle combination sometimes confused lying and climbing downstairs with sitting.
Figure 4: Recognition performance (precision, recall, and F1-score) for the different sensor combinations, shown for Pocket + Right ankle, Pocket + Lower back, All, Pocket + Left arm, Lower back, Left arm, Right ankle, Pocket + Right wrist, Pocket, Pocket + Left thigh, Left thigh, and Right wrist. Pocket describes the smartphone in the right pocket.
Furthermore, the combination of the pocket and the left arm produces a high F1-score of 94%. For this combination, walking and the stairs activities were misclassified in more than 10% of cases, but the remaining activities were recognized nearly perfectly. The other smartphone-and-IMU combinations all score below 81%; they have most difficulties recognizing sitting, lying, and walking, while standing, the stair-climbing activities, and running achieve high performance. In comparison to the remaining activities, walking and the stairs activities were predicted incorrectly more often.
Figure 5: Confusion matrices of the IMU combinations with the highest performance (rows: actual activity, columns: predicted activity, values in %).

(a) Pocket and right ankle

| Actual \ Predicted | Walk | Sit | Stand | Lay | Upstairs | Downstairs | Run |
|---|---|---|---|---|---|---|---|
| Walk | 95.28 | 0 | 0 | 0 | 2.36 | 2.1 | 0.26 |
| Sit | 0 | 100 | 0 | 0 | 0 | 0 | 0 |
| Stand | 0 | 0 | 99.25 | 0 | 0.25 | 0.5 | 0 |
| Lay | 0 | 8.27 | 0 | 91.73 | 0 | 0 | 0 |
| Upstairs | 2 | 0 | 0.25 | 0 | 94.5 | 3.25 | 0 |
| Downstairs | 1 | 0.25 | 0 | 0 | 1.25 | 97 | 0.5 |
| Run | 0 | 0 | 0.25 | 0 | 0 | 0.25 | 99.5 |

(b) Pocket and lower back

| Actual \ Predicted | Walk | Sit | Stand | Lay | Upstairs | Downstairs | Run |
|---|---|---|---|---|---|---|---|
| Walk | 94.49 | 0 | 0.26 | 0 | 0.26 | 4.99 | 0 |
| Sit | 0 | 100 | 0 | 0 | 0 | 0 | 0 |
| Stand | 0.5 | 0 | 93.5 | 0 | 0.75 | 5.25 | 0 |
| Lay | 0 | 0 | 0 | 100 | 0 | 0 | 0 |
| Upstairs | 7 | 0 | 0 | 0 | 90.25 | 2.25 | 0.5 |
| Downstairs | 4.75 | 0 | 0 | 0 | 0.25 | 92.5 | 2.5 |
| Run | 0 | 0 | 0 | 0 | 0.25 | 0 | 99.75 |
This might be related to the similarity of the three activities. Perhaps the longer step in the middle of the staircase also caused the subjects to walk similarly to normal walking for a short moment, because they had to take a big step.
Table 5: CNN classification performance of the different single IMUs.

| Position | Precision | Recall | F1-score |
|---|---|---|---|
| Right pocket (Smartphone) | 73.85% | 73.42% | 73.43% |
| Left thigh | 65.46% | 64.79% | 62.44% |
| Right ankle | 84.55% | 84.86% | 84.22% |
| Left arm | 86.76% | 86.42% | 85.65% |
| Right wrist | 65.67% | 63.36% | 60.59% |
| Lower back | 87.48% | 87.18% | 87.00% |
Table 6: CNN classification performance for combinations of the smartphone in the right pocket with another IMU sensor.

| Sensor added | Precision | Recall | F1-score |
|---|---|---|---|
| – | 73.85% | 73.42% | 73.43% |
| Left thigh | 69.01% | 67.56% | 67.64% |
| Right ankle | 97.11% | 96.75% | 96.89% |
| Left arm | 93.99% | 93.95% | 93.96% |
| Right wrist | 81.56% | 81.18% | 80.69% |
| Lower back | 95.93% | 95.78% | 95.79% |
In Table 5, the results of the single-sensor setups are presented. Also as single-sensor setups, the IMUs at the lower back and the right ankle reach the highest recognition performance, with F1-scores of 87.00% and 84.22%, respectively. The smartphone in the right pocket as a single sensor results in an F1-score of 73.43%. The improvements achieved by combining the smartphone with the IMUs are shown in Figure 6. Merged with the body-worn standalone IMU devices, gains of up to 23.46% are reached with the best combinations presented above. The worst combination is the addition of the IMU at the left thigh, which causes the score to decrease by nearly 6%. This might be caused by both sensors being at similar positions (the smartphone in the right pocket and the IMU at the left upper leg), producing redundant information while missing upper-body details.
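The precision, recall, and F1-scores reported in Tables 5 and 6, as well as the confusion matrices in Figure 5, can be reproduced from the test predictions as sketched below; whether the paper's values are macro-averaged over the seven classes, as assumed here, is not stated explicitly.

```python
# Per-combination evaluation from integer-encoded test labels and predictions.
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

def evaluate(y_true, y_pred):
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro")
    cm = confusion_matrix(y_true, y_pred, normalize="true") * 100  # row-wise %
    return precision, recall, f1, cm
```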
With lower sampling rates, the recording IMU de-
vices would consume less energy, however, less in-
formation may be preserved. With various window
sizes, the classification performance could fluctuate
drastically. To test the robustness of the classification
results in terms of energy consumption and perfor-
mance, the impacts of lower sampling rates and other
window sizes were analyzed. The dataset was downsampled to different sampling rates, segmented into different window sizes, and fed into the CNN.
Figure 6: Recognition improvement when the smartphone is combined with each body-worn standalone IMU device (right ankle, lower back, left arm, right wrist, left thigh): F1-score of the smartphone alone vs. F1-score of the combination.
The sampling rate appears to make no significant difference in the resulting test performance, but a drop of at most 5% can be noticed when comparing the results from 80 Hz data with those from data sampled at 10 Hz. Lower sampling rates still provide sufficient results, but accuracy drops significantly at frequencies lower than 6 Hz, as already investigated by (Klieme et al., 2018). Different window sizes also make little difference in the achieved scores, as long as they do not exceed 2 s and do not fall below 0.25 s. The 0.25–0.5 s windows proposed by (Banos et al., 2014) also achieve high F1-scores for our dataset, but in this experiment high recognition performance is also obtained for windows of 2 s.
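The robustness check can be sketched as a simple grid over target sampling rates and window lengths; the specific rates and window sizes below are examples and the helper names are illustrative, reusing the segment helper from the sketch in Section 2.1.1.

```python
# Resample to each target rate, subsample the labels accordingly, and
# re-segment with each window length before retraining the CNN.
from fractions import Fraction
from scipy.signal import resample_poly

def robustness_grid(filtered_data, labels, base_rate=80,
                    rates=(80, 40, 20, 10), window_sizes=(0.25, 0.5, 1.0, 2.0)):
    for rate in rates:
        frac = Fraction(rate, base_rate)
        data_r = resample_poly(filtered_data, frac.numerator, frac.denominator, axis=0)
        labels_r = labels[::base_rate // rate]  # assumes base_rate is a multiple of rate
        for win_s in window_sizes:
            # X, y = segment(data_r, labels_r, fs=rate, win_s=win_s)  # see Section 2.1.1 sketch
            yield rate, win_s, data_r, labels_r
```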
The processed data was recorded in a laboratory
setting. (Foerster et al., 1999) showed that an accuracy of 95.6% could be reached for ambulation activities performed in controlled data collection environments, but the accuracy decreased to 66% when naturally recorded data was used instead. The high performance on the data processed in this study could hence be strongly related to its controlled generation. Regarding the recorded subjects, it is also essential to note that, apart from the one person mentioned above, all participants had unrestricted mobility and were between 19 and 36 years old. Hence, the achieved results might not be valid for other groups such as people with health impairments or elderly patients. There have also been some difficulties while recording: BLE in combination with the Bonsai IMUs sometimes caused disconnections and forced the experiment to be restarted. These interruptions of the recording protocol might have distorted the results.
5 CONCLUSIONS
Based on an experiment with 18 subjects, it was
shown that an additional Bluetooth IMU sensor ar-
rangement can improve the robustness of smartphone-
based HAR for daily-life activities. Basic activities
can be recognized with a high accuracy depending
on the chosen sensor placements. The used CNN
reached a high F1-score of 96.89% for a combination
of the smartphone in the pocket and an IMU on the
right ankle. Furthermore, combining an IMU on the
lower back with the smartphone resulted in a score of
95.79% and the combination of the smartphone and
the left upper arm in a score of 93.96%. Thus, results similar to related work, such as (Zebin et al., 2016), are achieved; they reached 97.01% on six basic everyday activities using five IMUs and a CNN. Overall, recognition improvements of up to 23% are possible when combining the smartphone in the pocket with a single IMU sensor on the body, compared to processing the smartphone alone. In addition, it was shown that neither high sampling rates nor large window sizes are required for the activities in this experiment.
A possible future study could (1) discover how the
smartphone sensors can be ignored in case the smart-
phone is not worn in the pocket. One of the primary motivations for this work has been the uncertainty of smartphone positions in people's daily lives. The achieved results could be used to record a dataset with the smartphone at multiple positions where it might be worn, with additional IMUs at one (or both) of the identified sensor positions with the best improvement of recognition performance. The goal of such a study could then be to always provide a high precision rate by ignoring the smartphone if its placement provides irrelevant or even misleading information. With Body Location Independent Activity Monitoring, approaches already exist that try to solve the problem of different sensor positions, with promising results (Figueira et al., 2016).
To further validate the obtained results, (2) additional activities (such as bicycling) should be investigated. The activities conducted in this study involved either no movement or a lot of movement in the lower body, which could explain the high performance of the ankle position. Other activities with predominant movements in the upper body might be recognized better using, for example, the wrist position. Also, (3) more subjects from other groups (such as elderly people or people with restricted mobility) should be added to the dataset.
The current study only investigated one classifier
with a set of optimized hyperparameters. Despite the
high performance of the current model, it would be
worthwhile to (4) explore effects of different classi-
fiers and more extensive hyperparameter tuning.
The aim of this work was to test different sensor placements, but for more intuitive usability, (5) real-time activity recognition, as in (Andreu et al., 2011) for example, would be an important step towards making real use of HAR for medical problems.
ACKNOWLEDGEMENTS
The authors would like to thank all volunteers who
participated in the experiment and made it possible in
the first place. This research has been partly funded
by the Federal Ministry of Education and Research of
Germany in the framework of KI-LAB-ITSE (project
number 01IS19066).
REFERENCES
Ahmad, N., Raja Ghazilla, R. A., Khairi, N., and Kasi, V.
(2013). Reviews on various inertial measurement unit
(imu) sensor applications. International Journal of
Signal Processing Systems, 1(2):256–262.
Almaslukh, B., Artoli, A. M., and Al-Muhtadi, J.
(2018). A robust deep learning approach for position-
independent smartphone-based human activity recog-
nition. Sensors (Basel, Switzerland), 18(11).
Altun, K. and Barshan, B. (2010). Human activity recog-
nition using inertial/magnetic sensor units. In Salah,
A. A., Gevers, T., Sebe, N., and Vinciarelli, A., ed-
itors, Human Behavior Understanding, pages 38–51,
Berlin, Heidelberg. Springer Berlin Heidelberg.
Andreu, J., Baruah, R. D., and Angelov, P. (2011). Real
time recognition of human activities from wearable
sensors by evolving classifiers. In 2011 IEEE Inter-
national Conference on Fuzzy Systems, pages 2786–
2793. IEEE.
Atallah, L., Lo, B., King, R., and Guang-Zhong, Y. (2011).
Sensor positioning for activity recognition using wear-
able accelerometers. IEEE transactions on biomedical
circuits and systems, 5(4):320–329.
Banos, O., Galvez, J.-M., Damas, M., Pomares, H., and Ro-
jas, I. (2014). Window size impact in human activity
recognition. Sensors, 14(4):6474–6499.
Bao, L. and Intille, S. S. (2004). Activity recognition
from user-annotated acceleration data. In Interna-
tional conference on pervasive computing, pages 1–
17. Springer.
Bayat, A., Pomplun, M., and Tran, D. A. (2014). A study
on human activity recognition using accelerometer
data from smartphones. Procedia Computer Science,
34:450–457.
Bonomi, A. G., Goris, A. H. C., Yin, B., and West-
erterp, K. R. (2009). Detection of type, duration,
and intensity of physical activity using an accelerom-
eter. Medicine & Science in Sports & Exercise,
41(9):1770–1777.
Bulling, A., Blanke, U., and Schiele, B. (2014). A tuto-
rial on human activity recognition using body-worn
inertial sensors. ACM Computing Surveys (CSUR),
46(3):1–33.
Butterworth, S. (1930). On the theory of filter amplifiers. In
Wireless Engineer (also called Experimental Wireless
and the Wireless Engineer).
Chen, K., Zhang, D., Yao, L., Guo, B., Yu, Z., and Liu, Y.
(2020). Deep learning for sensor-based human activ-
ity recognition: Overview, challenges and opportuni-
ties. arXiv preprint arXiv:2001.07416.
Chen, Y., Wang, J., Huang, M., and Yu, H. (2019). Cross-
position activity recognition with stratified transfer
learning. Pervasive and Mobile Computing, 57:1–13.
Dehghani, A., Sarbishei, O., Glatard, T., and Shihab, E.
(2019). A quantitative comparison of overlapping
and non-overlapping sliding windows for human ac-
tivity recognition using inertial sensors. Sensors,
19(22):5026.
Figueira, C., Matias, R., and Gamboa, H. (2016). Body location independent activity monitoring. In Bahr, A., Abu Saleh, L., Schröder, D., and Krautschneider, W., editors, Integrated 16-Channel Neural Recording Circuit with SPI Interface and Error Correction Code in 130 nm CMOS Technology, pages 190–197, Hamburg and Setúbal. Technische Universität Hamburg Universitätsbibliothek and SCITEPRESS - Science and Technology Publications Lda.
Foerster, F., Smeja, M., and Fahrenberg, J. (1999). Detec-
tion of posture and motion by accelerometry: A vali-
dation study in ambulatory monitoring. Computers in
Human Behavior, 15(5):571–583.
Gao, X., Luo, H., Wang, Q., Zhao, F., Ye, L., and Zhang, Y.
(2019). A human activity recognition algorithm based
on stacking denoising autoencoder and lightgbm. Sen-
sors (Basel, Switzerland), 19(4).
Ghosh, A. and Riccardi, G. (2014). Recognizing human
activities from smartphone sensor signals. In Hua,
K. A., editor, Proceedings of the 2014 ACM Confer-
ence on Multimedia, November 3 - 7, 2014, Orlando,
FL, USA, pages 865–868, New York, NY. ACM.
Huynh, D. T. G. (2008). Human Activity Recognition with Wearable Sensors. PhD thesis, Technische Universität Darmstadt.
Ignatov, A. (2018). Real-time human activity recognition
from accelerometer data using convolutional neural
networks. Applied Soft Computing, 62:915–922.
Janidarmian, M., Roshan Fekr, A., Radecka, K., and Zilic,
Z. (2017). A comprehensive analysis on wearable ac-
celeration sensors in human activity recognition. Sen-
sors (Basel, Switzerland), 17(3).
Khan, A. M., Lee, Y.-K., Lee, S. Y., and Kim, T.-
S. (2010). A triaxial accelerometer-based physical-
activity recognition via augmented-signal features and
a hierarchical recognizer. IEEE transactions on infor-
mation technology in biomedicine : a publication of
the IEEE Engineering in Medicine and Biology Soci-
ety, 14(5):1166–1172.
Klieme, E., Tietz, C., and Meinel, C. (2018). Beware of
smombies: Verification of users based on activities
while walking. In 2018 17th IEEE International Con-
ference On Trust, Security And Privacy In Computing
And Communications/12th IEEE International Con-
ference On Big Data Science And Engineering (Trust-
Com/BigDataSE), pages 651–660. IEEE.
Krishnan, N. C. and Panchanathan, S. (2008). Analysis of
low resolution accelerometer data for continuous hu-
man activity recognition. In 2008 IEEE International
Conference on Acoustics, Speech and Signal Process-
ing, pages 3337–3340. IEEE.
Mannini, A. and Sabatini, A. M. (2010). Machine learning
methods for classifying human physical activity from
on-body accelerometers. Sensors, 10(2):1154–1175.
Mannini, A., Sabatini, A. M., and Intille, S. S. (2015).
Accelerometry-based recognition of the placement
sites of a wearable sensor. Pervasive and Mobile Com-
puting, 21:62–74.
Mantyjarvi, J., Himberg, J., and Seppanen, T. (2001). Rec-
ognizing human motion with multiple acceleration
sensors. In 2001 IEEE International Conference on
Systems, Man and Cybernetics. e-Systems and e-Man
for Cybernetics in Cyberspace (Cat. No. 01CH37236),
volume 2, pages 747–752. IEEE.
Mohamed, R., Zainudin, M. N. S., Sulaiman, M. N., Peru-
mal, T., and Mustapha, N. (2018). Multi-label classi-
fication for physical activity recognition from various
accelerometer sensor positions. Journal of Informa-
tion and Communication Technology, 17(2):209–231.
Orha, I. and Oniga, S. (2014). Study regarding the opti-
mal sensors placement on the body for human activity
recognition. In 2014 IEEE 20th International Sympo-
sium for Design and Technology in Electronic Pack-
aging (SIITME), pages 203–206. IEEE.
Rickham, P. P. (1964). Human experimentation. code of
ethics of the world medical association. declaration of
helsinki. British medical journal, 2(5402):177.
Rivera, P., Valerezo, E., Choi, M.-T., and Kim, T.-S. (2017).
Recognition of human hand activities based on a sin-
gle wrist imu using recurrent neural networks. Inter-
national Journal of Pharma Medicine and Biological
Sciences, 6(4).
Shoaib, M., Scholten, H., and Havinga, P. (2013). To-
wards physical activity recognition using smartphone
sensors. In Ubiquitous Intelligence and Computing,
2013 IEEE 10th International Conference on and 10th
International Conference on Autonomic and Trusted
Computing (UIC/ATC), pages 80–87. IEEE.
Su, X., Tong, H., and Ji, P. (2014). Activity recognition
with smartphone sensors. Tsinghua Science and Tech-
nology, 19(3):235–249.
Tapia, E. M., Intille, S. S., Haskell, W., Larson, K., Wright,
J., King, A., and Friedman, R. (2007). Real-time
recognition of physical activities and their intensities
using wireless accelerometers and a heart rate moni-
tor. In 2007 11th IEEE international symposium on
wearable computers, pages 37–40. IEEE.
Valarezo, E., Rivera, P., Park, J. M., Gi, G., Kim, T. Y., Al-
Antari, M. A., Al-Masni, M., and Kim, T. S. (2017).
Human activity recognition using a single wrist imu
sensor via deep learning convolutional and recurrent
neural nets. UNIKOM Journal of ICT, Design, Engi-
neering and Technological Science1, 1:1–5.
Wang, W.-z., Guo, Y.-w., Huang, B.-y., Zhao, G.-r., Liu, B.-
q., and Wang, L. (2011). Analysis of filtering meth-
ods for 3d acceleration signals in body sensor net-
work. In International Symposium on Bioelectronics
and Bioinformations 2011, pages 263–266.
Yang, J., Nguyen, M. N., San, P. P., Li, X. L., and Krish-
naswamy, S. (2015). Deep convolutional neural net-
works on multichannel time series for human activ-
ity recognition. In Twenty-Fourth International Joint
Conference on Artificial Intelligence.
Zebin, T., Scully, P. J., and Ozanyan, K. B. (2016). Human
activity recognition with inertial sensors using a deep
learning approach. In 2016 IEEE SENSORS, pages
1–3. IEEE.
Zhang, M. and Sawchuk, A. A. (2012). Usc-had: A daily
activity dataset for ubiquitous activity recognition us-
ing wearable sensors. In Proceedings of the 2012
ACM Conference on Ubiquitous Computing, pages
1036–1043.
Zhou, L., Fischer, E., Tunca, C., Brahms, C. M., Ersoy, C.,
Granacher, U., and Arnrich, B. (2020). How we found
our imu: Guidelines to imu selection and a compar-
ison of seven imus for pervasive healthcare applica-
tions. Sensors, 20(15):4090.
Zhu, C. and Sheng, W. (2009). Human daily activity recog-
nition in robot-assisted living using multi-sensor fu-
sion. In IEEE International Conference on Robotics
and Automation, 2009, pages 2154–2159, Piscataway,
NJ. IEEE.