Design of a Low-false-positive Gesture for a Wearable Device

Ryo Kawahata, Atsushi Shimada, Takayoshi Yamashita, Hideaki Uchiyama and Rin-ichiro Taniguchi

Graduate School of Information Science and Electrical Engineering, Kyushu University,

744, Motooka, Nishi-ku, 819-0395, Fukuoka, Japan

Faculty of Arts and Science, Kyushu University, 744 Motooka, Nishi-ku, 819-0395, Fukuoka, Japan

Department of Computer Science, College of Engineering Chubu University,

1200, Matsumoto-cho, 487-8501, Kasugai, Aichi, Japan

Faculty of Information Science and Electrical Engineering, Kyushu University,

744, Motooka, Nishi-ku, 819-0395, Fukuoka, Japan

Keywords:

Gesture Recognition, Wearable Device.

Abstract:

As smartwatches are becoming more widely used in society, gesture recognition, as an important aspect of

interaction with smartwatches, is attracting attention. An accelerometer that is incorporated in a device is

often used to recognize gestures. However, a gesture is often detected falsely when a similar pattern of action

occurs in daily life. In this paper, we present a novel method of designing a new gesture that reduces false

detection. We refer to such a gesture as a low-false-positive (LFP) gesture. The proposed method enables

a gesture design system to suggest LFP motion gestures automatically. The user of the system can design

LFP gestures more easily and quickly than what has been possible in previous work. Our method combines

primitive gestures to create an LFP gesture. The combination of primitive gestures is recognized quickly

and accurately by a random forest algorithm using our method. We experimentally demonstrate the good

recognition performance of our method for a designed gesture with a high recognition rate and without false

detection.

1 INTRODUCTION

Wearable devices have become widespread in soci-

ety. Various devices include eyeglass devices (e.g.,

Google Glass) and wristband devices (e.g., Nike+ Fu-

elBand), and in particular, wrist-watch-type devices,

called smartwatches, have become increasingly familiar in daily life.

People can use many applications (e.g., email,

map navigation and music player applications) on a

smartwatch. Surface gestures (e.g., tapping, swiping,

and ﬂicking) are often used when manipulating the

applications on a smartphone. However, in the case

of the smartwatch, people are forced to manipulate

the applications on a small touch screen. It has there-

fore become important to develop a new interaction

method such as interaction by motion gesture for ease

of use (Park et al., 2011).

Motion gesture enables more intuitive interaction

than interaction with a keyboard or touch screen be-

cause people only need to perform a simple action

like ﬂicking a wrist. However, an interaction sys-

tem that is based on motion gestures needs to recog-

nize the gestures with a high recognition rate and low

false positive (LFP) rate for users. To recognize ges-

tures, sensors such as an accelerometer contained in

a smartwatch are often used. An interaction system

that is based on motion gestures and used in daily life

faces the problem that the gesture recognizer will ﬁnd

it difﬁcult to distinguish between gestures for opera-

tion of an application and everyday motions.

Figure 1 shows an example of the problem. There

are four designed gestures for the operation of a music

player on a smartwatch. The two gestures of "Volume up" and "Volume down" are detected falsely when the

user is walking because the two gestures are almost

the same as the everyday motion of walking.

There are two main solutions to the problem. One

solution is for the user to press or touch a button be-

fore making gestures so as to segment gestures from

everyday motions. This is an obstacle to intuitive in-

teraction with a smartwatch because the solution re-

quires the user to use both hands to push a button

whenever the user operates applications by gestures.

Kawahata, R., Shimada, A., Yamashita, T., Uchiyama, H. and Taniguchi, R-i.

Design of a Low-false-positive Gesture for a Wearable Device.

DOI: 10.5220/0005701905810588

In Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2016), pages 581-588

ISBN: 978-989-758-173-1


Figure 1: (a) Gestures for operation of an application (Volume up, Volume down, Next song, Previous music), (b) everyday motion (walking), marked as a false positive.

The other approach is to use uncommon gestures;

i.e., gestures with sensor patterns that do not appear

frequently in daily motions. These gestures are re-

ferred to as LFP gestures. Speciﬁcally, a certain ges-

ture is used to indicate the beginning and end of ges-

tural input such as in the case of a delimiter (Ruiz and

Li, 2011) or used as a gesture for operation of an ap-

plication directly (Ashbrook and Starner, 2010). This

approach does not require a user to press a button,

but LFP gestures tend to be complex because simple

actions are often part of daily motions. Convention-

ally, interaction designers carefully design LFP ges-

tures by analyzing daily motions and by considering

the situations in which motion gestures are used. In

addition, gestural input depends on the applications.

The design of LFP gestures thus remains difﬁcult.

In this paper, we propose a method of suggesting LFP gestures automatically. Our method searches for LFP patterns of simple gestures in daily motions and suggests LFP gestures to the system user. A simple action is referred to as a primitive gesture in this paper. The combination of simple actions reduces the false positive rate. Additionally, the LFP gesture suggested by

our system does not restrict intuitive gesture interac-

tion because of its use of simple actions. In fact, in

Section 4.3, we experimentally demonstrate that one

simple action happens more frequently than two suc-

cessive primitive gestures in daily motions. The de-

tails of our method are given in Section 3.

There are two kinds of users of our system: an in-

teraction designer and a gesture user. Interaction de-

signers design the application interface and use ges-

tures for the interface. They consider the situation of

using an application and apply gestures to application

commands on the basis that there are no false detec-

tions in long motions of the situation (lasting more

than 1 week). Meanwhile, gesture users operate the

application in practice using gestures. They apply

gestures to application commands manually for ease

of use on the basis that there are no false detections

in daily motions (lasting about 1 day). We present ex-

periments assuming a gesture user as our system user

in Section 4.

2 RELATED WORK

Gesture recognition is an active area of research

on human–computer interaction (Mitra and Acharya,

2007). In particular, the recognition of hand gestures

has become more pervasive and has a wide range of

applications such as the recognition of sign language

(Zafrulla et al., 2011) and an interaction system for

surgery (Ruppert et al., 2012).

There are two approaches for recognizing hand

gestures: the use of vision-based methods and the use

of sensor-based methods. A vision-based method rec-

ognizes hand gestures to detect hand motions or hand

shapes using an RGB camera (Chen et al., 2007). This

method is based on image processing that segments

the hand area in the image. Segmentation of a hand

gesture is easily affected by illumination variations

and the positional relation between the camera and

hand, which is a large limitation in the case of a mo-

bile environment.

In contrast, a sensor-based method often uses an

accelerometer to recognize gestures (Schl¨omer et al.,

2008). Such methods have received much attention

with the widespread use of smartphones and wear-

able devices that incorporate accelerometers and gy-

roscopes. In practice, a sensor-based method is ap-

plied to operate a smartphone (Ruiz et al., 2011) and

smartwatch (Park et al., 2011). Conventional recogni-

tion methods using acceleration often focus on man-

ually segmented gestures to avoid false gesture de-

tection (Akl et al., 2011). However, considering the

continuous gesture is important for real-time applica-

tion. In handling this false-detection problem, previ-

ous research has required the user to press a button to

notify the system of gesture input (Liu et al., 2009).

In a wearable environment, pressing a button both-

ers the user because it requires the user to use both

hands. Another method of solving the false-detection

problem is improving the detector performance us-

ing a threshold. This method assumes that there is

a difference between the gesture and daily movement,

such as a difference in movement speed (Park et al.,

2011). The start point of a gesture is the time at which

the processed sensor value ﬁrst exceeds the threshold.

The method of using a threshold cannot deal with the

problem that motion patterns that are similar to the

gesture happen by chance during daily motion.

Another interesting method is to use an LFP ges-

ture. An LFP gesture is designed on the basis that the

gesture rarely appears in daily motions. This method

allows gesture interaction without pressing a button


or the false detection of gestures. Ruiz et al. de-

signed LFP gestures for mobile interaction using a

motion gesture delimiter called Doubleﬂip (Ruiz and

Li, 2011). Doubleflip is user-friendly because the gesture consists of a combination of simple actions. Ruiz

et al. evaluated the true positive rate and false posi-

tive rate of a gesture for 2100 hours of motion data.

Considering the manipulation of an application, types

of gesture depend on the application and situation.

Therefore, designing an LFP gesture is a difﬁcult task

for the gesture designer, who frequently needs to cre-

ate gestures for new applications and to determine the

false positive rates of the gestures.

Ashbrook et al. proposed a design tool for the cre-

ation of LFP gestures (Ashbrook and Starner, 2010).

The tool calculates the false positive rate of an input

gesture from daily motion. The user of the tool can

easily discriminate whether the input gesture will be

detected falsely or not in daily motion. However, the

user is required to repeat the design and input of ges-

tures many times to ﬁnd LFP gestures. Designing an

LFP gesture thus remains difﬁcult.

Kohlsdorf et al. proposed a new gesture design

tool that facilitates the design of an LFP gesture.

Their system suggests an LFP gesture automatically

from input daily motion. Employing their method,

daily motion is replaced by symbol sequences and a

low-false-rate gesture is created by ﬁnding a symbol

sequence that does not appear frequently in the input

daily motion. Their system is limited to surface gestures, which are two-dimensional gestures on a touch pad, because a gesture has to be restored from the symbol sequence.

We propose a primitive-based gesture creation

method for a gesture suggestion system. Our pro-

posed method can suggest motion gestures for the

system user using information of primitive gestures.

A primitive-based method is used in the recognition

of sign language (Bauer and Kraiss, 2002) and activ-

ity recognition (Zhang and Sawchuk, 2012).

3 PROPOSED METHOD BASED

ON PRIMITIVE GESTURES

3.1 System Overview

We propose a method of searching and suggesting

LFP motion patterns for a system that creates LFP

gestures automatically. Figure 2 presents the system

scenario. The system scenario of gesture creation is

inspired by a system made by (Kohlsdorf and Starner,

2013) but differs in the way that LFP motion patterns

are searched for and suggested. While they use symbol sequences to search for LFP motion patterns, our method searches for and suggests LFP motion patterns by considering the combination of primitive gestures.

Figure 2: System scenario. Designer: measure daily motion, input daily motion, select gestures. System (proposed method): extract movements and preprocessing, matching between movements and primitive gestures, exploring primitive sequences, visualizing gestures (e.g., UP_ROLL).

There are a huge number of hand motion patterns

in daily motion when we take into account all hand

positions, directions, and movements. It is therefore

difficult to find LFP patterns exhaustively because of the

computational cost. To ﬁnd LFP patterns, we make

one assumption about the LFP gesture. The assump-

tion is that the LFP gesture is a combination of primi-

tive gestures that are rarely detected in input daily mo-

tion. Hand motion is represented by a limited number

of motions and the system can explore LFP patterns

according to the assumption.

We here introduce the ﬂow of LFP gesture cre-

ation. First, a system user measures daily motions

using sensors in a smartwatch and inputs the daily

motions to our system. Our system runs a low-pass

ﬁlter over input daily motions and extracts periods of

high accelerometer values from the daily motions to

eliminate periods in which there is no hand motion.

Next, extracted data are matched with primitive ges-

tures and a sequence of primitive gestures (i.e., the

primitive sequence) is expressed. The proposed sys-

tem counts the number of primitive sequences in the

daily motions and ﬁnds primitive sequences that have

low occurrence in the daily motions. Finally, the system presents these primitive sequences and the user selects those that the user wants to use for an application.
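The counting step of this flow can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the daily motion has already been converted into one list of primitive labels per extracted period, and the function names `count_sequences` and `unseen_sequences` as well as the toy labels are hypothetical.

```python
from collections import Counter
from itertools import product

# The seven primitive gestures named in this paper.
PRIMITIVES = ["UP", "DOWN", "LEFT", "RIGHT", "PUSH", "PULL", "ROLL"]

def count_sequences(periods, length=2):
    """Count every run of `length` consecutive primitives within each period."""
    counts = Counter()
    for labels in periods:
        for i in range(len(labels) - length + 1):
            counts[tuple(labels[i:i + length])] += 1
    return counts

def unseen_sequences(counts, length=2):
    """Return candidate sequences that never occurred in the daily motion."""
    return [seq for seq in product(PRIMITIVES, repeat=length) if counts[seq] == 0]

# Toy labels standing in for matched daily motion (not measured data).
periods = [["LEFT", "RIGHT", "LEFT"], ["DOWN", "LEFT", "RIGHT"]]
counts = count_sequences(periods)
candidates = unseen_sequences(counts)
```

With seven primitives and sequences of length two there are only 49 candidates, which is why restricting the search to primitive combinations keeps the exploration cheap.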

3.2 Design of a Primitive Gesture

Suggesting gestures to the system user requires the reconstruction of hand motions from sensor values, which is difficult because sensor data such as accelerometer data lose motion information of the hand position and direction. Generally, multiple sensors such as those of a motion capture system are used in reconstruction and a complicated and sophisticated hand tracking method is required.

Figure 3: Primitive gestures (UP, DOWN, LEFT, RIGHT, PUSH, PULL and ROLL).

The proposed method uses information of primi-

tive gestures for the reconstruction. Primitive gestures

are components of motion gestures. In previous re-

search, primitive gestures have been constructed em-

ploying an unsupervised clustering algorithm (Zhang

and Sawchuk, 2012) (Bauer and Kraiss, 2002). First,

sensor data are divided into a sequence of ﬁxed-

length-window cells (i.e., segments) and the feature

vector for each segment of the sequence is calculated.

Segments are then clustered according to their feature

vectors and the center of a cluster is taken as a primitive gesture. As a result, the vocabulary size of the primitive gestures depends on the clusters obtained from sensor data. This is inconvenient for suggesting a specific LFP gesture because such a gesture cannot be expected to emerge from the clustered primitive gestures. Therefore, in our method, the primitive gestures are defined in advance by ourselves.

The use of predeﬁned primitive gestures allows us to

ﬁnd certain motions from sensor data and we can thus

represent sensor data with the predeﬁned motions. As

a result, the proposed method can reconstruct a se-

quence of predeﬁned motions from sensor data. Fur-

thermore, it can reconstruct hand motions more eas-

ily with only one accelerometer sensor than a motion

capture system.

Figure 3 shows seven primitive gestures for our

proposed method. These primitive gestures consist of

simple and short movements so as to avoid motion

complexity when primitives are combined. The sen-

sors are oriented upwards because of visual feedback.

3.3 Preprocessing

Sensor data include much noise around high-

frequency components, which is an obstacle to

achieving high recognition performance. We adopt

the weighted moving average to smooth the sensor

data.
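A weighted moving average of this kind can be sketched as below. The paper does not state the window length or the weights, so the linearly increasing weights `(1, 2, 3)` and the function name are assumptions made only for illustration.

```python
def weighted_moving_average(samples, weights=(1, 2, 3)):
    """Smooth a 1-D signal; the last weight applies to the newest sample."""
    n = len(weights)
    smoothed = []
    for i in range(len(samples)):
        # Near the start of the signal, use only the samples that exist.
        window = samples[max(0, i - n + 1): i + 1]
        w = weights[-len(window):]
        smoothed.append(sum(x * k for x, k in zip(window, w)) / float(sum(w)))
    return smoothed
```

Each accelerometer axis would be smoothed separately with such a filter before any thresholding.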

In our case, it is desirable only to handle data of

hand movement (what we call the movement area) in

daily motions. The recording of daily motion involves

the collection of much data but no predefined movement area, and treating all data is thus a waste of computational time. We extract the movement area using a threshold-based method. Let $A = (a_1, a_2, \ldots, a_n)$ denote the time series of acceleration and $a_i = (a_x, a_y, a_z)$ denote acceleration. We evaluate the amplitude of movement $G = (G_x, G_y, G_z)$ by comparison between two observations, $a_i$ and $a_{i-N}$:

$$G = |a_i - a_{i-N}| \tag{1}$$

The extraction of the movement starts when $G_x$, $G_y$ or $G_z$ is higher than the threshold at the start point, $\mathit{Th}_s$. The end point of the extraction is decided by two thresholds: one on $G$ and the other on the time domain. In our method, we handle continuous gestures such as the primitive sequences. Therefore, we set a temporal threshold $T_e$ on the end point of the extraction so as not to split continuous gestures. The extraction ends when $G_x$, $G_y$ and $G_z$ are all smaller than $\mathit{Th}_e$ for a period of $T_e$. The area extracted by the thresholds is called the extracted period in this paper.
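The extraction rule described above (start when any axis exceeds a start threshold, end when all axes stay below an end threshold for a given time) can be sketched as follows. The function and parameter names are hypothetical, and expressing the temporal threshold as a sample count (0.7 s at 50 Hz gives 35 samples) is an assumption of this sketch.

```python
def extract_periods(acc, n_lag=5, th_start=0.1, th_end=0.05, hold=35):
    """Return (start, end) sample index pairs of movement periods.

    acc      -- list of (x, y, z) acceleration samples in units of G
    n_lag    -- N, the lag between the two compared observations
    th_start -- start threshold; th_end -- end threshold
    hold     -- temporal threshold expressed in samples
    """
    periods, start, quiet = [], None, 0
    for i in range(n_lag, len(acc)):
        # Per-axis amplitude of movement, as in Eq. (1).
        g = [abs(a - b) for a, b in zip(acc[i], acc[i - n_lag])]
        if start is None:
            if max(g) > th_start:        # any axis exceeds the start threshold
                start, quiet = i, 0
        elif max(g) < th_end:            # all axes below the end threshold
            quiet += 1
            if quiet >= hold:            # quiet long enough: close the period
                periods.append((start, i - quiet + 1))
                start = None
        else:
            quiet = 0                    # movement resumed: keep the period open
    if start is not None:                # signal ended during a movement
        periods.append((start, len(acc)))
    return periods
```

Resetting the quiet counter whenever movement resumes is what keeps two successive primitive gestures inside one extracted period.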

It is desirable to normalize the sensor data in han-

dling the variability of gestures. Measured accelera-

tion consists of two components: a dynamic compo-

nent and gravitational component. The dynamic com-

ponent relates to movement while the gravitational

component relates to the change in tilt of the device.

The variability of the sensor tilt affects recognition

performance. The mean of the measured acceleration on each axis is the best estimate of the gravitational component. We normalize the measurement data

extracted via the threshold method by subtracting the

mean from the data.
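As a minimal sketch of this normalization (the function name `remove_gravity` is hypothetical), the per-axis mean of an extracted period is subtracted from every sample:

```python
def remove_gravity(period):
    """Subtract the per-axis mean from a list of (x, y, z) samples."""
    n = float(len(period))
    means = [sum(s[k] for s in period) / n for k in range(3)]
    return [tuple(s[k] - means[k] for k in range(3)) for s in period]
```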

3.4 Feature Representation

The proposed method is similar to a bag-of-features

method (Zhang and Sawchuk, 2012) when extracting

features from the data of an accelerometer. The pro-

posed method calculates the gradient of acceleration

as a feature. The calculation ﬂow is shown in Fig-

ure 4(a). First, as shown in Figure 4(a-1), the pro-

posed method separates sensor data, extracted by a

time window, into subsequences. The length of a subsequence is $l_{sub}$ and subsequences are extracted with shifting size $l_{shift}$. Next, the gradient of accelerometer data is calculated for each subsequence and quantized into 5 levels, as shown in Figure 4(a-1). Then, a gradient histogram is made as shown in Figure 4(a-2). The proposed method divides a set of subsequences into $n_{win}$ sub-windows and produces a histogram for each sub-window. Generally, a bag-of-features method ignores the order of observations, which causes confusion between movements such as LEFT and RIGHT. To solve this problem, the proposed method creates a histogram in each sub-window. Finally, the histograms are concatenated to represent a feature vector.

Figure 4: Feature calculation and matching between daily motion and primitive gestures. (a) Feature calculation: (a-1) calculation of gradient and quantization, (a-2) generation of histogram. (b) Matching: (b-1) current recognition, (b-2) next recognition.
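The feature calculation can be sketched for a single axis as follows. This is an illustration under stated assumptions rather than the authors' code: the slope is taken as the end-to-end difference of each subsequence, the four quantization edges are arbitrary, and all names are hypothetical.

```python
def gradient_levels(signal, l_sub=6, l_shift=1,
                    edges=(-0.02, -0.005, 0.005, 0.02)):
    """Quantize the slope of each subsequence into levels 0..4."""
    levels = []
    for s in range(0, len(signal) - l_sub + 1, l_shift):
        slope = (signal[s + l_sub - 1] - signal[s]) / float(l_sub - 1)
        levels.append(sum(slope > e for e in edges))  # number of edges passed
    return levels

def feature_vector(signal, n_win=4, **kw):
    """Concatenate one 5-bin histogram per sub-window, preserving order."""
    levels = gradient_levels(signal, **kw)
    feature = []
    for j in range(n_win):
        lo = j * len(levels) // n_win
        hi = (j + 1) * len(levels) // n_win
        hist = [0] * 5
        for lv in levels[lo:hi]:
            hist[lv] += 1
        feature.extend(hist)
    return feature

# A rise-then-fall signal and its mirror image (toy data).
up_down = [i / 10.0 for i in range(10)] + [(9 - i) / 10.0 for i in range(10)]
down_up = [(9 - i) / 10.0 for i in range(10)] + [i / 10.0 for i in range(10)]
```

With a single histogram (`n_win=1`) the two toy signals produce the same feature, whereas with four sub-windows they differ, which illustrates why per-sub-window histograms help to separate mirrored movements.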

3.5 Matching between Daily Motion

and Primitive Gestures

The proposed method employs a time-series match-

ing method for mapping between daily motion and

primitive gestures. Dynamic time warping (DTW)

is a general approach for time-series matching (Liu

et al., 2009) (Akl et al., 2011) and allows us to cal-

culate the distance between two temporal sequences,

which may differ in length. DTW attempts to match

all training samples one by one and has a high compu-

tational cost. It thus takes a long time to match daily

motion measured over a long time and primitive ges-

tures.

The proposed method uses the random forest algorithm (Liaw and Wiener, 2002) to reduce computational cost. The random forest is a method of ensemble learning for multi-class classification. Multiple

decision trees constitute a random forest and they are

trained to control variance. In the testing phase, the

random forest algorithm uses a discriminant function

obtained in the training phase to map between daily

motion and primitive gestures at high speed.

There are often variations between training and

testing samples in the direction of the time axis.

To handle these variations, we generate new training samples by expanding, shrinking and shifting the original training samples along the time axis. Original training samples are expanded by linear interpolation and shrunk by decimating samples at regular intervals.
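The three augmentation operations can be sketched for a one-dimensional sample as below. The function names are hypothetical, and padding with the edge value when shifting is an assumption, since the paper does not say how shifted samples are filled.

```python
def stretch(sample, new_len):
    """Expand to new_len points by linear interpolation (new_len >= 2)."""
    out = []
    for k in range(new_len):
        pos = k * (len(sample) - 1) / float(new_len - 1)
        i = int(pos)
        frac = pos - i
        nxt = sample[min(i + 1, len(sample) - 1)]
        out.append(sample[i] * (1 - frac) + nxt * frac)
    return out

def decimate(sample, step):
    """Shrink by keeping every step-th point."""
    return sample[::step]

def shift(sample, offset):
    """Shift along the time axis, padding with the edge value."""
    if offset >= 0:
        return [sample[0]] * offset + sample[:len(sample) - offset]
    return sample[-offset:] + [sample[-1]] * (-offset)
```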

A matching between daily motion and primitive

gestures is sequentially performed. The matching al-

gorithm is shown in Figure 4(b). To handle the varia-

tion of gesture length, we set up several sizes of time

windows for matching. A time window consists of

subsequences deﬁned in Section 3.4, so that a feature

vector of each time window is represented by a concatenated histogram given in Figure 4(a). To simplify the explanation, we denote a time window as $w_k$ and its length as $|w_k|$. For each time window $w_k$, we first acquire a candidate primitive gesture as the class $c$ with the highest matching probability. Then, we select the window $\hat{w}$ that has the highest matching probability among all windows, and regard the class label of $\hat{w}$ as the recognition result. To achieve sequential recognition, we have to define the start point of recognition according to the previous recognition processing. Let $r[i]$ be the start point of the current recognition, shown in Figure 4(b-1); the issue is to set the start point of the next recognition $r[i+1]$, given in Figure 4(b-2). As explained above, we acquire the recognition result for $r[i]$ as the class $c$ recognized in $\hat{w}$; therefore, the time length of $\hat{w}$ is simply added to $r[i]$ to start the next recognition:

$$r[i+1] = r[i] + |\hat{w}| \tag{2}$$

We repeat this sequential recognition processing $l_{seq}$ times by updating $r[i]$. For instance, if we would like to recognize two successive primitive gestures, we have to set $l_{seq}$ to 2.

4 EXPERIMENT

In this section, we report two experiments for evalua-

tion of recognition performance and true positive and

false positive rates of gestures created by our system.

First, we investigate primitive patterns searched for

by our system from daily motions in our laboratory

and discuss characteristics of the gestures. Next we

compare the proposed method with the DTW method

in terms of their performance in recognizing gestures

obtained in the ﬁrst experiment.

4.1 Dataset and Parameters

In these experiments, we measured daily motions in our laboratory. These daily motions included hand motions made while, for example, using a computer, eating a meal, reading and writing, and walking. The major activity of the daily motion was the use of the computer.

Figure 5: (a) Accelerometer and sensor axis, (b) sensor position.

We used the accelerometer shown in Figure 5,

which made measurements at 50 Hz. This wireless

sensor can record sensor data in internal memory and

work continuously for 4 hours. As shown in Figure 5,

we attached this sensor to the forearm, as if we were

using a smartwatch.

The daily motion in the laboratory was measured

for one subject on separate days. The subject was

instructed not to use primitive gestures deliberately.

The total measurement time was 24 hours. In terms

of the primitive gestures, we collected 20 samples per

gesture for training the random forest algorithm.

The parameters for preprocessing were set to $\mathit{Th}_s = 0.1$ G, $\mathit{Th}_e = 0.05$ G, $T_e = 0.7$ s, and $N = 5$ in our experiment. In terms of the matching parameters, we empirically set $l_{sub} = 6$, $l_{shift} = 1$, $n_{win} = 4$, $l_{seq} = 2$, and $\{|w_1|, |w_2|, \ldots\} = \{10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50\}$ in this paper.

4.2 Comparative Approach

The proposed method replaces measurement data

with primitive sequences to search for LFP patterns.

Additionally, gestures created by our system are rec-

ognized by matching between measurement data and

primitive gestures. From the above, the searched pat-

terns and performance of recognition of the created

gestures depend on the matching method.

We used two kinds of DTW method for matching. The first method is conventional DTW. This method fixes the start and end points of the calculated distance. We used the matching path and local distance from previous research (Liu et al., 2009) for a comparative approach. When matching between an extracted period and primitive sequences, a sliding window is used. The window size is set to the mean length of the primitive gestures in the training samples, and the window strides at intervals of half the window size.

Figure 6: Data mapping between two time series of data (query value against index): (a) conventional DTW, (b) open-end DTW.

Figure 7: Top six primitive sequences (count per primitive sequence, for single primitives and for combinations).

The second method is open-end DTW. This

method can perform partial matching because the end

point is ﬂexible, and it is thus often used for contin-

uous word recognition. See (Mori et al., 2006) (Oka,

1998) for more information. We used the matching

path from previous research (Mori et al., 2006) and

the same local distance as previously used.

An example of the difference in matching between

the two methods described above is shown in Figure

6. The conventional DTW method maps the whole of one time series onto the whole of the other under a boundary constraint. In contrast, open-end DTW leaves the end point free and can thus find the part of one time series that best matches the other.
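The difference between the two variants can be sketched on one-dimensional data as follows. This is a textbook formulation with an absolute-difference local cost and a symmetric step pattern; it is not the matching path or local distance of (Liu et al., 2009) or (Mori et al., 2006), and the function names are hypothetical.

```python
def _cost_matrix(a, b):
    """Accumulated DTW cost with both start points fixed."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost

def dtw(template, data):
    """Conventional DTW: the end points are fixed as well."""
    return _cost_matrix(template, data)[-1][-1]

def open_end_dtw(template, data):
    """Open-end DTW: match the whole template to the best prefix of data."""
    last_row = _cost_matrix(template, data)[-1]
    end = min(range(1, len(data) + 1), key=lambda j: last_row[j])
    return last_row[end], end
```

For a template `[0, 1, 2]` and data `[0, 1, 2, 9, 9]`, conventional DTW is forced to absorb the trailing samples, whereas the open-end variant stops after the third sample and reports where the match ended.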

4.3 Characteristics of Created Gestures

We investigate the primitive sequences searched for

by our system and count the primitive gestures in

daily motions in the laboratory. The maximum length

of a primitive sequence $l_{seq}$ is set to two primitive gestures because the duration of one primitive gesture is about 0.8 seconds and a duration of three primitive gestures or more (2.4 seconds or longer) is a burden on the user.

The top six gestures in terms of the count are shown in Figure 7. We rejected a pattern if its dissimilarity to all primitive gestures exceeded a threshold. Therefore, the absolute number of occurrences of primitive gestures depended on the threshold. In this case, there was a large number of single primitive patterns. As a result, a single primitive pattern would be detected falsely more often than a combination of primitive patterns if such primitives were used as input gestures for a system.

Table 1: Primitive sequences not appearing in daily motion.

DOWN_DOWN    RIGHT_ROLL   ROLL_RIGHT
LEFT_ROLL    ROLL_DOWN    ROLL_ROLL
PULL_ROLL    ROLL_LEFT    ROLL_UP
PULL_UP      ROLL_PULL    UP_PUSH
PUSH_ROLL    ROLL_PUSH    UP_ROLL

Meanwhile, some primitive sequences did not appear in the daily motions. Table 1 gives the primitive sequences that none of the three recognition methods observed. These primitive sequences often included the ROLL gesture. The ROLL gesture is thus resistant to false detection.

Table 2: Time required to search for primitive sequences and the LFP rate for daily motions over a period of 24 hours.

Method              Time [s]
Conventional DTW    119
Open-End DTW        809
Random Forest       67

The time required to search for primitive se-

quences on a laptop computer having an Intel Core i7

2.8-GHz CPU with 8 GB RAM is given in Table 2. The random forest algorithm is the fastest and can search for LFP patterns about 12 times as fast as the open-end DTW method (67 s versus 809 s).

4.4 Accuracy

We evaluated the performance of recognition of prim-

itive sequences for each recognition method. The

recognition of primitive sequences by our system should achieve a high recognition rate and a low false positive rate for users. We selected ROLL_ROLL gestures and UP_ROLL gestures for the evaluations on the basis of the results presented in Section 4.3. A precision–recall curve was used for this evaluation. We prepared a separate 4-hour recording of daily motions in our laboratory for evaluation.

When recognizing a specified primitive sequence, the proposed method calculates the evaluation value (distance or similarity) only for the extracted period and the primitive gestures that correspond to the specified sequence. For example, when we specify the UP_ROLL gesture, the evaluation value is calculated only for the extracted period and the UP gesture as the first primitive. The ROLL gesture is then used to calculate the evaluation value for the second primitive. The evaluation values are then summed and their mean is computed. Finally, the primitive sequence is recognized by applying a threshold to the mean value. In our case, the estimated primitive sequence does not always correspond to the specified sequence even when the threshold is adjusted correctly, because the lengths of the primitive sequences differ when the recognizer erroneously finds primitive gestures in the primitive sequences. Therefore, recall is not always 1 when using the threshold.

Figure 8: Precision–recall curves of (a) the UP_ROLL gesture and (b) the ROLL_ROLL gesture for the random forest, conventional DTW and open-end DTW.

Figure 9: Example of false recognition: (a) ground truth (ROLL_ROLL), (b) estimated result (ROLL).

The results of the recognition performance of UP_ROLL gestures and ROLL_ROLL gestures are shown in Figures 8(a) and 8(b), respectively. For both the random forest algorithm and open-end DTW, there was a threshold at which the specified gestures were recognized at a high recognition rate with no false detection. Conventional DTW performed worse.
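How one point of such a precision–recall curve arises from a threshold on the mean evaluation value can be sketched as follows. The sketch assumes the similarity case (higher is better), a two-primitive sequence, and hypothetical function names; returning a precision of 1.0 when nothing is accepted is a convention chosen here, not taken from the paper.

```python
def mean_score(first_prob, second_prob):
    """Mean evaluation value of a two-primitive sequence (similarity case)."""
    return (first_prob + second_prob) / 2.0

def precision_recall(scores, is_gesture, threshold):
    """Precision and recall when periods scoring >= threshold are accepted."""
    pairs = list(zip(scores, is_gesture))
    tp = sum(1 for s, g in pairs if s >= threshold and g)
    fp = sum(1 for s, g in pairs if s >= threshold and not g)
    fn = sum(1 for s, g in pairs if s < threshold and g)
    precision = tp / float(tp + fp) if tp + fp else 1.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    return precision, recall
```

Sweeping the threshold over the observed scores and collecting the resulting pairs traces the curve; a threshold with precision 1.0 corresponds to recognition with no false detection.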

4.5 Discussion

Although our system often found LEFT and RIGHT gestures, which are simple actions, in daily

motions in our laboratory, the ROLL gesture was not

detected frequently. The daily motions include many

actions relating to using a computer mouse. When us-

ing a mouse, a hand moves horizontally on a desk. As

a result, primitive gestures such as LEFT and RIGHT,

which include hand movement parallel to the ground,

were often detected.

There were some cases in which two successive gestures were mistakenly recognized as a single gesture. We show an example of false recognition in Figure 9. In this example, the ROLL_ROLL gesture is falsely recognized as a ROLL gesture by the recognizer. Employing our method, the recognizer uses the gradient of acceleration. The outline of this ROLL_ROLL gesture is similar to that of the ROLL gesture in terms of this feature. This problem can be solved if we adjust the size of the window used to extract sensor data when matching.

The presented experiments demonstrate that the

random forest algorithm has recognition performance

similar to that of open-end DTW at high computa-

tional speed. The matching speed is important not

only for intuitive interaction but also for the usability

of our system. Practically, the dataset of daily mo-

tion will be longer than 24 hours in some cases. The

random forest algorithm is suitable for our system de-

signed to ﬁnd optimal gestures for certain applications

and situations quickly.

5 CONCLUSION AND FUTURE

WORK

For intuitive interaction with wearable devices, gesture recognition has advantages over traditional methods such as gestures on a touch pad. In terms of recognizing gestures correctly for a smartwatch, the false detection of gestures is a major problem.

We proposed a primitive-based gesture recognition approach to solve this problem. The approach creates new gestures that are resistant to false detection in daily motions. We envisage a system for LFP gesture creation that records daily motion data from users and searches for LFP patterns in the daily motions employing our proposed method. The system searches for and visualizes LFP motion gestures by focusing on primitive gestures.
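
The search for LFP patterns can be sketched as follows, assuming that the daily motion data have already been converted into a stream of recognized primitive labels. The function names, the max_count parameter, and the toy label stream are hypothetical; only the primitive names LEFT, RIGHT, and ROLL come from our experiments.

```python
from collections import Counter
from itertools import product


def count_sequences(labels, length):
    """Count every run of `length` consecutive primitive labels."""
    return Counter(tuple(labels[i:i + length])
                   for i in range(len(labels) - length + 1))


def suggest_lfp_sequences(labels, primitives, length, max_count=0):
    """Return primitive sequences seen at most `max_count` times.

    Candidates are all combinations of the given primitives; those that
    are rare in the daily-motion labels are suggested as LFP gestures.
    """
    counts = count_sequences(labels, length)
    candidates = product(primitives, repeat=length)
    rare = [seq for seq in candidates if counts[seq] <= max_count]
    return sorted(rare, key=lambda seq: (counts[seq], seq))


# Toy daily-motion stream dominated by mouse-like LEFT/RIGHT movement.
daily = ["LEFT", "RIGHT", "LEFT", "RIGHT", "LEFT", "ROLL", "RIGHT", "LEFT"]
suggestions = suggest_lfp_sequences(daily, ["LEFT", "RIGHT", "ROLL"], length=2)
```

In this toy stream, the pair (ROLL, ROLL) never occurs and is therefore among the suggestions, while (LEFT, RIGHT) and (RIGHT, LEFT) occur repeatedly and are excluded. The sketch ignores the timing of primitives and the pauses between them.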

In future work, we will continue to evaluate our proposed method with multiple people and investigate a way of visualizing primitive sequences through the evaluation. In addition, we will verify the validity of our method for seven primitive gestures.

ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods
