TOWARDS AFFORDABLE GESTURE BASED INTERFACES
An Exploration with Wii Remotes
Amy Ciavolino
CSEE, CoEIT, University of Maryland, Baltimore County, U.S.A.
Camille Marvin
Department of Computer Science, Harvey Mudd College, Claremont, CA, U.S.A.
Jason Creighton, James Coddington, Hans-Peter Bischof, Reynold Bailey
Rochester Institute of Technology, Henrietta, NY, U.S.A.
Keywords:
Natural user interface, Gesture control, Wii remote, User interaction.
Abstract:
The traditional keyboard and mouse computer interface is well suited for 2D applications such as document
editing, but as 3D environments become more prevalent there arises a need for new methods of user input.
An immersive 3D interface is preferable, but is often cost prohibitive. This paper presents the design and
implementation of a Natural User Interface (NUI), named NuWii. NuWii was designed to be used with the
Spiegel visualization framework for astrophysical data. The interface makes use of Nintendo Wii Remotes
as infrared tracking cameras to detect 3D gestures made by the user. These gestures are interpreted and used
to control the viewing camera’s parameters throughout the course of a visualization. Our project provides a
new, more intuitive way of manipulating the camera in the Spiegel visualization system. Furthermore, NuWii
provides an expandable base that could be used for gesture control in many other applications.
1 INTRODUCTION
Spiegel is a visualization system that was developed
to process and visualize large multidimensional data
from simulations of galactic events such as black-hole
mergers, event horizons, and gravity waves (Bischof,
2010; Bischof et al., 2006). Previous methods of user
input for working with 3D models in the Spiegel vi-
sualization framework were not intuitive. Ideally, we
want astrophysicists and other users to be able to view
and interact with the simulations in a simple and nat-
ural way. To accomplish this, we created NuWii, a
system that captures the user’s motions in 3D and uses
them to control the camera position in Spiegel. NuWii
is designed to be easily expandable to other applica-
tions that require 3D gesture input. Our implementa-
tion uses a two-level gesture hierarchy to accommo-
date custom gesture input. We designed NuWii to be
portable, easy to set up, and affordable. NuWii uses
two Nintendo Wii Remotes to capture the gesture in-
put from the user. We have also developed an algori-
thm to extract the 3D point from the two images ac-
quired by the Wii Remotes instead of using propri-
etary software in order to keep the cost down for any-
one expanding upon our project.
2 PRIOR WORK IN THE FIELD
A significant amount of work has been done in the
area of human-computer interaction, 3D point recog-
nition, and natural user interfaces. A 3D hand recog-
nition system was presented in 2009 (Wang and
Popović, 2009) and efforts are under way to develop a
3D user interface similar to the one seen in the movie
Minority Report (Underkoffler, 2010). There has even
been some previous experimentation with 3D interac-
tion in Spiegel (Bak, 2004). Several projects have
explored the use of Wii Remotes for stereo-vision
(Dehling, 2008), motion capture (Wang and Huang,
2008), and finger tracking (Lee, 2008). Most of the
projects done with Wii Remotes use MATLAB for ca-
mera calibration and resolving points in 3D space.
However, using MATLAB limits the audience, afford-
ability, and flexibility of the software. Some previous
papers tracked multiple points in 3D, but did so only
in the context of head tracking (Cuypers et al., 2009).
Head tracking assumes a limited range of motion of
the points, making it easier to distinguish them. Other
projects involved only minimal error checking (Hay
et al., 2008). In contrast, NuWii is able to track two
points under more general conditions.
3 HARDWARE
We elected to use the Nintendo Wii Remote for its afford-
ability, availability, and specialized camera hardware.
In NuWii, the cameras on the front of the Wii Re-
motes pick up infrared light reflected off the user’s
finger tips. Two arrays of infrared LEDs supply the
infrared light and finger slips, made from reflective
tape, reflect it back to the cameras. We also machined
a wooden board to hold the Wii Remotes in place. The
notches in our board hold the Wii Remotes two feet
apart and angled 22.5 degrees inward, but any reason-
able angle with overlapping fields of view could be
used. All the hardware used in NuWii is available to
the average consumer. A photograph of the NuWii
setup is shown in Figure 1.
Figure 1: Photograph of the NuWii setup. The user’s gestures
are being tracked by the two infrared cameras.
3.1 The Wii Remote
The Wii Remote has a built-in 128x96 monochrome
camera. By using sub-pixel analysis, it can track up to
four separate points of infrared light and return their
coordinates in a 1024x768 range (Wii Brew, 2010).
The field of view of the camera is 33.75 degrees ver-
tical and 45 degrees horizontal. These values can be
derived from the resolution of the camera (Lee, 2008).
The Wii Remotes use the Bluetooth HID proto-
col to communicate with their host. However, they do
not use the standard data types and are meant only to
communicate with a Wii gaming system. This lack of
complete compliance makes connecting the Wii Re-
motes to a computer somewhat complicated, but once
the connection has been made we have found it to be
stable. A great number of libraries in a variety of high
level languages have been written to facilitate easier
communication with the Wii Remotes and their pe-
ripherals (Wii Brew, 2010). We chose the motej li-
brary because it is open source, allowing us a greater
understanding of how it communicates with the Wii
Remotes. Additionally, both Spiegel and motej are
written in Java, making integration straightforward.
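To make the software boundary concrete, the following Java sketch shows the callback through which IR reports could enter our tracking code. The names are our own illustration: a concrete implementation would wrap motej's IR camera listener, whose actual class and method names should be taken from the motej documentation.

public interface IrSampleListener {
    /** Called once per camera report with the IR points currently
     *  visible to one Wii Remote (the hardware tracks at most four). */
    void onSample(int cameraId, IrPoint[] points);

    /** One raw camera point; x is in [0, 1024) and y is in [0, 768). */
    final class IrPoint {
        public final int x, y;
        public IrPoint(int x, int y) { this.x = x; this.y = y; }
    }
}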
3.2 Infrared LED Arrays
Each Wii Remote is surrounded with infrared LEDs
that supply light to be reflected back to the cameras.
For each Wii Remote, we made an array of 48 LEDs.
The wavelength of the LEDs is 940 nm which is opti-
mal for the Wii Remotes’ cameras (Wii Brew, 2010).
The LEDs are arranged in eight groups wired in paral-
lel. Each group contains 6 LEDs and a 75 Ohm resis-
tor in series. The LED arrays are powered by a 12V,
1 amp power supply.
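As a rough check on the power budget, assume a forward drop of about 1.2 V per 940 nm LED (an assumed datasheet value, not a measurement from this work). Each branch then drops 6 × 1.2 V = 7.2 V across its LEDs, leaving 12 V − 7.2 V = 4.8 V across the 75 Ohm resistor, i.e., 4.8 V / 75 Ω = 64 mA per branch. The eight parallel branches together draw about 8 × 64 mA ≈ 512 mA, comfortably within the supply's 1 amp rating.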
3.3 Finger Slips
We created two finger slips out of reflective tape that
easily slide on and off the user’s index finger and
thumb (Figure 2). The finger slips completely cover
the tips of the user’s fingers to ensure that the light
from the LEDs is reflected back to the cameras from
any angle. We built the finger slips out of 3M 3000X
Very High Gain Reflective Tape. In order to hold
the reflective tape together we used small pieces of
Nathan 3M Reflective tape. We also built finger slips
completely out of Nathan 3M Reflective tape. These
finger slips seemed to work just as well and were sig-
nificantly less expensive than their high gain counter-
parts. Nathan 3M Reflective tape is also more flexible
which made it easier to form the top of the finger slips.
Figure 2: Finger slips made of reflective tape. Nathan 3M
tape (index finger) and Very High Gain tape (thumb).
4 STEREO-VISION
To locate a point in 3D, data from two or more Wii
Remotes must be combined. NuWii uses two Wii Re-
motes. A third Wii Remote could be added to improve
accuracy, but it would require a more advanced setup
and increase the overall cost of the system. Our al-
gorithm uses trigonometry to find the location in 3D
space, given the angle of the Wii Remotes and their
distance apart.
4.1 Algorithm
The algorithm takes input from two Wii Remotes.
The output from the Wii Remotes is in the form of
(X, Y) points in the range (0, 0) to (1024, 768). The
algorithm assumes that the Wii Remotes are in the
same y and z plane and are placed at known angles in
the x plane. Deviating from the specified orientation
will produce errors in the final result. It is possible to
get usable output from the algorithm without know-
ing the distance between the Wii Remotes. However,
if the distance is known then the final output will have
the same unit as the distance. Steps 1-3 are repeated
for the input from both the left and right cameras.
Step 1. The range of the camera input is altered to
go from (-512, -384) to (512, 384). This is
done by subtracting half the maximum val-
ues for the respective axis (Figure 3).
Figure 3: Transforming the setup coordinates.
Step 2. Using the altered input, the points are nor-
malized by dividing the x value by 512, and
the y value by 384. The x result is then mul-
tiplied by half the horizontal field of view of
the camera and the y result is multiplied by
half the vertical field of view in order to find
the angles shown in Figures 4 and 5.
θ = (X / 512) × (45 / 2)  (1)
φ = (Y / 384) × (33.75 / 2)  (2)
Step 3. Trigonometry is used to find a normalized
point one unit away from the cameras.
X′ = tan(θ)  (3)
Y′ = tan(φ)  (4)
Z′ = 1  (5)
Figure 4: Top View.
Step 4. After the normalized points are found, they
are rotated by α, the angle between the Wii
Remotes and the y-z plane shown in Fig-
ure 6. Rotating the points is not required if
the Wii Remotes are parallel. The following
equations are used to rotate the points:
X′ = X′ cos(α) + sin(α)  (6)
Y′ = Y′  (7)
Z′ = −X′ sin(α) + cos(α)  (8)
Step 5. The distance between the Wii Remotes is
added to the x value of the points from the
right Wii Remote (the right Wii Remote will
be on the left when looking into the cameras
of the remotes). The units of the final output
will be the same as the units used to measure
this distance. If the distance between the Wii
Remotes is unknown then 1 should be added
instead.
X′r = X′r + d  (9)
This step moves the points read by the right
camera into the correct coordinate system in
relation to the left camera.
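For illustration, the five steps reduce to a short routine. The following Java sketch (class, method, and parameter names are our own, but the arithmetic follows Equations 1 through 9) maps one raw camera point into the left remote's coordinate system:

public final class PointMapper {
    private static final double H_FOV = 45.0;   // horizontal field of view, degrees
    private static final double V_FOV = 33.75;  // vertical field of view, degrees

    /**
     * @param rawX   raw camera x in [0, 1024)
     * @param rawY   raw camera y in [0, 768)
     * @param alpha  this remote's inward mounting angle (radians, signed;
     *               0 if the remotes are parallel)
     * @param xShift 0 for the left remote, the baseline distance d for the right
     * @return a point one unit in front of the camera, expressed in the
     *         left remote's coordinate system
     */
    public static double[] toRayPoint(int rawX, int rawY,
                                      double alpha, double xShift) {
        // Step 1: re-center so (0, 0) is the middle of the image.
        double cx = rawX - 512.0;
        double cy = rawY - 384.0;
        // Step 2: normalize and scale by half the field of view (Eqs. 1-2).
        double theta = Math.toRadians((cx / 512.0) * (H_FOV / 2.0));
        double phi   = Math.toRadians((cy / 384.0) * (V_FOV / 2.0));
        // Step 3: normalized point one unit in front of the camera (Eqs. 3-5).
        double x = Math.tan(theta);
        double y = Math.tan(phi);
        // Step 4: rotate about the y axis by the mounting angle (Eqs. 6-8).
        double xr = x * Math.cos(alpha) + Math.sin(alpha);
        double zr = -x * Math.sin(alpha) + Math.cos(alpha);
        // Step 5: shift the right camera into the left camera's frame (Eq. 9).
        return new double[] { xr + xShift, y, zr };
    }
}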
Rays are then created that originate at the Wii Remotes
and pass through the newly calculated left and right
points. Figure 6 shows that the left camera is at the
origin (0,0,0) and the right camera is at (d,0,0) where
d is the distance between the Wii Remotes. A ray col-
lision algorithm is used to find the location along the
rays where they are closest to colliding. This location
is the approximate position of the point. The exact lo-
cation of the collision cannot be calculated because
the rays will not collide perfectly due to error in the
data captured by the Wii Remotes.
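The ray collision step can be realized with a standard closest-point-of-approach computation between two skew lines. The sketch below is one such implementation (our own choice of formulation, not necessarily the exact code in NuWii); it returns the midpoint between the two closest points as the triangulated position, and the remaining gap between the rays as an error measure:

public final class RayIntersector {

    /** The estimated 3D point and the gap between the rays there. */
    public static final class Result {
        public final double[] point; // midpoint of the two closest points
        public final double error;   // distance between the rays at closest approach
        Result(double[] p, double e) { point = p; error = e; }
    }

    /** Closest approach of rays o1 + t*d1 and o2 + s*d2. */
    public static Result closestApproach(double[] o1, double[] d1,
                                         double[] o2, double[] d2) {
        double[] w = sub(o1, o2);
        double a = dot(d1, d1), b = dot(d1, d2), c = dot(d2, d2);
        double d = dot(d1, w),  e = dot(d2, w);
        double denom = a * c - b * b;              // near zero for parallel rays
        double t = Math.abs(denom) < 1e-12 ? 0 : (b * e - c * d) / denom;
        double s = Math.abs(denom) < 1e-12 ? e / c : (a * e - b * d) / denom;
        double[] p1 = add(o1, scale(d1, t));       // closest point on ray 1
        double[] p2 = add(o2, scale(d2, s));       // closest point on ray 2
        return new Result(scale(add(p1, p2), 0.5), dist(p1, p2));
    }

    private static double dot(double[] u, double[] v) { return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]; }
    private static double[] sub(double[] u, double[] v) { return new double[]{ u[0]-v[0], u[1]-v[1], u[2]-v[2] }; }
    private static double[] add(double[] u, double[] v) { return new double[]{ u[0]+v[0], u[1]+v[1], u[2]+v[2] }; }
    private static double[] scale(double[] u, double k) { return new double[]{ u[0]*k, u[1]*k, u[2]*k }; }
    private static double dist(double[] u, double[] v) { double[] w = sub(u, v); return Math.sqrt(dot(w, w)); }
}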
Figure 5: Side View.
Figure 6: Finding the 3D point.
4.2 Multiple Points
The algorithm described above senses one point.
However, reading two points introduces ambiguity
when both points lie in the same y plane. The problem
occurs because the Wii Remotes transmit only the co-
ordinates of the infrared points. This means that there
are no surrounding visual aids to help distinguish be-
tween the two points. Because of this limited infor-
mation, it can be impossible to tell which point from
the second Wii Remote corresponds to the point seen
by the first Wii Remote in certain situations.
When the points are not in the same y plane the ray
collision error can be used to match the correct points.
To distinguish between these points both possibilities
are tested and the pair with the smaller error is used.
This method does not work when the points are in the
same y plane because both pairs appear to be valid,
as shown in Figure 7. We leverage the knowledge
that the leftmost point on the first camera should be
paired with the leftmost point on the second camera
in most situations. When we cannot distinguish the
points using the above methods, we use the fact that
the Wii Remotes return the points that they detect in
the same order throughout a session, and assume that
they have not changed. This can be a problem if a
camera stops sensing points and then flips the order in
which it senses them. In practice, however, these issues
with multiple points remain largely unnoticed because
the points are constantly being updated.
Figure 7: Lines intersecting incorrectly.
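For illustration, the pairing test might look like the following sketch, which reuses the RayIntersector listing from Section 4.1 (the method name and array layout are our own):

/** Picks the point pairing with the smaller total ray gap. Illustrative
 *  only; when the two error sums are nearly equal (points in the same
 *  y plane), NuWii falls back to the ordering heuristics above. */
static RayIntersector.Result[] matchPoints(
        double[] leftOrigin,  double[][] leftDirs,   // two rays, left camera
        double[] rightOrigin, double[][] rightDirs)  // two rays, right camera
{
    // Pairing A: left[0] with right[0], left[1] with right[1].
    RayIntersector.Result a0 = RayIntersector.closestApproach(
            leftOrigin, leftDirs[0], rightOrigin, rightDirs[0]);
    RayIntersector.Result a1 = RayIntersector.closestApproach(
            leftOrigin, leftDirs[1], rightOrigin, rightDirs[1]);
    // Pairing B: left[0] with right[1], left[1] with right[0].
    RayIntersector.Result b0 = RayIntersector.closestApproach(
            leftOrigin, leftDirs[0], rightOrigin, rightDirs[1]);
    RayIntersector.Result b1 = RayIntersector.closestApproach(
            leftOrigin, leftDirs[1], rightOrigin, rightDirs[0]);
    // Keep whichever pairing explains the observations with less error.
    return (a0.error + a1.error <= b0.error + b1.error)
            ? new RayIntersector.Result[] { a0, a1 }
            : new RayIntersector.Result[] { b0, b1 };
}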
5 GESTURES
Our implementation of gesture recognition is de-
signed to be easily expandable. This is accomplished
by using two levels of gestures: basic gestures and
composite gestures. Basic gestures are simple mo-
tions and movements of the user’s hands that are de-
tected within our stereo-vision algorithm. We created
a gesture interface in Java that can be implemented by
any class that needs to detect these basic gestures. The
second level of gestures is composite gestures. Com-
posite gestures are combinations of basic gestures that
can be used for more complex input. These higher-
level gestures are implemented by the class that uses
our interface and are more application specific than
basic gestures.
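To make the two-level design concrete, a stripped-down version of such an interface might look as follows (the names are our own illustration, not the actual NuWii source):

/** Receives basic gestures emitted by the stereo-vision layer. A class
 *  implementing this interface defines its own composite gestures. */
public interface BasicGestureListener {
    void onPinch(double[] centerPoint);           // two points merged into one
    void onUnpinch();                             // a pinched point separated again
    void onSwipe(Axis axis, int direction);       // direction is +1 or -1
    void onPointsMoved(double[][] currentPoints); // continuous position updates

    enum Axis { X, Y, Z }
}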
5.1 Basic Gestures
There are two categories of basic gestures imple-
mented at this time: pinch and swipe. A pinch gesture
(Figure 8(a)) is activated when the two points seen by
the cameras move close enough together that they ap-
pear to be one point. There is also an unpinch gesture
that is activated when a pinched point separates back
into two points. An unpinch gesture can only be de-
tected after a pinch has taken place, which keeps the
cameras from falsely identifying two unrelated points
as an unpinch gesture. The other basic gesture, swipe,
is activated when the points move a set distance in any
dimension (Figure 8(b)). The movement of either one
or two points is tracked depending on how many the
cameras see. Tracking any number of points allows a
swipe to be detected regardless of the pinching state.
Swipes can be detected in both positive and negative
directions in all three dimensions, for a total of six
different swipes that can be detected and used for
composite gestures.
(a) Pinch. (b) Swipe. (c) Rotate.
Figure 8: Various Gestures.
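As an illustration of these rules, a minimal pinch/unpinch detector might look as follows (the structure and names are our own; a swipe detector comparing point positions against a per-axis distance threshold would follow the same pattern):

/** Fires pinch/unpinch events from the number of triangulated points. */
final class PinchDetector {
    private int lastCount = 0;
    private boolean pinched = false;

    /** Feed one frame of triangulated points. */
    void update(double[][] points, BasicGestureListener listener) {
        if (!pinched && lastCount == 2 && points.length == 1) {
            pinched = true;                 // two points merged into one
            listener.onPinch(points[0]);
        } else if (pinched && points.length == 2) {
            pinched = false;                // unpinch is only valid after a pinch
            listener.onUnpinch();
        }
        lastCount = points.length;
    }
}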
5.2 Composite Gestures
By combining the basic gestures discussed above and
the current position of the points, application specific
gestures can be created. These composite gestures can
be very simple, using just one basic gesture to acti-
vate some sort of onscreen movement, or as compli-
cated as necessary, making use of several gestures in
sequence. A rather conventional application would be
to map the movement of the points to cursor position
and the pinch/unpinch to a click. The gestures im-
plemented for Spiegel, described in detail in the fol-
lowing sections, provide an example of more complex
composite gestures.
6 NUWII AND SPIEGEL
The Spiegel framework was developed to visualize
large multidimensional astrophysical data. It is de-
signed according to the UNIX paradigm of pipes and
small utilities that do one thing and do it well (Bischof
et al., 2006). These small utilities are called ”func-
tions” by Spiegel developers. Once the Spiegel GUI is
loaded, the user chooses the functions that they want
to use from the menu and the functions appear on
screen as boxes with incoming and outgoing arrows.
For example, in order to display a set of simulation
data in 3D, the user would import five boxes: one
to import the data file from the file system, one that
extracts the stars from the data, one that converts the
star data into a format that Java3D can understand,
one that determines camera parameters, and finally a
display window for the image. Our team wrote two
new functions for Spiegel. One of these, named Wi-
imoteControl, connects the Wii Remotes to the com-
puter and interprets the data read from the Wii Re-
motes as camera coordinates. The other, 3DPointDis-
play, shows the points from the Wii Remotes in 3D
space and is used for debugging the system if prob-
lems arise.
6.1 Camera Control in Spiegel
Within Spiegel, three composite gestures were imple-
mented to control the camera position. These gestures
are used to enter different camera movement modes.
Each gesture starts with a pinch followed by a swipe.
The pinch sets a center point to be used as a reference
for movement in each camera movement mode. The
direction of the swipe determines which mode is acti-
vated. The user can easily exit each movement mode
by unpinching their fingers. Using these gestures the
user can view the simulation from any angle.
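For illustration, the mode-entry logic might be sketched as follows, using the listener interface from Section 5. The mapping from swipe directions to modes follows the next paragraphs; the axis and sign conventions are our own assumptions:

/** Enters a camera movement mode on pinch + directional swipe and exits
 *  on unpinch. Illustrative sketch, not the actual NuWii source. */
final class CameraModeController implements BasicGestureListener {
    enum Mode { NONE, HORIZONTAL_ROTATE, VERTICAL_ROTATE, ZOOM }

    private Mode mode = Mode.NONE;
    private boolean pinched = false;
    private double[] center;              // reference point set by the pinch

    @Override public void onPinch(double[] centerPoint) {
        pinched = true;
        center = centerPoint.clone();     // remember the reference point
    }

    @Override public void onUnpinch() {
        pinched = false;
        mode = Mode.NONE;                 // unpinching exits the current mode
    }

    @Override public void onSwipe(Axis axis, int direction) {
        if (!pinched || mode != Mode.NONE) return;
        // Right swipe -> horizontal rotation; swipe toward the user ->
        // vertical rotation; downward swipe -> zoom (signs are assumed).
        if (axis == Axis.X && direction > 0)      mode = Mode.HORIZONTAL_ROTATE;
        else if (axis == Axis.Z && direction < 0) mode = Mode.VERTICAL_ROTATE;
        else if (axis == Axis.Y && direction < 0) mode = Mode.ZOOM;
    }

    @Override public void onPointsMoved(double[][] points) {
        // Update the spherical camera coordinates according to 'mode' and
        // the offset of points[0] from 'center' (omitted in this sketch).
    }
}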
The best way for the user to remain oriented when
viewing a simulation in Spiegel is to have the camera
constantly pointing towards the origin. The simplest
way to achieve this behavior is to use a spherical co-
ordinate system for the camera position. Changes in
the inclination and azimuth angles correspond to ver-
tical and horizontal rotation respectively and chang-
ing the radial distance acts as a zooming function (see
Figure 9). In NuWii, each camera movement mode
changes the value of one of the spherical coordinates.
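Converting the spherical coordinates to the Cartesian position handed to the renderer is the standard transformation; for example, assuming a y-up frame with the inclination measured from the positive y axis:

/** Spherical (radius, inclination, azimuth) to a Cartesian camera
 *  position; the camera is then aimed back at the origin. */
static double[] cameraPosition(double radius, double inclination, double azimuth) {
    double x = radius * Math.sin(inclination) * Math.cos(azimuth);
    double y = radius * Math.cos(inclination);
    double z = radius * Math.sin(inclination) * Math.sin(azimuth);
    return new double[] { x, y, z };
}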
The three camera movement modes are horizon-
tal rotation, vertical rotation, and zoom. Horizontal
rotation is activated by pinching and swiping to the
right. The user then rotates the pinched point around
the reference point, moving the camera horizontally
around the model (see Figure 8(c)). Swiping back-
wards (towards the user) after pinching triggers the
vertical rotation mode. After swiping, the user can
move his hand up or down to rotate the camera verti-
cally around the model. Horizontally, the camera can
be rotated around the model indefinitely, but vertical
rotation is capped at positive and negative 90 degrees.
This restriction keeps the user from moving the cam-
era over the model, which would make the view up-
side-down. The zoom control is activated by pinch-
ing and then swiping down. Once in the zoom mode,
the distance between the center point and the current
point is used to scale the zoom speed. If the current
point is at the center point set by the pinch, then the
camera will be stationary. When the current point is
in front of the center point, then the camera will zoom
in. Similarly, if the current point is behind the center
point, then the camera will zoom out.
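A sketch of the zoom update implied by this behavior follows; the signed depth offset of the current point from the pinch center scales the rate of change of the radial distance. The gain constant, the minimum radius, and the choice of depth axis are our own assumptions:

/** One zoom-mode update step: points in front of the pinch center shrink
 *  the radius (zoom in); points behind it grow the radius (zoom out). */
static double updateRadius(double radius, double[] center, double[] current,
                           double dtSeconds) {
    final double GAIN = 0.5;                 // assumed zoom speed per unit offset
    double offset = current[2] - center[2];  // assumes z is the depth axis
    return Math.max(0.1, radius - GAIN * offset * dtSeconds);
}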
Figure 9: Camera in relation to the scene.
7 CONCLUSIONS AND FUTURE
WORK
There are several possible directions for future work.
Currently, our software recognizes two basic gestures
and three composite gestures, which are identified us-
ing two points of IR input. Future contributors could
design and implement more gestures, which would
expand control over Spiegel significantly. The expan-
sion of the gesture library could be aided by track-
ing more than two points at a time. This would re-
quire more advanced trigonometry, additional Wii Re-
motes, and/or different wavelengths of IR light. Cur-
rently, the Wii Remote cameras must be placed in
close approximation to the orientation specified in
software in order for the gesture recognition code to
work correctly. A camera calibration
method could be written, allowing the Wii Remotes
to be placed at any angle and any distance apart. Re-
search is also needed to quantify the differences and
advantages of using a 3D gesture system over a
traditional system.
In developing our system, we experimented with
different gestures and hand motions; however, more
formal investigations are necessary to determine
which gestures are most natural to our users. The
NuWii system is intended to be a starting point, where
more gestures can be added to the library so that more
end user applications can be supported. While our
gestures work acceptably for the described applica-
tions, other applications will have their own needs for
specific gestures and settings.
Further research into the way that the Wii Remote
connects to the computer via Bluetooth would also
be helpful, since we noted that other Bluetooth de-
vices occasionally caused interference. Additionally,
the error in sensing the points could be reduced by
the addition of another Wii Remote placed above the
first two. This would help reduce the error in sensing
correct points as well as eliminate error when the
wrong points are matched.
In this paper we have introduced NuWii, a work-
ing gesture-based interface for the Spiegel visual-
ization framework. We have explained our track-
ing algorithm, and described the gestures that we
have implemented thus far. Our system is capable
of tracking gestures in 3D, and our source code is avail-
able to the public under the GNU Public License at
nuwii.googlecode.com. The input device can be repli-
cated using less than $150 worth of hardware.
REFERENCES
Bak, A. (2004). 3D input devices for the grapecluster
project. Independent study report, Rochester Institute
of Technology.
Bischof, H.-P. (2010). The Spiegel Project. http://spiegel.cs.rit.edu/~hpb/grapecluster/Spiegel/index.html.
Bischof, H.-P., Dale, E., and Peterson, T. (2006). Spiegel
- a visualization framework for large and small scale
systems. In Arabnia, H. R., editor, Proceedings of
the 2006 International Conference on Modeling, Sim-
ulation and Visualization Methods, pages 199–205.
CSREA Press.
Cuypers, T., Van den Eede, T., Ligot, S., and Francken, Y.
(2009). Stereowiision: Stereo vision with Wiimotes.
In Proceedings of 3D Stereo MEDIA 2009.
Dehling, E. (2008). Using multiple Wiimote cameras to con-
trol a game. Master’s thesis, University of Twente.
Hay, S., Newman, J., and Harle, R. (2008). Optical track-
ing using commodity hardware. In Proceedings of the
7th IEEE/ACM International Symposium on Mixed
and Augmented Reality, ISMAR ’08, pages 159–160,
Washington, DC, USA. IEEE Computer Society.
Lee, J. C. (2008). Hacking the Nintendo Wii Remote. IEEE
Pervasive Computing, 7:39–45.
Underkoffler, J. (2010). John Underkoffler points to the fu-
ture of UI. Talk at TED 2010.
Wang, D. and Huang, D. (2008). Low-cost motion captur-
ing using Nintendo Wii Remote controllers. CSC2228
project report, University of Toronto, Toronto, Ontario
M5S 3G4.
Wang, R. Y. and Popović, J. (2009). Real-time hand-
tracking with a color glove. In SIGGRAPH ’09: ACM
SIGGRAPH 2009 papers, pages 1–8, New York, NY,
USA. ACM.
Wii Brew (2010). Wiimote. http://wiibrew.org/wiki/Wiimote.