Perceptually Realistic Depiction of Human Figures
Martin Zavesky, Jan Wojdziak, Kerstin Kusch, Daniel Wuttig, Ingmar S. Franke and Rainer Groh
Institute of Software and Multimedia Technology, Department of Computer Science
Technische Universität Dresden, Dresden, Germany
Keywords: Multi-perspective, spatial perception, non-photorealistic rendering, perceptually-based rendering.
The projection of three-dimensional space onto a two-dimensional surface relies on the computer graphics
camera, whose design is based on the camera obscura. Geometrical limitations of this model lead to
perspective distortions in wide-angle projections. Our approach extends the camera model by taking human
perception into account in order to create a realistic spatial impression in a two-dimensional image. The
aim is to provide human-centered interfaces for an efficient and coherent communication of spatial
information in virtual worlds, and thus to support avatar-mediated interaction, which requires a correct
depiction of human figures with respect to proportion and orientation. To this end, we explain an
object-based and introduce a camera-based computer graphics procedure to prevent projective distortions
and misalignments.
Representations of individuals have become an essential
feature in a wide range of applications. Users represented
by their avatars are connected by broadband networks in
virtual environments, primarily in the field of
entertainment, such as three-dimensional communication
tools and massively multiplayer online role-playing games
(MMORPGs). Avatars are graphical equivalents of users
(Capin et al., 1999). They are realistic-looking
anthropomorphic shapes that convey the communicational and
social intentions of the users (Fu et al., 2008). Avatars
have to be realistic not only in reproducing the appearance
of the user but also in reproducing his or her behavior
(Bailenson et al., 2006).
Our consideration is focused on providing inter-
faces that assist the user in perceiving spatial informa-
tion as effectively and efficiently as possible (Jokela
et al., 2003). A necessary requirement to achieve
this aim is a perceptually realistic projection of three-
dimensional scenes. Perceptual realism means an in-
tegration of characteristics of human perception into
the imaging process and the image (Groh et al., 2006).
A common way of projecting objects onto an image plane is
perspective projection using the computer graphics camera.
In this model the observer and his or her perception are
disregarded. A projection that relies on visual perception
rather than solely on geometrical principles might prevent
ambiguity and misinterpretation (Franke et al., 2008).
The computer graphics camera model maps virtual
three-dimensional space onto a two-dimensional sur-
face. The transformation of virtual space onto the
image plane of the camera follows the laws of per-
spective projection (Foley, 1999; Angel, 1997). The
computer graphics camera model (figure 1) contains a
view frustum as well as the center of projection [C],
the viewing direction [V] and the up-vector [U]. The
view frustum is defined by the aspect ratio of the
projection plane and by the near [N] and far [F] clipping
planes. The intersection of the optical axis (viewing
direction) and the image plane is the principal vanishing
point of the image and corresponds to the intersection of
the horizon and the sagittal line in the image
(Franke et al., 2006).
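As a minimal sketch of the mapping this camera model performs, the following Python function projects a point given in camera coordinates onto the image plane. The function name `project_point`, the convention that the camera looks along +z, and the image-plane distance `d` are illustrative assumptions and not part of the model description above.

```python
def project_point(point, d=1.0, near=0.1, far=100.0):
    """Project a camera-space point (x, y, z) onto the image plane z = d.

    Assumptions for this sketch: the center of projection [C] is the
    origin, the viewing direction [V] is +z and the up-vector [U] is +y.
    Returns None if the point lies outside the near [N] / far [F] range.
    """
    x, y, z = point
    if not (near <= z <= far):
        return None  # clipped by the near or far plane
    # Similar triangles: image coordinates scale with d / z.
    return (d * x / z, d * y / z)


# A point on the optical axis lands on the principal vanishing point (0, 0).
on_axis = project_point((0.0, 0.0, 5.0))
# Doubling the depth halves the image coordinates.
p_near = project_point((1.0, 2.0, 4.0))
p_far = project_point((1.0, 2.0, 8.0))
```

Every object in the scene passes through the same mapping, which is the uniform treatment that the multi-perspective procedures later relax.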
Figure 1: The common computer graphics camera model
consists of the center of projection [C], the viewing
direction [V] and the up-vector [U] as well as the view
frustum with near clipping plane [N] and far clipping
plane [F].
Zavesky M., Wojdziak J., Kusch K., Wuttig D., S. Franke I. and Groh R..
AN INDIVIDUAL PERSPECTIVE - Perceptually Realistic Depiction of Human Figures.
DOI: 10.5220/0003317103130319
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2011), pages 313-319
ISBN: 978-989-8425-47-8
2011 SCITEPRESS (Science and Technology Publications, Lda.)
The projection yields a perspective view along the viewing
direction of the camera and treats all objects in the image
uniformly. As a direct consequence, objects located near
the edges of the view frustum and close to the image plane
are stretched and misaligned. Especially if objects are
bounded by a closed curved (convex) surface and are
projected with a wide aperture angle, this stretching is
perceived as unnatural and distorted (Franke et al., 2008;
Yankova and Franke, 2008). This effect can be clearly
observed in spheres, columns or human bodies, as shown in
figure 4 a (Groh, 2005; Ware, 1900). In addition, the human
figures shown there seem to possess an orientation that
does not correspond to the expectation of the observer.
These issues arise from a limitation of the standard camera
model: only a single mathematical center of projection is
defined, and therefore only one principal vanishing point
exists. Accordingly, computer generated images are
typically mono-perspective.
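The stretching of convex objects described above can be illustrated numerically. The sketch below samples the cone of rays tangent to a sphere and projects it onto the image plane. The function name `sphere_image_extents`, the coordinate conventions (camera at the origin looking along +z, image plane at z = d) and the sampling approach are assumptions made for illustration only.

```python
import math


def sphere_image_extents(theta_deg, radius, distance, d=1.0, samples=3600):
    """Return (radial, tangential) extents of a sphere's image.

    The sphere center lies `distance` away from the center of projection,
    `theta_deg` degrees off the optical axis (+z) in the x-z plane.
    The silhouette is found by sampling the cone of rays tangent to the
    sphere and projecting each ray onto the image plane z = d.
    """
    theta = math.radians(theta_deg)
    alpha = math.asin(radius / distance)  # half-angle of the tangent cone
    xs, ys = [], []
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        # Ray direction on the tangent cone around the center direction.
        dx = (math.cos(alpha) * math.sin(theta)
              + math.sin(alpha) * math.cos(t) * math.cos(theta))
        dy = math.sin(alpha) * math.sin(t)
        dz = (math.cos(alpha) * math.cos(theta)
              - math.sin(alpha) * math.cos(t) * math.sin(theta))
        xs.append(d * dx / dz)
        ys.append(d * dy / dz)
    return max(xs) - min(xs), max(ys) - min(ys)


# Same sphere, once on the optical axis and once 50 degrees off-axis.
r_on, t_on = sphere_image_extents(0.0, radius=0.5, distance=10.0)
r_off, t_off = sphere_image_extents(50.0, radius=0.5, distance=10.0)
stretch_on = r_on / t_on      # circular image
stretch_off = r_off / t_off   # elongated towards the image border
```

For a small sphere the ratio of radial to tangential extent approaches 1 / cos(theta), so the image is circular on the optical axis and increasingly elliptical towards the edges of a wide-angle view.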
The depiction of human figures has a long tradition in fine
arts. The artists of the Renaissance in particular strove
to analyze the proportions and orientations of human shapes
in order to create rules for an aesthetic depiction.
Analyzing Renaissance paintings reveals that artists were
faced with the same challenge as current computer graphics:
to map a three-dimensional scene realistically onto a
two-dimensional plane (Groh, 2005; Hockney, 2006; Ware,
1900). There is, however, a difference between paintings
and computer generated images: the “device” for creating
projections. A computer system calculates an image
according to geometrical rules of projection. Artists
integrate a “human factor” into these rules because they
construct paintings based on their own (human) visual
impression (Yankova and Franke, 2008). To analyze the
“human factor” and its applicability in computer generated
images, the painting “The Tribute Money” (figure 2) is
discussed.
Figure 2: The Tribute Money, Tommaso di Ser Cassai
(Masaccio), 1425-1428.
Figure 3: Sketch of “The Tribute Money”, Tommaso di Ser
Cassai (Masaccio), with visualized horizontal [H] and
sagittal [S] lines as well as the principal vanishing point
[P] and the additional principal vanishing points of
persons [A] and [B].
As seen in figure 3, the scene is constructed by the rules
of perspective projection. The investigation of the
perspective structure reveals that the construction of the
building in the right part of the painting depends on the
principal vanishing point [P] of the whole scene. The
shapes of all figures appear undistorted, and the figures'
proportions are kept constant. Each figure is shown in an
individual perspective with a corresponding additional
principal vanishing point (Franke et al., 2007). As an
example, the additional principal vanishing points of
persons [A] and [B] are shown in figure 3. In contrast to
images rendered by the computer graphics camera, their
proportions are regular regardless of their distance to the
principal vanishing point [P] as a result of the
multi-perspective visualization (Groh et al., 2006;
Hockney, 2006). A virtual reconstruction with an angle of
view of the computer graphics camera of 120 degrees, which
is comparable to that of Masaccio's painting “The Tribute
Money”, illustrates the differences between
mono-perspective and multi-perspective images. To ensure
the comparability of the human shapes, all figures are
based on the same three-dimensional model. The lateral
figures [A] and [B] in figure 4 a in particular are
affected by distortions that result from the projection
process.
Hence, an essential feature of Renaissance fine arts is the
use of multi-perspective projection methods in paintings to
depict perceptually realistic human figures.
VISAPP 2011 - International Conference on Computer Vision Theory and Applications
Figure 4: Virtual reconstruction of “The Tribute Money”
(a) rendered with the standard computer graphics camera,
(b) using OBPC on the colored human figures, (c) using CBPC
on the colored human figures. The persons are labeled [A],
[B], [C] and [D].
The previous section identified multi-perspective imaging
as a solution for creating perceptually realistic images.
Applying it in computer graphics requires a modification of
the standard computer graphics camera model, because a
single viewing transformation cannot fulfill the
requirements of multi-perspective imaging (cf. Section 2).
Accordingly, individual objects of a computer graphics
scene have to be visualized in their individual
perspective.
4.1 Related Work
In multi-perspective images, the spatiality of a
three-dimensional scene should be preserved with regard to
the proportions of objects. Spherical objects have to be
depicted as circles, and the edges of cubic objects should
remain straight. The approach of (Zorin and Barr, 1995)
reduces geometric distortions in computer-generated and
photographic images by constructing viewing transformations
in a post-processing stage of the rendering process.
However, this procedure cannot visualize single objects in
their individual perspective. (Carroll et al., 2009) use
manipulations on the image plane to preserve shapes and
maintain straight lines of a scene that are marked by the
user. The approach requires manual input and cannot be used
in virtual worlds. (Zelnik-Manor et al., 2005) developed a
procedure to construct panoramas from images taken from a
single viewpoint with no noticeable distortions in
background and objects, but it is not applicable to
discrete objects of a scene. Another way to achieve
multi-perspective images is to extend the camera model by
using a multi-projection rendering algorithm with multiple
cameras (Singh, 2002; Yu and McMillan, 2004). The final
image is a composition of these camera views. The main
focus of the procedures of (Singh, 2002) and (Yu and
McMillan, 2004) is on multi-perspective panoramas and
artistic multi-perspective rendering rather than on
perceptually realistic imaging. (Agrawala et al., 2000)
developed a framework for rendering multi-perspective
images from three-dimensional models based on spatially
varying projections. Like the approach of (Carroll et al.,
2009), it is not fully automatic. The concept of (Coleman
and Singh, 2004) also operates with several cameras. In
contrast to the previous approach, however, the image is
rendered by a single boss camera, while the other cameras,
called lackeys, define a deformation of the scene objects.
That approach addresses spatial scene coherence, shadows
and illumination, but not self-contained objects. In
(Franke et al., 2007) we showed one approach to visualize
objects in their own perspective: a direct manipulation of
the geometry is used to visualize objects in their
individual perspective with the standard computer graphics
camera model.
4.2 Object-based Perspective Correction (OBPC)
In (Franke et al., 2007) we presented an algorithmic
solution implementing our concept of multi-perspective
imaging. The approach belongs to the class of geometrical
manipulations and has already been applied to abstract
objects such as spheres and columns. In the following, the
algorithm is applied to anthropomorphic shapes in the
context of communication in virtual worlds. It is referred
to as Object-Based Perspective Correction to distinguish it
from the Camera-Based Perspective Correction described in
the next subsection. The object-based correction algorithm
performs affine transformations to modify the geometry of
objects directly. These objects (avatars) are visualized in
such a way that their anthropomorphic shapes appear
undistorted on the image plane. The procedure makes it
possible to selectively modify only those objects that are
affected by perspective distortion. A significant advantage
of this approach is that the standard computer graphics
camera model can be used to create images. It is not
necessary to render multiple views, as is done in
image-based solutions (Agrawala et al., 2000; Zorin and
Barr, 1995). To imitate the multi-perspective approach of
painters, rotation and shear transformations are applied to
the object geometry [O] (cf. figure 5). This results in a
modified depiction [D] of the object on the image plane. It
is a pre-rendering concept that depends on the spatial
constellation of camera and object and counteracts the
perspective distortions. Consequently, the camera system
requires only one center of projection [C] and one viewing
direction [V] for multi-perspective imaging. The main steps
of the algorithm are summarized as follows (cf. (Franke et
al., 2007)):
1. Specify the pivot point of the object in the local
camera coordinate system.
2. Compute the shear factors from that relative position.
3. Compute the rotation angles.
4. Rotate the object around the x- and y-axis according
to the rotation angles.
5. Shear the object with a shear matrix based on the
computed shear factors.
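The structure of these five steps can be sketched in Python as a rotation followed by a shear, both applied about the pivot point. The formulas for the shear factors and rotation angles are not given in this text, so the ones used below are placeholder assumptions chosen only to make the sketch runnable. The function names `obpc_matrix` and `obpc_apply` and the axis conventions (camera at the origin, looking along +z) are hypothetical as well.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def obpc_matrix(pivot):
    """Build a rotation-plus-shear matrix for an object at `pivot`.

    `pivot` is the object's pivot point in camera coordinates (step 1).
    ASSUMPTION: the shear factors and rotation angles below are
    illustrative choices, not the formulas of the published algorithm.
    """
    px, py, pz = pivot
    # Step 2 (assumed): shear factors from the relative position.
    sx, sy = -px / pz, -py / pz
    # Step 3 (assumed): angles that turn the object towards the camera.
    angle_y = math.atan2(px, pz)
    angle_x = -math.atan2(py, math.hypot(px, pz))
    # Step 4: rotation around the x- and y-axis.
    cy, sn_y = math.cos(angle_y), math.sin(angle_y)
    cx, sn_x = math.cos(angle_x), math.sin(angle_x)
    rot_y = [[cy, 0.0, sn_y], [0.0, 1.0, 0.0], [-sn_y, 0.0, cy]]
    rot_x = [[1.0, 0.0, 0.0], [0.0, cx, -sn_x], [0.0, sn_x, cx]]
    # Step 5: shear x and y proportionally to depth z.
    shear = [[1.0, 0.0, sx], [0.0, 1.0, sy], [0.0, 0.0, 1.0]]
    return mat_mul(shear, mat_mul(rot_x, rot_y))


def obpc_apply(vertex, pivot):
    """Transform one vertex about the pivot so the pivot stays in place."""
    m = obpc_matrix(pivot)
    rel = [v - p for v, p in zip(vertex, pivot)]
    out = [sum(m[i][k] * rel[k] for k in range(3)) for i in range(3)]
    return tuple(o + p for o, p in zip(out, pivot))


# An object on the optical axis is left unchanged by this sketch.
unchanged = obpc_apply((1.0, 2.0, 6.0), pivot=(0.0, 0.0, 5.0))
# The pivot of an off-axis object keeps its position.
fixed_pivot = obpc_apply((3.0, 1.0, 5.0), pivot=(3.0, 1.0, 5.0))
```

Because the transformation acts on the geometry before rendering, the unmodified camera can be used afterwards. Keeping shear and rotation as separate matrices also mirrors the point that both steps can be adjusted independently.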
Figure 5: Transformations of rotation and shear by the
OBPC. The depiction [D] of one object [O] has the same size
on the image plane [I] as the depiction of a second object
located on the viewing direction [V] originating from the
center of projection [C], while the depiction of a third
object covers a larger region on the image plane.
A multi-perspective image of the reference scene based on
the OBPC is shown in figure 4 b. The colored human figures
were processed by the algorithm described above. The
peripheral avatars appear undistorted and correctly aligned
owing to their individual additional principal vanishing
points. The resulting image “imitates” the perceived
outcome of natural human viewing behaviour. Correcting
single objects embeds further perspectives in an originally
mono-perspective scene view.
4.3 Camera-based Perspective Correction (CBPC)
The rendering process allows the manipulation of
the projection plane as another method to generate
perception-adapted images. The aim is to offer an al-
ternative procedure to generate perceptually realistic
images based on multi-perspective. This is realized
by the usage of several cameras combined according
to predefined rules. Multiple cameras at the same po-
sition with different orientations simulate the succes-
sive shift of human visual attention. The algorithm
creates a camera-framework consisting of a system
camera and several object cameras as shown in fig-
ure 6.
Figure 6: Principle of the CBPC. The system camera renders
the whole scene excluding the object [O] attached to the
object camera. The viewing direction [V] of the object
camera is aligned towards its associated object. For size
constancy, the image plane of the object camera is shifted
towards the object up to the intersection [S] of the
viewing direction [V] and the image plane of the system
camera.
Figure 7: Composition of the image planes of system and
object cameras (panels: image of object camera, image of
system camera, resulting image).
For the creation of images attuned to visual perception it
is necessary to synchronize the properties of the object
cameras and the system camera. When the system camera is
translated in the three-dimensional scene, the positions of
the object cameras are shifted along with the position of
the system camera. In contrast, the viewing direction of
each object camera stays aligned with its assigned object.
The final image is composed of the rendered frames of each
camera, ordered by the scene depth of the corresponding
objects. As shown in figure 7, the image of the object
camera is placed on the final image plane at the position
it would have on the image plane of the system camera. Due
to the different viewing directions, the distance between
the object and the image plane of the object camera is
larger than the distance between the object and the image
plane of the system camera. Therefore, the projected object
covers a smaller region on the image plane of the object
camera than on the respective plane of the system camera.
To preserve the real size of the imaged object, the
projection plane of the attached camera is shifted towards
the object up to the intersection [S] of its optical axis
and the image plane of the system camera (see figure 6).
When the system camera is translated or rotated, every
object camera must be updated with respect to the position
of the system camera in virtual space, its viewing
direction, and the order and position of its rendered frame
in the final image. The main steps of the algorithm are
summarized as follows:
1. Create an object camera and assign an object.
2. Align the viewing direction of the object camera to
the pivot point of the object.
3. Translate the image plane towards the object.
4. Calculate the original position of the object on the
system image plane in the screen coordinate system.
5. Translate the image to the position on the final image.
6. Sort the image planes by depth.
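The geometry behind these steps can be sketched as follows. The sketch assumes a system camera at the origin that looks along +z with its image plane at z = `d_system`. The names `object_camera` and `compose_order`, the dictionary layout, and the use of the Euclidean distance as scene depth are hypothetical choices made for illustration.

```python
import math


def object_camera(pivot, d_system=1.0):
    """Derive object-camera parameters for an object at `pivot`.

    Assumptions for this sketch: the system camera sits at the origin,
    looks along +z and has its image plane at z = d_system; `pivot` is
    given in that coordinate system. Both cameras share the position.
    """
    px, py, pz = pivot
    dist = math.sqrt(px * px + py * py + pz * pz)
    # Step 2: viewing direction aligned with the pivot point.
    view_dir = (px / dist, py / dist, pz / dist)
    # Step 3: move the image plane along the viewing direction up to its
    # intersection [S] with the system image plane z = d_system.
    d_object = d_system / view_dir[2]
    # Steps 4 and 5: position of the object on the system image plane,
    # which is where the rendered layer is placed in the final image.
    screen_pos = (d_system * px / pz, d_system * py / pz)
    return {"view_dir": view_dir, "d_object": d_object,
            "screen_pos": screen_pos, "depth": dist}


def compose_order(cameras):
    """Step 6: order layers back to front so nearer objects are drawn last."""
    return sorted(cameras, key=lambda cam: cam["depth"], reverse=True)


cam_a = object_camera((3.0, 0.0, 4.0))   # off-axis object, 5 units away
cam_b = object_camera((0.0, 0.0, 2.0))   # on-axis object, 2 units away
layers = compose_order([cam_b, cam_a])   # cam_a (farther) comes first
```

Under these assumptions the shifted image plane lies at `d_system / cos(theta)`, where theta is the angle between the two viewing directions. An object at distance `dist` is then scaled by `d_object / dist = d_system / pz`, the same factor an on-axis object at depth `pz` would receive from the system camera, which is one way to read the size constancy mentioned above.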
A result of this multi-perspective rendering pro-
cess is visualized in figure 4 c. Each colored human
shape is rendered with its individual object camera
while the scene itself is visualized by the system cam-
era. The proportions of the human shapes are visu-
alized in correct relations. Accordingly, these com-
puter generated images are multi-perspective in order
to create a perceptually realistic spatial impression.
4.4 Comparison of Approaches
The common aim of both procedures is to provide
human-centered interfaces for an efficient and coherent
communication of spatial information (Jokela et al., 2003).
Both ways of creating perceptually realistic
multi-perspective images reduce perspective distortion. The
proportions and the orientations of human figures are
preserved (see figure 4 b and 4 c). We conducted a study to
identify the benefit of multi-perspective images. The focus
was on the difference between the original orientation of a
figure in a three-dimensional scene and its orientation as
perceived by the viewer.
Figure 8: Differences between Perspective Correction
procedures.
Figure 9: Details of the virtual reconstruction of “The
Tribute Money” (a+d) rendered with the standard computer
graphics camera, (b+e) using OBPC, (c+f) using CBPC.
First results show a divergence. This difference depends on
the figure's original position and orientation. Figure 10
shows a laterally located figure that is rotated by -30
degrees. The perceived orientation in a mono-perspective
image differs significantly from the orientation of the
figure in the scene. In contrast, the perceived orientation
in a multi-perspective image approximately matches the
actual orientation.
Figure 10: Divergence of the perceived orientation of
figures in mono- and multi-perspective images from the
orientation of the figure in the three-dimensional scene.
The following section discusses each Perspective Correction
procedure with regard to its application in interface
design. To investigate both procedures in a real-time
application, the algorithms were implemented in our
framework for experimental three-dimensional computer
graphics, called Bildsprache LiveLab, which is based on C++
and OpenGL. The OBPC algorithm uses an unmodified camera to
create multi-perspective images. The geometric
transformations are executed solely on scene objects. The
approach is also applicable in real-time applications.
Another advantage is the possibility of an unrestricted
geometrical transformation of objects: the shear and
rotation steps of the algorithm can be adjusted
independently. The adjustability of these transformation
steps can be useful for adapting the proportion and
orientation of avatars to the context. For example, some
scene configurations make it necessary to counteract
unwanted intersections of objects. This is illustrated by
the leg of the right human figure [B], which intersects the
stair in figure 9 e.
A further aspect is the orientation of a human figure
towards the image plane and other human figures. Comparing
the left person [C] in the original painting (figure 2)
with its perspective reconstruction (figure 4 a) reveals
that, besides the distortions, the orientation is falsified
by the perspective projection. A second analysis of
figure 4 b (in detail figure 9 b) shows that the
adjustability of the OBPC allows a viewer-oriented
depiction of person [C], just as in the painting by
Masaccio (see figure 2).
In contrast, the CBPC is more restrictive; its parameters
are less flexible. The functional dependency on the camera
system only permits an absolute adjustment. As a
consequence, this procedure is not sensitive to the
context: an adjustable manipulation of proportion and
orientation of human figures is not possible. Nevertheless,
this method makes it possible to correct the proportions of
human shapes in a perceptually realistic manner (Yankova
and Franke, 2008).
Compared with the object-based approach, the camera-based
algorithm causes considerably fewer object intersections.
The reason is that the scene remains geometrically
unmodified. For example, there is no intersection between
the leg of person [B] and the stair in the scene (cf.
figure 9 f). This is achieved by arranging several image
layers. The differences between the perspective correction
procedures are summarized in figure 8. The comparison shows
that both algorithms are appropriate for creating
perceptually realistic images similar to Renaissance
paintings such as “The Tribute Money”. Both procedures
complement each other in essential parts. They can be used
on different objects in virtual scenes simultaneously
without any interference.
In the present paper computer graphics procedures
to generate perceptually realistic images of three-
dimensional scenes were presented and compared.
Perceptual realism is achieved by creating multi-
perspective images based on the rules of perspective
projection enhanced by characteristics of visual per-
ception and techniques of Renaissance painting.
The scenario of visualizing humanoid figures, in which an
appropriate depiction of proportions and orientation is of
particular importance, was chosen to underline the benefits
of perceptually realistic images. Multi-perspective imaging
was identified as a way to emulate human viewing behavior.
A comparison of Renaissance painting and computer generated
images revealed an insufficiency in the depiction of
humanoid figures in computer graphics. Subsequently, two
methods to create multi-perspective images were introduced:
an Object-Based and a Camera-Based Perspective Correction.
Although these procedures differ in the way they resolve
distortion and misalignment, they can be used in virtual
worlds simultaneously without any interference to obtain a
more perceptually realistic result.
The general concept of Perspective Correction (Object- and
Camera-Based) can enhance all kinds of visualization
systems displaying human shapes and closed curved surfaces
in general. It is suitable for contexts that require a wide
visual range and a high degree of comparability of all
objects simultaneously. This is the case, for example, in
ergonomic simulations, video games or virtual training
simulations. However, the context of the scene is not yet
considered automatically. Our next step will be to adapt
the algorithms to the scene. This can be done by
semi-automatic adjustment of the parameters of the OBPC,
which may lead to perceptually realistic objects that fit
tightly into the context of the scene. For the CBPC it is
conceivable to shift the object cameras. In this way, a
customizability similar to that of the object-based
procedure could be achieved. As a result, the camera-based
procedure would become sensitive to the context.
Jan Wojdziak and Martin Zavesky thank the European
Social Fund (ESF) / the European Union and the Free
State of Saxony. Rainer Groh, Ingmar S. Franke and
Kerstin Kusch thank the Deutsche Forschungsgemeinschaft
(DFG) Wahrnehmungsrealistische Projektion
von dreidimensionalen Szenen (WaRP, DFG-GZ:GR
Agrawala, M., Zorin, D., and Munzner, T. (2000). Artistic
multiprojection rendering. In Proceedings of the Eu-
rographics Workshop on Rendering Techniques 2000,
pages 125–136.
Angel, E. (1997). Interactive Computer Graphics - A top-
down approach with OpenGL. Addison-Wesley.
Bailenson, J. N., Yee, N., Merget, D., and Schroeder, R.
(2006). The effect of behavioral realism and form re-
alism of Real-Time avatar faces on verbal disclosure,
nonverbal disclosure, emotion recognition, and cop-
resence in dyadic interaction. Presence: Teleopera-
tors & Virtual Environments, 15(4):359–372.
Capin, T., Pandzic, L., Thalmann, N. M., and Thalmann, D.
(1999). Avatars in Networked Virtual Environments.
John Wiley and Sons Ltd.
Carroll, R., Agrawala, M., and Agarwala, A. (2009). Opti-
mizing content-preserving projections for wide-angle
images. In SIGGRAPH’09:ACM SIGGRAPH 2009
papers, pages 1–9.
Coleman, P. and Singh, K. (2004). Ryan: rendering your
animation nonlinearly projected. In NPAR ’04: Pro-
ceedings of the 3rd international symposium on Non-
photorealistic animation and rendering, pages 129–
Foley, J. D. (1999). Computer Graphics-Principles and
Practice. Addison-Wesley.
Franke, I. S., Pannasch, S., Helmert, J. R., Rieger, R.,
Groh, R., and Velichkovsky, B. M. (2008). Towards
attention-centered interfaces: An aesthetic evaluation
of perspective with eye tracking. ACM Trans. Multi-
media Comput. Commun. Appl., 4:1–13.
Franke, I. S., Zavesky, M., and Dachselt, R. (2007).
Learning from painting: Perspective-dependent ge-
ometry deformation for perceptual realism. In 13th
Eurographics Symposium on Virtual Environments
Franke, I. S., Zavesky, M., and Rieger, R. (2006). The
power of frustum - die macht der geometrischen mitte.
In NMI 2006 “Film, Fernsehen und Computer” Neue
Medien der Informationsgesellschaft.
Fu, Y., Li, R., Huang, T. S., and Danielsen, M. (2008).
Real time multimodal human avatar interaction. IEEE
Trans. CSVT, 18(4):467–477.
Groh, R. (2005). Das Interaktions-Bild - Theorie und
Methodik der Interfacegestaltung. TUDpress Verlag
der Wissenschaft.
Groh, R., Franke, I. S., and Zavesky, M. (2006). With a
painter’s eye: An approach to an intelligent camera.
In Proceedings of The Virtual 2006.
Hockney, D. (2006). Secret Knowledge:Rediscovering the
Lost Techniques of the Old Masters. Thames & Hud-
son Ltd.
Jokela, T., Irvari, N., Matero, J., and Karukka, M. (2003).
The standard of user-centered design and the standard
definition of usability: Analyzing iso 13407 against
iso 9241-11. In CLIHC 03: Proceedings of the Latin
American Conference on Human- Computer Interac-
tion, pages 53–60.
Singh, K. (2002). A fresh perspective. In Proceedings of
Graphics Interface 2002, pages 17–24.
Ware, W. R. (1900). Modern perspective: A treatise upon
the principles and practice on plane and cylindrical
perspective. MacMillian and Co.
Yankova, A. and Franke, I. S. (2008). Angle of view vs.
perspective distortion: a psychological evaluation of
perspective projection for achieving perceptual real-
ism in computer graphics. In APGV’08:Proceedings
of the 5th symposium on Applied perception in graph-
ics and visualization, pages 204–204.
Yu, J. and McMillan, L. (2004). A Framework for Multiper-
spective Rendering. In Rendering Techniques 2004,
Proceedings of Eurographics Symposium on Render-
ing, pages 61–68, 408.
Zelnik-Manor, L., Peters, G., and Perona, P. (2005). Squar-
ing the circles in panoramas. In Tenth IEEE Inter-
national Conference on Computer Vision (ICCV’05)
Volume 1, pages 1292–1299.
Zorin, D. and Barr, A. H. (1995). Correction of geomet-
ric perceptual distortions in pictures. In SIGGRAPH
’95: Proceedings of the 22nd annual conference on
Computer graphics and interactive techniques, pages