Beyond the Ragdoll: Designing a Safe-Falling Controller
for Physically Simulated Characters
Lovro Boban (https://orcid.org/0009-0003-4225-7146) and Mirko Sužnjević (https://orcid.org/0000-0003-0334-8179)
University of Zagreb Faculty of Electrical Engineering and Computing, Zagreb, Croatia
Keywords: Reinforcement Learning, Computer Animation, User Study.
Abstract:
In this paper we propose the design for a procedural, physically simulated animation system that produces
safe falling animations using reinforcement learning. A character controller is trained to minimize external
and internal forces on the humanoid character’s body after it is pushed backwards on a flat surface. We design
a questionnaire and conduct a user study that compares the reinforcement learning approach with a motion
capture approach using subjective ratings on various aspects of the character's movement. Our findings, based
on a sample of n = 25 participants, indicate that users prefer the motion capture approach on 6 out of 8 aspects, but
prefer the reinforcement learning approach on the aspect of reactivity.
1 INTRODUCTION
Character animation is an integral part of many video
games, and despite growing visual fidelity over the
years, one area of animation remains underdevel-
oped: interactive, procedural animation. While many
video games do use minor procedural elements in
their animations such as inverse kinematics for correct
foot placement on uneven surfaces and cloth simula-
tion for secondary motion, rarely do games use pro-
cedural animation for whole body motions. This can
result in static animations that do not meaningfully
adapt to the environment, thereby creating a discon-
nect between the player character and their environ-
ment, lowering immersion.
Procedural, physically simulated animations,
however, do show up with some regularity in many
video games in the form of ”ragdolls”. When a char-
acter trips and falls or is pushed, their body goes
limp and they fall to the ground, often without even
trying to protect themselves from injury during the
fall. While some games do have so-called ”active rag-
dolls” which curl up to protect themselves or try to re-
tain balance before falling over, this area of animation
is still underexplored.
In this paper we present the design of a procedural
safe falling animation system based on reinforcement
learning (RL) and we present the results of a study
that compares the system with a more traditional an-
imation system based on motion capture. Section 2
looks at other works that have contributed to the topic
of safe falling or producing animations using machine
learning. Section 3 describes the design of our anima-
tion system. Section 4 describes the design and the
results of the comparative user study. Section 5 discusses
the results and outlines directions for future work. Finally,
Section 6 concludes the paper.
2 RELATED WORK
2.1 Safe Falling in Robotics
One discipline of engineering that is concerned with
the execution of safe falling strategies is robotics.
Aside from sensing the environment and navigating
through it, safe falling methods for robots are of great
interest to roboticists because an uncontrolled fall
could result in serious damage to the robot, its en-
vironment, or living beings around the robot.
(Rossini et al., 2019) developed a method to com-
pute the optimal falling trajectory for a robot to max-
imally reduce the velocity at points where it impacts
the ground by squatting and stretching as it falls. The
fall is modeled as a series of telescopic inverted pen-
dulums - the standing robot is modeled as a four-link
inverted pendulum, then as a three-link inverted pen-
dulum with a different pivot point when it drops to
its knees and so forth. This method enables the robot
to drop both forwards and backwards from a standing
position. (Ha and Liu, 2015) take a similar approach
by modeling the fall of the robot as a series of inverted
pendulums. Multiple contact points with the ground
are planned such that they maximally distribute the
kinetic energy of the fall across each point instead of
one contact point taking the brunt of the fall. This ap-
proach can manage more dynamic scenarios and can
execute a rolling strategy if the original velocity of the
robot is so high as to require it.
Whereas the aforementioned approaches seek to
minimize damage to the robot itself, (Yun et al., 2009)
take a different approach by steering the robot during
its fall in such a way as to avoid falling on nearby ob-
jects and damaging them. The relation between the
robot's center of mass and its base support polygon
determines the balance of the robot and the direction
of its fall. The robot can then change its fall direction
by altering the base support polygon by stepping in-
telligently so that the leading edge of the polygon is
oriented away from the object that it wishes to avoid.
While the previously described safe falling meth-
ods dealt with low velocity falls from a standing posi-
tion, (Ha et al., 2012) have developed a real-time con-
troller that allows a physically simulated character to
land safely from a variety of heights and initial ve-
locities by rolling upon contact with the ground. The
controller first plans a landing strategy and how it will
roll once on the ground depending on the initial veloc-
ity and rotation. It then continually changes its pose in
mid-air to manipulate its moment of inertia in order to
arrive in the position calculated at the beginning and
executes a safety roll or a dive roll to better disperse
the kinetic energy of the fall. Since the calculations
are done in real time the controller is robust against
external perturbations and is able to land in the de-
sired pose even after unexpected mid-air collisions.
2.2 Machine Learning Approaches to
Animation
Machine learning has found many uses in the broad
discipline of animation. One example would be pose
estimation which is a computer vision task that aims
to estimate the position and orientation of a person or
object. Pose estimation can help speed up the process
of animating from a video reference by obviating the
need to manually match the poses between the ani-
mated character and the video.
A different approach, that aims to produce the
animation itself, is to train the character controllers
of physically simulated characters using a group of
machine learning techniques known as reinforcement
learning (RL). In contrast to traditional supervised
machine learning approaches where the goal is usu-
ally classification and where the model is trained on
an existing dataset with labeled examples, reinforce-
ment learning models do not use datasets but rather
interact with their environment to gain experience
and learn a behavior policy that aims to maximize
some reward function over time. An example would
be an RL model for playing video games where the
model is let loose inside an instance of a game and
gradually learns to reach an increasingly higher score
through trial and error. Some subfields of reinforcement
learning are deep reinforcement learning, where RL is
combined with deep learning (suitable for very large inputs
such as the entire rendered image of a video game), and
imitation learning, where learning is driven not by a reward
function but by following example actions demonstrated by
an expert. Some imitation learning techniques even have
access to the expert at training time, meaning that they can
query the expert for more examples if necessary.
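To make the interaction loop described above concrete, the following is a minimal sketch of a generic Gymnasium agent-environment loop; the environment, the episode count, and the random action choice are placeholders, since the paper does not prescribe any of them.

```python
# Generic RL interaction loop in Gymnasium, illustrating the trial-and-error
# setting described above. Environment and episode count are placeholders.
import gymnasium as gym

env = gym.make("CartPole-v1")
for episode in range(3):
    obs, info = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()  # a learned policy would act here
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    print(f"episode {episode}: reward = {total_reward}")
env.close()
```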
One such example of an RL agent exploring pos-
sible actions in its environment can be found in the
work of (Yin et al., 2021) in which they trained a
character controller to discover various high jump
techniques via deep reinforcement learning. For the
run-up, they trained a separate controller to imitate
motion capture data of a run-up while also ending
up in the desired takeoff state. Since the aim is to
explore diverse jumping techniques, they varied the
takeoff state using a novel Bayesian diversity search
approach. After the takeoff and once in the air, nat-
ural looking poses were achieved by a Pose Varia-
tional Autoencoder with natural body poses extracted
from a motion capture database. To ensure that the
task of jumping over the bar wasn’t too hard to learn,
curriculum learning (Narvekar et al., 2020) was em-
ployed wherein the height of the bar, and thereby the
challenge, was gradually increased as the policy im-
proved, similarly to how people learn in reality.
(Yu et al., 2019) designed a character controller
for a physically simulated figure skater that can per-
form various tricks. They used video footage of fig-
ure skating to extract key poses using pose estima-
tion and applied trajectory optimization to construct a
trajectory that visits all poses. Using deep reinforce-
ment learning they generated a robust controller that
can perform various figure skating skills. In their sub-
sequent work (Yu et al., 2021), the authors used 3D
pose estimation and contact estimation on monocu-
lar video footage of highly dynamic parkour motions.
For each frame of video, they extracted the 3D pose
of the athlete and used a contact estimator to esti-
mate whether the feet and hands were in contact with
the environment. Combining pose and contact data,
they were able to reconstruct the full motion trajec-
tory from the video of a single panning camera. Using
deep reinforcement learning they coached a motion
controller inside a physical simulation to replicate the
exact movements extracted from the original video.
None of the aforementioned ML approaches generates
animations in real time, making them unsuitable for
interactive scenarios. (Luo et al., 2020) use imitation
learning to train a directable, real-time motion
controller to imitate reference animations
of various gaits of a quadruped. A generative ad-
versarial network is then used to train high level di-
rectives like speed and direction of movement and,
finally, deep reinforcement learning is employed to
train the controller to be more resistant to external per-
turbations. (Tessler et al., 2023) take a
similar approach by using motion capture data to train
a humanoid motion controller to mimic it, but instead
of training directly on the data they train on its statis-
tical distribution thereby allowing for natural looking
variations.
Reinforcement learning methods have also been
applied to the problem of safe falling covered in sec-
tion 2.1 such as in the work of (Kumar et al., 2017).
An actor-critic architecture was used to develop a con-
troller which minimizes the maximal impulse during
the robot’s fall with each contacting limb of the robot
having an actor-critic pair assigned to it. The discrete
problem of contact planning is solved by determining
the limb whose corresponding critic has the highest
value function at a given point in time and having it
be the one that will contact the ground next. The con-
tinuous control problem is solved via the optimization
of the actors.
3 ANIMATION SYSTEM
Our safe falling animation system utilizes a machine
learning approach wherein a reinforcement learning
agent learns to control a fully physically simulated
humanoid character by controlling the torque gener-
ated at each of the character’s joints. This places our
approach into the category of procedural animation
since there exists no predefined set of static anima-
tions. Instead, our system dynamically produces ani-
mations in real time by physically simulating the mo-
tion of the character.
We choose to use reinforcement learning because,
to our knowledge, there doesn’t exist a comprehen-
sive database of realistic falling animations that could
be used for training a controller by using imitation
learning. Furthermore, the few examples of motion
capture falling animations we could find have exag-
gerated motions that aren’t naturalistic.
However, to help with convergence we also em-
ploy curriculum learning where we first present the
agent with an easier version of the problem (in our
case that’s a less strict reward function) after which
we gradually increase the difficulty once the agent
surpasses a certain predefined reward threshold or if
enough steps have passed. This helps the agent to
quickly converge to an approximate solution in the
first step of the curriculum, after which the solution is
refined through smaller tweaks in the following steps
of the curriculum. This can be seen in Figure 1, where
the approach without curriculum learning ends with a
lower reward despite both approaches having an equally
strict reward function at that point.
Figure 1: Learning curve with and without curriculum learning.
We use the MuJoCo physics engine (https://mujoco.org/) for physical
simulation and the PPO algorithm with default parameters to train the
controller. Using the Gymnasium API (https://gymnasium.farama.org/index.html),
we design a custom environment for the RL agent to train in. Since Gymnasium
already provides the Humanoid environment with a defined humanoid
character, we choose to adapt it for our needs.
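As an illustration of this training setup, the sketch below shows how such a controller could be trained. The paper states only that PPO with default parameters is used; Stable-Baselines3 is assumed here as the PPO implementation, and the stock Gymnasium Humanoid environment stands in for the adapted environment described in Section 3.1.

```python
# Minimal training sketch (assumptions: Stable-Baselines3 as the PPO
# implementation, the stock Humanoid environment as a stand-in for the
# adapted safe-falling environment described in Section 3.1).
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("Humanoid-v4")              # stand-in for the custom environment
model = PPO("MlpPolicy", env, verbose=1)   # default PPO hyperparameters
model.learn(total_timesteps=5_000_000)     # training budget chosen for illustration
model.save("safe_fall_ppo")
```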
3.1 Custom Learning Environment
We adapt the already existing Humanoid environment
that comes with Gymnasium, which is itself based on
the environment introduced by (Tassa et al., 2012).
Just like the original environment, our environment
consists only of the humanoid character and the floor
that it stands on, although we have changed the char-
acter model by adding a neck joint which enables the
character to tuck their head in.
At the beginning of each training episode, the
character is pushed backwards in order to trigger a
fall. To ensure that the controller learns to handle dif-
ferent push forces, the strength of the initial push is
randomly chosen from a predefined range. If the char-
acter hits its head strongly enough, the episode termi-
nates prematurely and a large negative reward is given.
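A minimal sketch of how this randomized initial push could be implemented is given below. It assumes the push is applied as a backward root velocity inside an override of Gymnasium's Humanoid reset; the class name and velocity range are hypothetical, since the paper does not give the exact mechanism.

```python
# Illustrative only: applying a random backwards push at the start of each
# episode by overriding reset_model() of Gymnasium's MuJoCo Humanoid
# environment. The push range and the velocity-based push are assumptions.
from gymnasium.envs.mujoco.humanoid_v4 import HumanoidEnv


class SafeFallHumanoidEnv(HumanoidEnv):  # hypothetical class name
    PUSH_RANGE = (2.0, 6.0)  # backward speeds in m/s, chosen for illustration

    def reset_model(self):
        super().reset_model()
        qpos = self.data.qpos.copy()
        qvel = self.data.qvel.copy()
        # Sample the push strength and apply it as a backward root velocity
        # (negative x, assuming the character initially faces +x).
        qvel[0] = -self.np_random.uniform(*self.PUSH_RANGE)
        self.set_state(qpos, qvel)
        return self._get_obs()
```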
The reward function is structured as follows:

reward = healthy - head_hit - ctrl - constraint_cost    (1)

with the components having the following meaning:

healthy - healthy reward, a constant positive reward given for each elapsed training step.

head_hit - a fixed negative reward given upon a strong enough head hit. Prematurely terminates the episode.

ctrl - a negative reward proportional to the square of the agent's control signals.

constraint_cost - a negative reward proportional to the constraint forces acting on each part of the character's body. This represents outside forces acting on the body (analogous to impact forces) as well as forces that keep the joints together (roughly analogous to forces going through the tendons and ligaments). It's only counted for those forces that go above a certain threshold.
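To make these terms concrete, the following sketch computes Equation 1 for a single step. The coefficients, the force threshold, and the use of MuJoCo's cfrc_ext array as the source of constraint/contact forces are illustrative assumptions rather than the exact values used in training.

```python
# Sketch of the reward in Equation 1. All weights and thresholds are
# placeholders; head_hit detection is assumed to be provided by the caller.
import numpy as np

HEALTHY_REWARD = 1.0       # constant per-step reward
HEAD_HIT_PENALTY = 500.0   # fixed penalty on a strong head hit (episode ends)
CTRL_WEIGHT = 0.1          # weight of the quadratic control penalty
CONSTRAINT_WEIGHT = 5e-6   # weight of the constraint-force penalty
FORCE_THRESHOLD = 300.0    # forces below this are ignored (raised by the curriculum)


def compute_reward(data, action, head_hit):
    ctrl = CTRL_WEIGHT * np.sum(np.square(action))

    # Penalize only the part of the per-body external/constraint forces
    # that exceeds the threshold.
    forces = np.abs(data.cfrc_ext)
    excess = np.clip(forces - FORCE_THRESHOLD, 0.0, None)
    constraint_cost = CONSTRAINT_WEIGHT * np.sum(np.square(excess))

    head_hit_cost = HEAD_HIT_PENALTY if head_hit else 0.0

    return HEALTHY_REWARD - head_hit_cost - ctrl - constraint_cost
```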
As mentioned earlier, the curriculum changes the
reward function to be more forgiving at earlier steps
by increasing the threshold above which constraint
forces will be counted in constraint_cost. Additionally,
we found that disabling the head_hit cost in the first
step of the curriculum helps to produce more natural-looking
results, where the character does not contort itself into
unnatural shapes in order to avoid hitting its head.
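The curriculum schedule could be realized along the lines of the sketch below. The stage values, the reward threshold, and the step budget are hypothetical; only the overall rule (a loose constraint-force threshold and a disabled head-hit cost early on, tightened once a reward threshold or a step budget is reached) follows the description above.

```python
# Hedged curriculum sketch: each stage defines the constraint-force threshold
# and whether the head-hit penalty is active; all numbers are illustrative.
CURRICULUM = [
    (1000.0, False),  # stage 0: forgiving threshold, head-hit cost disabled
    (600.0, True),
    (300.0, True),    # final, strictest stage
]


class CurriculumSchedule:
    def __init__(self, reward_threshold=-500.0, max_stage_steps=2_000_000):
        self.stage = 0
        self.steps_in_stage = 0
        self.reward_threshold = reward_threshold
        self.max_stage_steps = max_stage_steps

    def update(self, mean_episode_reward, steps):
        """Advance to the next stage once the policy surpasses the reward
        threshold or enough steps have elapsed in the current stage."""
        self.steps_in_stage += steps
        if self.stage < len(CURRICULUM) - 1 and (
            mean_episode_reward > self.reward_threshold
            or self.steps_in_stage > self.max_stage_steps
        ):
            self.stage += 1
            self.steps_in_stage = 0
        force_threshold, head_hit_enabled = CURRICULUM[self.stage]
        return force_threshold, head_hit_enabled
```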
3.2 Resulting Character Controller
As can be seen in the accompanying video (https://bit.ly/4fVCwWH), the
trained character controller exhibits different behav-
iors depending on the strength of the initial push.
At weaker and medium push strength, the character
pushes off of the ground in order to create a forward
rotation which counteracts the initial backwards ro-
tation of the push. The character lands in a crouch
with its hands touching the floor in front of it to in-
crease stability. At higher push strength, the character
performs a roll onto its back (see Figure 2) because
the push is so strong that the character cannot gener-
ate enough forward rotation to counter it. However, a
backwards roll also helps to prolong the fall, spread-
ing out the impact over a longer period of time thereby
making it safer. The backwards roll can be seen in
real life in situations when one is falling backwards at
significant speeds, e.g. when landing in parkour. Another
behavior also emerges during the backwards roll: the
character puts one of its arms above and behind its head
to halt the roll and prevent it from going over the head.
Figure 2: Example motions from the two animation systems
compared in the user study. Top row - motion capture, bot-
tom row - our RL system.
4 USER STUDY
In order to test how our animation system compares
to a more established approach, we have conducted
a user study where we compare animations produced
by our approach to animations recorded via motion
capture. To enable us to effectively compare anima-
tions produced by our system with other types of an-
imation, we made a custom program using the Unity
game engine.
In order to direct focus onto the animations them-
selves instead of the 3D humanoid model, we use
the same, abstract humanoid model that comes with
MuJoCo for both animation systems. This also helps
to prevent presentation differences between the motion
capture's 3D model and our system's 3D model from
influencing user scores. Another reason for choosing to
use MuJoCo’s model over a more realistic humanoid
model is the fact that the animation data from mo-
tion capture can be trivially mapped onto MuJoCo’s
simpler humanoid model whereas there is no simple
and unique mapping from joint positions in MuJoCo
onto a more complex humanoid model. Since Mu-
JoCo’s model has fewer degrees of freedom than the
animation skeleton of the motion capture animations
(it lacks hands and feet) the motion capture skeleton
would be left underactuated.
4.1 Program for Comparing
Animations
We designed a program for comparing animations
which showcases one animation system at a time with
the ability to switch between animation systems at
will. These two animation systems are our RL ani-
mation system and a simple motion capture animation
system. The RL and motion capture systems are rep-
resented by two scenarios, Scenario X and Scenario
Y, respectively. The scenario names are not descrip-
tive so as not to give away the nature of the animation
system currently in use, since this could inadvertently
influence the participant’s ratings.
The main view of the program can be seen in Figure 3,
where the character is placed near the middle of
the screen with UI elements on the right side which
enable the user to push the character backwards. The
slider controls the strength at which the character will
be pushed and clicking the ”Push” button will initi-
ate a falling motion using the current animation sys-
tem. After an animation finishes, the character ap-
pears back at its original standing spot, ready to be
pushed again.
Figure 3: Main view of the animation comparison program.
The animation system that uses motion capture is
constructed using an animation state machine consist-
ing of 4 states: idle, light, medium and hard. Depend-
ing on the strength of the push, the system tran-
sitions from the idle state into one of the other three.
This way, the motion capture animation system can
still react in proportion to the strength of the push, but
not as granularly as the RL animation system. To load
and play motion capture animations, the program uses
Unity’s inbuilt functionality. The position and rota-
tion of the animation bones are read from the anima-
tion clip and the 3D model is manipulated to exactly
match the current pose from the animation clip.
On the other hand, to play animations from Mu-
JoCo, the program has to communicate with a run-
ning instance of MuJoCo’s physics simulation. On
each new frame in Unity, the program requests and
subsequently receives pose data of the humanoid (in
the form of the position and rotation of each bone)
from the running physics simulation. This pose data
is then used to make the 3D model assume the exact
pose that the simulated humanoid currently has in the
physics simulation. Upon clicking the ”Push” button,
the program sends the strength of the push (read from
the slider) to the MuJoCo instance, which resets the envi-
ronment with the received push strength as the param-
eter and starts simulating. When the episode finishes,
the simulation pauses until another command to reset
the environment is received.
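The paper does not specify how the Unity program and the running MuJoCo simulation exchange data, so the sketch below shows one plausible realization of the simulation-side loop over a line-delimited JSON protocol on a local TCP socket; the message names and port are hypothetical.

```python
# Illustrative MuJoCo-side server: Unity sends "push" and "get_pose" messages
# as JSON lines; the server steps the simulation and replies with per-body
# positions and rotations. Transport and message format are assumptions.
import json
import socket


def serve(env, policy, host="127.0.0.1", port=9000):
    server = socket.create_server((host, port))
    conn, _ = server.accept()
    done = True
    obs = None
    with conn:
        for line in conn.makefile("r"):
            msg = json.loads(line)
            if msg["type"] == "push":
                # Reset the environment with the requested push strength.
                obs, _ = env.reset(options={"push_strength": msg["strength"]})
                done = False
            elif msg["type"] == "get_pose":
                if not done:
                    # Query the trained controller and advance the simulation.
                    action = policy(obs)
                    obs, _, terminated, truncated, _ = env.step(action)
                    done = terminated or truncated
                data = env.unwrapped.data
                reply = {
                    "positions": data.xpos.tolist(),   # per-body world positions
                    "rotations": data.xquat.tolist(),  # per-body world quaternions
                }
                conn.sendall((json.dumps(reply) + "\n").encode())
```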
4.2 Participants
We involved 25 participants in the study (14 female
and 11 male). We chose the number of participants
in accordance with ITU-T recommendations (ITU-T,
2022). Participants' ages range from 22 to 67, with a
mean age of 30.16 and a median age of 28.
The participants report their level of familiarity
with 3D action video-games on a 5-point Likert scale
with ”Never played” on the lowest end of the scale
and ”Played extensively” on the highest end of the
scale. The genres of sports games and combat ori-
ented games were listed as examples of 3D action
video-games. Participants also report how important
animation realism and animation variety are to them
in the context of video games, also on a 5-point Likert
scale with ”Not important at all” on the lowest end of
the scale and ”Very important” on the highest end of
the scale. The distribution of the participants’ experi-
ence with 3D action video-games is shown in Table 1
and the reported importance of animation realism and
animation variety is shown in Table 2.
Table 1: Participants' experience with 3D action video-games.

familiarity level    1   2   3   4   5
# of responses       5   3   4   6   7
Table 2: Participants' reported importance of animation realism and animation variety.

importance    1   2   3   4   5
realism       0   1   8   9   7
variety       0   5   6  11   3
Participants were also asked whether they had par-
ticipated in activities where falling is commonplace
(e.g. martial arts, rollerblading, gymnastics, parkour,
etc.) because of the assumption that those who have
experience with falling might answer questions about
safe falling differently from those who do not.
4.3 Questionnaire
Our questionnaire consists of eight questions whose
aim is to assess the users’ personal opinion on 8
different aspects of the animations produced by our
RL animation system, namely animacy, anthropo-
morphism, gracefulness, surprise, likeliness, injury,
realism and reactivity.
Questions about animacy and anthropomorphism
are meant to gauge whether the character passes as
human, questions about gracefulness and injury are
meant to gauge the perceived level of the character’s
skill, questions about surprise, likeliness and real-
ism are meant to gauge whether the character is con-
vincing, and the question about reactivity is meant to
gauge whether our RL approach is significantly more
interactive than motion capture.
Inspired by the Godspeed Questionnaire for Hu-
man Robot Interaction (Bartneck et al., 2009), we
opted to use 5-point semantic differential scales
meaning that participants were asked to indicate
their position on a scale between two bipolar
words, the anchors. Five questions were posed
as semantic differential scales: animacy (Me-
chanical/Organic), anthropomorphism (Not human-
like/Humanlike), gracefulness (Clumsy/Graceful),
surprise (Surprising/Expected) and reactivity (Un-
changing/Reactive). The question about animacy
was taken directly from the Godspeed Questionnaire,
whilst the one about anthropomorphism was adapted
from the questionnaire. We used 5-point Likert scales
for the other three aspects by posing the following
questions - ”Is a person likely to fall similarly to
how the character fell?” for likeliness, ”Does it look
like the character injured themselves?” for injury, and
”Does it look like the character’s movements follow
the laws of physics?” for realism.
4.4 Study Procedure
Testing begins with participants signing a written consent
form that describes the kinds of data that are collected.
Further, participants are asked to provide demographic
information (gender and age), to describe their level of
experience with 3D action video-games, and to state how
important animation realism and animation variety are to
them. Additionally, participants are asked whether they
have participated in an activity where falling is
commonplace.
After providing demographic data, the partici-
pants begin using the animation comparison program
which randomly shows them one of two scenarios.
Participants are instructed to push the character as
many times as they want while varying the push
strength with the aim of developing an opinion on the
character’s movement. After the participants feel that
they have watched the character fall a sufficient num-
ber of times, they fill out the questionnaire described
in Section 4.3. Once they have filled out the questionnaire, the
participants return to the program and switch to the
other scenario. They repeat the process of forming
an opinion on the character’s movement after which
they again fill out the same questionnaire as before,
this time rating the animation system from the sec-
ond scenario. Finally, after having filled out question-
naires for both scenarios, they choose which scenario
they prefer overall - Scenario X, Scenario Y or No
preference.
4.5 Study Results
The participants’ ratings for each question in the two
questionnaires can be seen in Figure 4 in the form of
a diverging bar chart. The proportions of responses to
each of the questions are shown in their corresponding
stacked bar chart and all of the stacked bar charts’
”neutral” sections (representing the proportion of ”3”
ratings on a 5-point scale) are aligned in the middle
to allow for a better visual overview. The further a bar
is shifted to the right, the more participants answered
towards the right end of the scale, and vice versa.
What can be seen is that for every question except
for injury and reactivity, the motion capture system
received higher scores. A Wilcoxon signed-rank test
on each pair of questions shows a statistically significant
difference between the two systems (Organic p = 0.002,
Humanlike p < 0.001, Graceful p = 0.042, Expected p = 0.002,
Likely p < 0.001, Realism p = 0.002, Reactivity p = 0.038)
for every question except injury (p = 0.94), even after
adjusting the p-values via the Benjamini-Hochberg procedure
to account for multiple hypothesis testing. This
is consistent with the answers of overall preference
where, out of 25 participants, 22 prefer the motion
capture system, 2 have no preference, and 1 prefers
the RL system.
Running ANOVA (Analysis of Variance) tests to
check whether falling experience influences injury
ratings and whether 3D action video-game experience
influences reactivity ratings shows no statistically
significant effects (Injury p = 0.670, Reactivity p = 0.958).
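For reference, the statistical analysis described above can be reproduced along the following lines with SciPy and statsmodels. The rating arrays are filled with placeholder random data here, while the tests themselves (paired Wilcoxon signed-rank per aspect, Benjamini-Hochberg adjustment, one-way ANOVA for the background variables) match the procedure described.

```python
# Analysis sketch: paired Wilcoxon signed-rank tests with Benjamini-Hochberg
# correction, plus a one-way ANOVA. The ratings below are placeholder random
# data standing in for the questionnaire responses.
import numpy as np
from scipy.stats import wilcoxon, f_oneway
from statsmodels.stats.multitest import multipletests

aspects = ["organic", "humanlike", "graceful", "expected",
           "likely", "injury", "realism", "reactivity"]

rng = np.random.default_rng(0)
ratings_mocap = rng.integers(1, 6, size=(25, len(aspects)))  # placeholder
ratings_rl = rng.integers(1, 6, size=(25, len(aspects)))     # placeholder

# Paired test per aspect, then FDR adjustment across the eight hypotheses.
p_values = [wilcoxon(ratings_mocap[:, i], ratings_rl[:, i]).pvalue
            for i in range(len(aspects))]
reject, p_adjusted, _, _ = multipletests(p_values, method="fdr_bh")

# One-way ANOVA, e.g. injury ratings grouped by falling experience (yes/no).
injury_fallers = rng.integers(1, 6, size=10)      # placeholder group
injury_non_fallers = rng.integers(1, 6, size=15)  # placeholder group
f_stat, p_injury = f_oneway(injury_fallers, injury_non_fallers)
```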
In conclusion, the results indicate that users do
perceive the RL system to be more reactive, which is
in agreement with the stated aim of developing an RL
based animation system. On the other hand, the sys-
tem doesn’t produce animations that look safer than
motion capture and, additionally, all other aspects of
the animation suffer to varying degrees.
4.5.1 Participants’ Comments
Looking at participants’ comments about the RL sys-
tem, one of the more common comments is that the
Figure 4: Participants’ ratings for each question on a diverging bar chart.
character is twitchy and too curled up after landing
from a push at low strength, with one participant com-
menting that it reminds them of insect behavior. Par-
ticipants also expressed confusion about where on the
body the character is pushed which might have made
it more difficult for them to appropriately rate like-
liness and reactivity. For the motion capture sys-
tem, multiple participants have expressed that the fall
looks overly dramatic or like the character is faking
the push. For both systems, participants have com-
mented that high velocity falls look more natural.
5 DISCUSSION AND FUTURE
WORK
Regarding high velocity falls looking more natural in
both systems, for RL this could be the case because
the character rolls onto their back instead of just land-
ing in a crouch. For motion capture this could be the
case because at high velocities the stunt actor has less
room to act or dramatize since they have to take more
care not to hurt themselves.
Regarding lower animacy and anthropomorphism
scores for the RL system, we plan to improve this by
further tweaking the reward function to reduce any
twitchy motion. We also plan to make the humanoid
model more realistic by making individual parts of
the body more resilient or less resilient against large
forces in accordance with human anatomy, thereby
guiding the controller towards more realistic behav-
ior. Likeliness and anthropomorphism could also be
improved by involving human feedback in the train-
ing loop, as is done in Reinforcement Learning from
Human Feedback or through a method similar to (Ha
and Liu, 2014) where commands like ”straighten the
legs more” could be issued to the optimizer.
The majority overall preference for the motion
capture system may be explained by the choice of
a flat environment without any obstacles scattered
throughout which favors motion capture since it is
likely the exact environment in which the original mo-
tion was captured by the performer. On the other
hand, a procedural approach, such as the one we de-
veloped, shows its strength in more complex, unpre-
dictable environments with uneven ground or obsta-
cles in the way of the fall. In such complex en-
vironments, motion capture animations might clip
through or get stuck on surrounding geometry with-
out the appropriate physical reaction that procedural
approaches can provide. In our future work we plan to
train the controller in environments that could involve
uneven terrain, slopes, obstacles, drops, etc. These
more complex environments might, however, make
convergence more difficult.
An alternate procedural approach that focuses on
realistic motion would be to use motion capture as a
baseline by employing imitation learning, with rein-
forcement learning allowing for the introduction of
reactivity, similarly to (Luo et al., 2020), although
this might limit the controller to only those scenarios
available in motion capture, which are almost always
recorded on flat ground.
Ultimately, despite the RL system’s on average
lower scores on various aspects compared to the mo-
tion capture system, our RL animation system can
nevertheless be considered as a viable alternative to
motion capture in cases where reactivity and anima-
tion variety are the primary concerns. This is often
the case in games which have motion controls, no-
tably VR games, that allow the player to interact with
virtual characters at a much more granular level than
in traditional games. Nevertheless, further efforts to
improve the overall acceptability of RL-generated
animations are needed.
6 CONCLUSION
In this paper we describe the design of a safe falling
animation system based on reinforcement learning
and present the results of a user study comparing said
animation system with an animation system based on
motion capture.
Our animation system consists of a reinforcement
learning agent that controls an articulated, fully phys-
ically simulated 3D humanoid character. The charac-
ter is pushed backwards while standing on flat ground
and the controller tries to minimize the impact of the
character with the ground and the forces inside the
character’s joints. The PPO algorithm is used for
training the controller and curriculum learning is em-
ployed in order to help with convergence.
We conduct a user study comparing the RL ap-
proach with a motion capture approach whilst keep-
ing the surface presentation the same by using the
same humanoid 3D model in both animation systems.
The participants are instructed to make the charac-
ter fall by pushing it at varying strengths and to de-
velop an opinion on the resulting movement. They
are subsequently given a questionnaire in which they
rate the character’s movement on 8 different aspects
on 5-point semantic differential scales. This testing
procedure is carried out for both animation systems
individually.
Results show a statistically significant difference
in ratings between the two systems for all but one as-
pect (injury), with the motion capture system being
rated more favorably in 6 out of 8 aspects, and the
RL system being rated more favorably in reactivity.
The impact of background variables (experience with
falling, previous gaming experience) on ratings of in-
jury and reactivity, respectively, is shown to not be
statistically significant.
ACKNOWLEDGEMENTS
This research was fully supported by the Croatian
National Recovery and Resilience Plan (NPOO) un-
der the project Research and Development of Multi-
ple Innovative Products, Services and Business Mod-
els Aimed at Strengthening Sustainable Tourism and
the Green and Digital Transition of Tourism (ROBO-
CAMP), with grant number NPOO.C1.6.R1-I2.01.
REFERENCES
Bartneck, C., Kulić, D., Croft, E., and Zoghbi, S. (2009).
Measurement instruments for the anthropomorphism,
animacy, likeability, perceived intelligence, and per-
ceived safety of robots. Intl. journal of social robotics,
1:71–81.
Ha, S. and Liu, C. K. (2014). Iterative training of dynamic
skills inspired by human coaching techniques. ACM
Transactions on Graphics (TOG), 34(1):1–11.
Ha, S. and Liu, C. K. (2015). Multiple contact planning
for minimizing damage of humanoid falls. In 2015
IEEE/RSJ Intl. Conference on Intelligent Robots and
Systems (IROS), pages 2761–2767. IEEE.
Ha, S., Ye, Y., and Liu, C. K. (2012). Falling and landing
motion control for character animation. ACM Trans-
actions on Graphics (TOG), 31(6):1–9.
ITU-T (2022). P. 910: Subjective video quality assessment
methods for multimedia applications. pages 15–16.
Kumar, V. C., Ha, S., and Liu, C. K. (2017). Learning a uni-
fied control policy for safe falling. In 2017 IEEE/RSJ
Intl. Conference on Intelligent Robots and Systems
(IROS), pages 3940–3947. IEEE.
Luo, Y.-S., Soeseno, J. H., Chen, T. P.-C., and Chen, W.-C.
(2020). Carl: Controllable agent with reinforcement
learning for quadruped locomotion. ACM Transac-
tions on Graphics (TOG), 39(4):38–1.
Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor,
M. E., and Stone, P. (2020). Curriculum learning for
reinforcement learning domains: A framework and
survey. The Journal of Machine Learning Research,
21(1):7382–7431.
Rossini, L., Henze, B., Braghin, F., and Roa, M. A. (2019).
Optimal trajectory for active safe falls in humanoid
robots. In 2019 IEEE-RAS 19th Intl. Conference on
Humanoid Robots (Humanoids), pages 305–312.
Tassa, Y., Erez, T., and Todorov, E. (2012). Synthesis and
stabilization of complex behaviors through online tra-
jectory optimization. In 2012 IEEE/RSJ Intl. Confer-
ence on Intelligent Robots and Systems, pages 4906–
4913. IEEE.
Tessler, C., Kasten, Y., Guo, Y., Mannor, S., Chechik, G.,
and Peng, X. B. (2023). Calm: Conditional adver-
sarial latent models for directable virtual characters.
In ACM SIGGRAPH 2023 Conference Proceedings,
pages 1–9.
Yin, Z., Yang, Z., Van De Panne, M., and Yin, K. (2021).
Discovering diverse athletic jumping strategies. ACM
Transactions on Graphics (TOG), 40(4):1–17.
Yu, R., Park, H., and Lee, J. (2019). Figure skating simu-
lation from video. In Computer graphics forum, vol-
ume 38, pages 225–234. Wiley Online Library.
Yu, R., Park, H., and Lee, J. (2021). Human dynamics from
monocular video with dynamic camera movements.
ACM Transactions on Graphics (TOG), 40(6):1–14.
Yun, S.-k., Goswami, A., and Sakagami, Y. (2009). Safe
fall: Humanoid robot fall direction change through in-
telligent stepping and inertia shaping. In 2009 IEEE
intl. conference on robotics and automation, pages
781–787. IEEE.