different linguistic representations for different 
translators. 
They also state that modulation produces a
translation that gives the target viewers a better
understanding. Accordingly, the modulation technique
is still worth investigating, as it deserves
attention from researchers working from different
perspectives. More importantly, the modulation
technique and its interplay with the translation
need further study. Hence, this study takes two
aspects into account: the modulation itself and the
multi-modes which influence decision making in the
translation process.
Forrest Gump is an interesting film to analyze
because of the intertextualities found in it.
Therefore, some non-verbal texts must be taken into
account by the translator in the subtitle. The
non-verbal texts include auditory elements (music,
natural sound, and sound effects) and visual
elements (gestures, facial expressions, and body
movements), which affect the translation of the
verbal texts (Chiaro, 2009; Gottlieb, 2009).
This study aims to investigate the modulation 
technique employed by the translator in the 
Indonesian subtitled version of Forrest Gump and 
how multi-modes provide information to the 
translator in rendering the message from the English 
source text of Forrest Gump to the Indonesian target 
text. 
2 LITERATURE REVIEW 
In audio-visual translation, subtitling is defined
as the rendering of the verbal message in filmic
media in a different language. It usually consists
of one or two lines of written text that are
visually synchronized with the original verbal text
(Gottlieb, 2009).
According to Molina and Albir (2002), one of 
the translation techniques used in rendering the 
message is modulation. Modulation is a basic 
technique for translation that aims at simplifying the 
text for subtitles. Hoof (1989) states that modulation
is like transposition at the global level: changing
categories of thought, not grammatical categories.
It refers to presenting the situation from a
different perspective. It should be noted that the
sentence is represented from a different
perspective, but the meaning remains the same. It is
a translation technique that changes the point of
view or cognitive category from the source text to
the target text; it can be lexical or structural,
e.g., translating “Don’t litter” as “Jagalah
kebersihan!” (Keep clean!). In SCFA, this kind of
technique is called acceptation (Molina & Hurtado
Albir, 2002). The cognitive categories in the
modulation technique include the change from an
abstract message in the source text to a concrete
message in the target text. Other types cover cause
for effect, means for result, a part for the whole,
negated contrary or positive for double negative,
reversal of terms, active for passive and vice
versa, space for time, intervals and limits, and
change of symbols (Vinay & Darbelnet, 1995).
Subtitles, as one kind of AVT product, are closely
related to the multi-modes found on the screen. The
concept of “multimodality” is important for
multimodal communication; that is, the multiple
modes of representation greatly affect the
meaning-making process (Kress, 2005; Kress & Van
Leeuwen, 1996). Therefore, a translator needs a
semiotic understanding of all the signs in the
images that carry meaning. Mass-produced images,
now as readily available as printed or electronic
words, present translators with a new challenge: to
rethink the relationship between word and image
(Gambier and Gottlieb, 2001).
In a multimodal approach, modes work
individually and collectively at the same time. This
means “modes produce meaning in themselves and
through their intersection or interaction with each
other” (Kress & van Leeuwen, 2001). Furthermore,
the nature of audiovisual text is multi-coded: it
contains verbal and nonverbal channels, such as
image, music, sounds, and noises, which create a
coherent unity to make a viewer-friendly product
(Malenova, 2015).
Because individual texts use ‘different sign
systems’, the overall multimodal newscast is also
multisemiotic; the connection and interaction
between the various semiotic texts is therefore also
called inter-semiotic translation (Desjardins, 2008).
In addition, Chuang (2006) states that all modes
produce meanings through their interaction with
each other in the communicative context. Two types
of modes should be considered: visual modes, which
include scenery, lighting, costumes, properties,
gestures, facial expressions, and body movement, and
audio modes, which include music, background noise,
sound effects, laughter, crying, humming, and body
sounds.