Impact of Code Obfuscation on Android Malware Detection based on Static and Dynamic Analysis

Alessandro Bacci, Alberto Bartoli, Fabio Martinelli, Eric Medvet, Francesco Mercaldo and Corrado Aaron Visaggio

Dipartimento di Ingegneria e Architettura, Università degli Studi di Trieste, Trieste, Italy

Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy

Dipartimento di Ingegneria, Università degli Studi del Sannio, Benevento, Italy

Keywords: Malware, Android, Machine Learning, Code Obfuscation, Security.

Abstract: The wide diffusion of malware on mobile platforms is plaguing users. New malware proliferates at a very fast pace: applying trivial obfuscation techniques to existing malware is sufficient to evade the signature-based mechanisms implemented in current antimalware products. In this paper, we show how the application of several morphing techniques affects the effectiveness of two widespread malware detection approaches based on Machine Learning coupled, respectively, with static and dynamic analysis. We demonstrate experimentally that dynamic analysis-based detection performs equally well on obfuscated and non-obfuscated malware. Static analysis-based detection, on the other hand, is more accurate on non-obfuscated samples but is severely degraded by obfuscation; however, we also show that this effect can be mitigated by including obfuscated samples in the learning phase.

1 INTRODUCTION

Malware targeting mobile platforms has been spreading rapidly and widely in recent years. This is a natural consequence of two facts, which constitute strong incentives for many attackers: (i) users store more and more sensitive and private information on their mobile devices and (ii) mobile devices, and Android-based ones in particular, are becoming the most used devices: in March 2017, Android usage hit 37.93% while Windows on computers hit 37.91% (http://gs.statcounter.com/os-market-share#monthly-201703-201703-map).

This is the reason why antimalware vendors propose free and commercial solutions with the aim of mitigating this widespread phenomenon, but the current signature-based approach is not sufficient to protect users against the new threats developed by malware writers (Canfora et al., 2015b; Rastogi et al., 2013a; Zheng et al., 2013). As a matter of fact, signature-based malware detection (the most common technique adopted by mobile antimalware) is often ineffective (Cimitile et al., 2017). Moreover, it is costly: the process of obtaining and classifying a malware signature is laborious and time-consuming.

In recent years, the research community has developed several methods to identify whether a mobile application exhibits malicious behaviour: the approaches considered are essentially based either on static analysis (the detection process does not require the execution of the application) or on dynamic analysis (the detection process requires the application to run in order to identify the maliciousness) (Tam et al., 2017).

While several research papers evaluate the effectiveness of the signature-based detection provided by current antimalware technologies (Zheng et al., 2013; Ramachandran et al., 2012; Rastogi et al., 2013a,b), in this paper our aim is to evaluate the effectiveness of the techniques proposed by researchers against the common code morphing techniques employed by malware writers. To this end, we evaluate two recent approaches based on Machine Learning techniques operating on, respectively, features derived from static analysis (Canfora et al., 2015a) and dynamic analysis (Canfora et al., 2015c) against a set of widespread morphing techniques. The considered approaches are representative of the many Machine Learning-based malware detection systems which have been recently proposed (e.g., Xue et al. (2017); Martinelli et al. (2017); Ferrante et al. (2016); Medvet and Mercaldo (2016); Tam et al. (2017); Backes and Nauman (2017); Demontis et al. (2017)).

Bacci, A., Bartoli, A., Martinelli, F., Medvet, E., Mercaldo, F. and Visaggio, C.

Impact of Code Obfuscation on Android Malware Detection based on Static and Dynamic Analysis.

DOI: 10.5220/0006642503790385

In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), pages 379-385

ISBN: 978-989-758-282-0

The paper poses the following research question: to what degree do widespread obfuscation techniques affect the effectiveness of state-of-the-art approaches for malware detection? We attempt to answer this question by means of a thorough experimental analysis involving a real-world dataset composed of 3500 legitimate and 3500 malware applications and 8 different Android-specific morphing techniques.

2 RELATED WORK

There is increasing interest in applying Machine Learning-based techniques to the problem of Android malware detection: here we briefly survey the most recent ones, and other significant non-ML-based ones, which explicitly consider code obfuscation.

A framework able to inject a set of morphing techniques has been proposed by Rastogi et al. (2013a) with the aim of evaluating current antimalware technologies against morphed variants of malware. The main outcome of the paper is that all the studied antimalware products are vulnerable to trivial code transformations.

Rastogi et al. (2014) evaluate ten antimalware tools using six original and morphed mobile malware samples belonging to six different families. The authors conclude that antimalware tools are susceptible to common, widespread evasion techniques.

Suarez-Tangil et al. (2016a) propose DroidSieve, an Android malware classifier based on static analysis, and identify two high-level classes of features: (i) resource-centric features, which are derived from resources used by the app, and (ii) syntactic features, which are derived from the code and metadata of mobile applications. The proposed approach considers obfuscation-invariant features and artefacts introduced by the obfuscation mechanisms used by mobile malware writers.

Alterdroid (Suarez-Tangil et al., 2016b) is a malware analysis framework based on analysing the behavioral differences between the original application and a set of automatically generated versions of it (the so-called variants), into which a number of modifications have been carefully injected. In addition, Alterdroid performs a dynamic analysis (i.e., every app was executed over a time span equal to 120 seconds) to identify the malware.

O'Kane et al. (2016) investigate the optimal set of executed instructions for identifying obfuscated Android malware using an SVM classifier. They find a set of instructions that are good indicators of malware and determine how long the program needs to run in order to obtain an accurate classification. They obtain an average accuracy equal to 84.4%.

The RevealDroid tool (Garcia et al., 2015) is stated to be obfuscation resilient thanks to a set of features including sensitive API and intent usage and information flows. The effectiveness of the selected features is evaluated using two different simple classifiers, which obtain an accuracy ranging between 93% and 96% in malware detection.

3 MACHINE LEARNING-BASED MALWARE DETECTION

We consider two forms of detection based on Machine Learning techniques applied to data derived from static and dynamic analysis, i.e., to sequences of opcodes and system calls, respectively. We build our study on two state-of-the-art approaches (Canfora et al., 2015a,c) which we briefly describe in the following sections.

In both cases, the approach consists of a classification phase, in which an input application $a$ is classified as malware or trusted, and a learning phase, in which a classifier is trained on two sets $A_T$ and $A_M$ including, respectively, trusted and malware applications. In both phases, a numeric feature vector is computed from the app $a$ by means of a pre-processing step. All the procedures are described below.

3.1 Static Analysis

The pre-processing of an app $a$ starts by extracting the .dex file from $a$, which is packaged as an .apk file. Then, several files containing the machine-level instructions, each consisting of an opcode and its parameters, are obtained from the .dex file by means of decompilation. From these files, a list of sequences of opcodes (without the parameters) is extracted, where each sequence corresponds to a method of a class in the app. Finally, the frequency $f(a, o)$ of each ngram $o$ occurring in the sequences of the list is computed, $n$ being a parameter of the method. The resulting vector is the initial feature vector for $a$.
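As an illustration, the counting step can be sketched in Python as follows. The sketch assumes that the opcode sequences have already been extracted, one list of opcode names per method; the function name `opcode_ngram_counts`, the sample opcodes, and the use of raw counts for $f(a, o)$ (rather than normalised frequencies) are assumptions made for the example, not details taken from the original method.

```python
from collections import Counter


def opcode_ngram_counts(methods, n):
    """Count every ngram of consecutive opcodes, one method at a time."""
    counts = Counter()
    for opcodes in methods:
        # ngrams never span two methods: each method is its own sequence
        for i in range(len(opcodes) - n + 1):
            counts[tuple(opcodes[i:i + n])] += 1
    return counts


# Hypothetical app with two methods (opcodes only, parameters dropped)
app = [
    ["const/4", "invoke-virtual", "move-result", "return"],
    ["const/4", "invoke-virtual", "move-result", "if-eqz", "return-void"],
]
features = opcode_ngram_counts(app, n=3)
```

Keeping the counts in a `Counter` keyed by opcode tuples makes it straightforward to look up only the selected ngrams later, when the final feature vector is assembled.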

In the learning phase, a feature selection procedure is performed, since the feature vector obtained through the pre-processing phase may be remarkably large. This is done as follows. For each ngram $o$, its global frequencies relative to $A_T$ (set of trusted apps) and $A_M$ (set of malware apps) are computed:

$$f_T(o) = \sum_{a \in A_T} f(a, o) \tag{1}$$

$$f_M(o) = \sum_{a \in A_M} f(a, o) \tag{2}$$

The relative difference $d(o)$ is obtained as:

$$d(o) = \frac{\left| f_T(o) - f_M(o) \right|}{\max\left(f_T(o), f_M(o)\right)} \tag{3}$$

The set $O$ of the selected ngrams (and hence the corresponding features) is built to include the $k$ ngrams with the highest values of $d(o)$, where $k$ is a parameter of the method. The ngrams for which $d(o) = 1$ (i.e., those ngrams which occur only in $A_T$ and not in $A_M$ or vice versa) are not considered, to avoid obtaining a classifier that fails to generalize. Furthermore, all the ngrams in $O$ that are subsequences of another ngram in $O$ are discarded, so that redundant information is removed. Finally, only the $k' < k$ ngrams in $O$ with the highest $d(o)$ are retained, where $k'$ is a parameter of the method. At the end of the learning phase, a binary classifier based on Support Vector Machine (SVM) with a Gaussian kernel and a cost $c = 1$ is learned on the dataset deriving from $A_T$, $A_M$ and the features determined by $O$.
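The selection procedure can be illustrated with the following minimal Python sketch. It takes the global frequencies of Eqs. (1) and (2) as dictionaries and applies Eq. (3). Several points are assumptions of the sketch rather than statements about the original implementation: ngrams with $d(o) = 1$ are dropped before the top $k$ are taken, "subsequence" is read as a contiguous run, ties are broken by the ngram itself, and all function names are hypothetical.

```python
def relative_difference(f_trusted, f_malware):
    """d(o) from Eq. (3): 0 = equally frequent, 1 = exclusive to one class."""
    return abs(f_trusted - f_malware) / max(f_trusted, f_malware)


def is_contiguous_subsequence(short, long):
    """True if `short` appears as a contiguous run inside `long`."""
    m = len(short)
    return any(long[i:i + m] == short for i in range(len(long) - m + 1))


def select_ngrams(freq_trusted, freq_malware, k, k_prime):
    scores = {}
    for o in set(freq_trusted) | set(freq_malware):
        f_t = freq_trusted.get(o, 0)
        f_m = freq_malware.get(o, 0)
        if max(f_t, f_m) == 0:
            continue  # never observed: d(o) is undefined
        d = relative_difference(f_t, f_m)
        if d < 1:  # drop ngrams seen in only one of the two classes
            scores[o] = d
    # Highest d(o) first; ties broken by the ngram itself for determinism
    ranked = sorted(scores, key=lambda o: (-scores[o], o))[:k]
    # Discard ngrams contained in a different selected ngram
    kept = [o for o in ranked
            if not any(o != p and is_contiguous_subsequence(o, p) for p in ranked)]
    return kept[:k_prime]


# Hypothetical global frequencies f_T(o) and f_M(o)
freq_trusted = {("move", "return"): 10, ("const", "move"): 4,
                ("if-eqz", "goto"): 5, ("nop", "nop"): 3}
freq_malware = {("move", "return"): 2, ("const", "move"): 4,
                ("if-eqz", "goto"): 2, ("invoke-virtual", "invoke-virtual"): 7}
selected = select_ngrams(freq_trusted, freq_malware, k=3, k_prime=2)
```

Note that the subsequence filter only has an effect when ngrams of different lengths are compared; with a single fixed $n$, no ngram can be contained in a different one.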

In the classification phase, the feature vector for the input app $a$ is first computed considering the frequencies of the ngrams in $O$; then, it is given as input to the trained SVM, which outputs a response in {malware, trusted}.

3.2 Dynamic Analysis

In the pre-processing, the system calls invoked by the app $a$ during the execution are recorded, producing an execution trace. Then the feature vector is extracted by calculating the frequency over $a$ of each possible ngram of system calls (without the call arguments), where $n$ is a parameter of the method.
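A minimal sketch of this step is given below. The trace format (one `name(arguments) = result` line per call), the sample trace, and the function names are assumptions made for illustration only; the source does not specify how traces are recorded or stored.

```python
import re
from collections import Counter

# Assumed line format: name(arguments) = result
SYSCALL_NAME = re.compile(r"^\s*(\w+)\(")


def syscall_names(trace_lines):
    """Keep only the system call names, dropping arguments and results."""
    names = []
    for line in trace_lines:
        match = SYSCALL_NAME.match(line)
        if match:
            names.append(match.group(1))
    return names


def syscall_ngram_counts(trace_lines, n):
    names = syscall_names(trace_lines)
    return Counter(tuple(names[i:i + n]) for i in range(len(names) - n + 1))


# Hypothetical execution trace
trace = [
    'open("/data/app/example.apk", O_RDONLY) = 3',
    'read(3, "data", 4096) = 4096',
    'close(3) = 0',
    'open("/proc/cpuinfo", O_RDONLY) = 3',
    'read(3, "data", 4096) = 512',
    'close(3) = 0',
]
trace_features = syscall_ngram_counts(trace, n=3)
```

Because the arguments are discarded, the two open/read/close runs above contribute to the same ngram even though they touch different files.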

As in the static case, the learning phase starts with a feature selection procedure. To reduce the number of features, only the $k$ ngrams with the greatest $\delta_s$ are selected, with:

$$\delta_s = \frac{\left| \sum_{a \in A_T} f(a, s) - \sum_{a \in A_M} f(a, s) \right|}{\max_{a \in A} f(a, s)}$$

where $s$ is an ngram of system calls, $A$ is the union of $A_T$ (set of trusted apps) and $A_M$ (set of malware apps), and $k$ is a parameter of the method. The number of features is further reduced by computing, for each remaining $s$, the mutual information $I_s$ of $f(a, s)$ with the label of $a$ for any $a \in A$ and retaining the $k'$ features with the highest $I_s$, with $k'$ being a parameter of the method, resulting in a set $S$ of selected ngrams. At the end of the learning phase, a binary classifier based on Support Vector Machine (SVM) with a Gaussian kernel and a cost $c = 1$ is learned on the dataset deriving from $A_T$, $A_M$ and the features determined by $S$.
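The two selection scores can be sketched as follows for a single ngram $s$. The sketch assumes that the frequency of $s$ is reduced to presence or absence before the mutual information is estimated; the source does not state how $f(a, s)$ is discretised, so this choice, the function names, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def delta(trusted_freqs, malware_freqs):
    """delta_s: absolute difference of the class sums over the largest value."""
    largest = max(trusted_freqs + malware_freqs)
    if largest == 0:
        return 0.0
    return abs(sum(trusted_freqs) - sum(malware_freqs)) / largest


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


# Hypothetical frequencies f(a, s) of one ngram s over four apps per class
trusted_freqs = [0, 1, 0, 0]
malware_freqs = [3, 2, 4, 0]
labels = ["trusted"] * 4 + ["malware"] * 4

delta_s = delta(trusted_freqs, malware_freqs)
# Assumption: binarise the frequency (present / absent) before estimating I_s
present = [f > 0 for f in trusted_freqs + malware_freqs]
info_s = mutual_information(present, labels)
```

In a full pipeline, `delta_s` would be computed for every ngram to keep the top $k$, and `info_s` would then rank the survivors to keep the top $k'$.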

In the classification phase, the previously selected features are extracted from the apps in the unlabelled dataset, to which the trained classifier is applied, yielding a response label in {malware, trusted}.

4 EXPERIMENTAL EVALUATION

4.1 Data

We built a dataset of 7000 applications evenly divided between trusted ($A_T$) and malware ($A_M$). In particular, we took a subset of the dataset used in (Canfora et al., 2015a), in which trusted apps were collected from Google Play and malware apps from the Drebin dataset (Arp et al., 2014). Furthermore, we built a set $A_O$ of obfuscated malware apps by applying to each of the apps in $A_M$ all the obfuscation techniques described in the next section.

For the dynamic analysis detection, we executed each app on a real Android device for at most 1 minute, during which a tool simulated random UI interactions.

4.2 The Obfuscation Techniques

Android runs Dalvik executables stored in .dex files. In order to apply transformations to the application code, we obtained the smali representation of the code (a human-readable form of Dalvik bytecode) using apktool (http://ibotpeaches.github.io/Apktool/), a reverse engineering tool which allows Android applications to be decompiled and recompiled. apktool is able to decode resources to nearly original form and to rebuild them after modifications. The smali representation is the target of the transformations we considered.

We designed, implemented, and publicly released (https://github.com/faber03/AndroidMalwareEvaluatingTools) a Java tool able to apply a set of code modifications to the smali representation in an automated way.

We applied all the following morphing techniques:

1. Disassembling & Reassembling. The compiled Dalvik bytecode in classes.dex of the application package may be disassembled and reassembled through apktool. This allows various items in a .dex file to be represented in a different way, so that signatures relying on the order of the items in the .dex file will likely be ineffective against this transformation.

2. Repacking. Every Android application contains a developer signature key that is lost after disassembling and reassembling the application. We use the signapk tool (https://code.google.com/p/signapk/) to embed a new signature key in the reassembled app, which evades detection signatures that match the developer keys.

3. Changing Package Name. Each Android application is identified by a unique package name. This transformation renames the application package in both the Android Manifest and all the classes of the app, to elude detection by signatures based on the package name.

4. Identifier Renaming. To avoid detection signatures relying on identifier names, this transformation renames each package and class by using a random string generator, in both the Android Manifest and the smali classes, and updates the invocations of the renamed classes accordingly.

5. Data Encoding. The .dex files contain all the strings and arrays used in the code. Strings could be used to create detection signatures to identify malware. To elude such signatures, this transformation encodes strings with a Caesar cipher with a fixed key equal to 3. This technique is also applied to the code of so-called metamorphic malware (Borello and Mé, 2008; Canfora et al., 2014). The original strings are restored at application run-time.

6. Call Indirections. Some detection signatures could exploit the call graph of the application. To evade such signatures, we designed a transformation which mutates the original call graph by replacing every method invocation in the smali code with a call to a new method, inserted by the transformation, which simply invokes the original method.

7. Code Reordering. This transformation modifies the order of the instructions in smali methods. The instructions are randomly reordered and goto instructions are inserted with the aim of preserving the original runtime execution trace. Since the reordering is random, this is considered the strongest obfuscation technique against the signatures used by current antimalware technologies (You and Yim, 2010). The transformation was applied only to methods that do not contain any type of jump (i.e., if, switch, recursive calls).

8. Junk Code Insertion. These transformations introduce code sequences that have no effect on the business logic of applications. This is considered a weak technique: antimalware technologies are usually able to identify samples obfuscated only with this technique (Collberg et al., 2003). The transformation provides three different junk code insertions: (i) insertion of nop instructions into each method, (ii) insertion of unconditional jumps into each method, and (iii) allocation of three additional registers on which garbage operations are performed.
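To make one of the transformations concrete, the Data Encoding technique (item 5) can be sketched as a Caesar shift with key 3. The sketch is written in Python for brevity, whereas the released tool is written in Java and operates on smali code; the choice of shifting only printable ASCII characters, the function name, and the sample string are assumptions, since the source does not specify the alphabet used.

```python
KEY = 3  # fixed key stated for the Data Encoding transformation


def caesar_shift(text, key):
    """Rotate printable ASCII characters (codes 32..126) by `key` positions."""
    shifted = []
    for ch in text:
        code = ord(ch)
        if 32 <= code <= 126:
            # Assumption: wrap around inside the 95 printable ASCII characters
            code = 32 + (code - 32 + key) % 95
        shifted.append(chr(code))
    return "".join(shifted)


# Hypothetical string constant found in an app
original = "http://example.com/payload"
encoded = caesar_shift(original, KEY)    # what would be stored in the .dex file
decoded = caesar_shift(encoded, -KEY)    # what the app restores at run-time
```

The encoded constant no longer matches a signature built on the original string, while decoding at run-time leaves the behaviour of the app unchanged.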

4.3 Procedure and Results

We performed a 10-fold cross validation, i.e., we: (i) randomly split each of the sets $A_T$ and $A_M$ into 10 partitions; (ii) built the learning sets $A_T^L$ and $A_M^L$ by including 9 of the 10 partitions of $A_T$ and $A_M$, respectively; (iii) performed the learning phase on $A_T^L$ and $A_M^L$ as described in Sections 3.1 and 3.2; (iv) applied the learned classifier to each $a \in A_T \cup A_M$ not in $A_T^L \cup A_M^L$ (i.e., to each app in the testing set).

We repeated steps (ii), (iii), and (iv) 10 times by varying the excluded partition. For the dynamic case, we collected 10 execution traces for each app (used both in the learning and classification phases, with traces for the same app randomly distributed in the learning and testing sets) in order to mitigate the impact of fortunate and unfortunate conditions during the execution. We set $n = 3$, $k = 5000$, and $k' = 2000$ for the static case and $n = 3$, $k = 2000$, and $k' = 750$ for the dynamic case, based on the results of the two corresponding original papers.
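The partitioning scheme can be sketched as follows. This is a generic 10-fold split written for illustration; the function name, the fixed seed, and the use of integers in place of apps are assumptions. In the procedure above, the same split would be applied separately to the trusted set and to the malware set.

```python
import random


def ten_fold_splits(apps, folds=10, seed=0):
    """Yield (learning, testing) pairs, holding out one partition at a time."""
    shuffled = list(apps)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled apps into `folds` partitions of (almost) equal size
    partitions = [shuffled[i::folds] for i in range(folds)]
    for held_out in range(folds):
        learning = [a for i, part in enumerate(partitions)
                    if i != held_out for a in part]
        yield learning, partitions[held_out]


# Hypothetical set of 50 apps, identified by integers
splits = list(ten_fold_splits(range(50)))
```

Each app ends up in the testing set exactly once across the 10 repetitions, so every app is classified by a model that never saw it during learning.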

We measured the classification effectiveness in terms of Accuracy, i.e., the percentage of correctly classified apps, False Positive Rate (FPR), i.e., the percentage of trusted apps classified as malware, and False Negative Rate (FNR), i.e., the percentage of malware apps classified as trusted. All the results are shown in Table 1: FNR is reported as $\mathrm{FNR}_{\neg O}$ and $\mathrm{FNR}_{O}$, i.e., measured on non-obfuscated malware apps ($a \in A_M \setminus A_O$) and obfuscated malware apps ($a \in A_O$), respectively. Figure 1 shows the True Positive Rate (TPR) and True Negative Rate (TNR) for each of the 10 repetitions; TPR is the average of $\mathrm{TPR}_{\neg O}$ and $\mathrm{TPR}_{O}$ obtained in the repetition.
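The three indexes follow directly from their definitions, as the short sketch below shows; the function name and the toy labels are hypothetical and are not taken from the experiments.

```python
def detection_rates(labels, predictions):
    """Return (accuracy, FPR, FNR) as percentages."""
    pairs = list(zip(labels, predictions))
    on_trusted = [p for y, p in pairs if y == "trusted"]
    on_malware = [p for y, p in pairs if y == "malware"]
    accuracy = 100 * sum(y == p for y, p in pairs) / len(pairs)
    fpr = 100 * on_trusted.count("malware") / len(on_trusted)  # trusted flagged as malware
    fnr = 100 * on_malware.count("trusted") / len(on_malware)  # malware missed
    return accuracy, fpr, fnr


# Toy example: 4 trusted and 4 malware apps
labels = ["trusted"] * 4 + ["malware"] * 4
predictions = ["trusted", "trusted", "trusted", "malware",
               "malware", "malware", "trusted", "trusted"]
accuracy, fpr, fnr = detection_rates(labels, predictions)
```

Computing FNR separately on the non-obfuscated and on the obfuscated malware apps amounts to calling the same function on the two corresponding subsets of the testing set.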

Table 1: Mean µ and standard deviation σ (in %) of FPR and FNR across the 10 repetitions in the two learning scenarios: without (w/o, above) or with (below) obfuscated malware apps in the training set.

| Obfuscated malware in training | Method | FPR µ | FPR σ | FNR_¬O µ | FNR_¬O σ | FNR_O µ | FNR_O σ |
|---|---|---|---|---|---|---|---|
| w/o | Static | 3.7 | 1.3 | 2.6 | 0.8 | 89.8 | 0.2 |
| w/o | Dyn. | 9.9 | 1.0 | 5.8 | 1.4 | 7.5 | 0.3 |
| with | Static | 6.6 | 1.8 | 0.6 | 0.1 | 0.1 | 0.1 |
| with | Dyn. | 10.7 | 1.0 | 3.2 | 0.2 | 4.5 | 0.2 |

[Figure 1 image not available: panels for Dynamic and Static analysis, without (W/O) and with obfuscated malware in the training set; one axis per repetition (1 to 10), scale from 0.2 to 0.8; series TPR and TNR.]

Figure 1: Effectiveness for each of the 10 repetitions in terms of TPR and TNR of both static and dynamic analysis. Results are shown in the case with and without obfuscated malware in the training set.

It can be seen from Table 1 that both methods (i.e., static analysis-based and dynamic analysis-based detection) are effective in classifying non-obfuscated apps, FPR and $\mathrm{FNR}_{\neg O}$ being lower than 10%. Static analysis is indeed more accurate, with an FPR < 4% and $\mathrm{FNR}_{\neg O}$ < 3%, whereas dynamic analysis scores ≈ 10% and 6% respectively: the accuracy of the latter is negatively affected by the variability of executions, which essentially results in noisy data. These figures are consistent with the results of Canfora et al. (2015a) and Canfora et al. (2015c).

The most interesting finding concerns, however, the impact of obfuscation on malware detection. By observing the difference between $\mathrm{FNR}_{\neg O}$ and $\mathrm{FNR}_{O}$ in the two topmost rows of Table 1, it can be seen that the effectiveness of static analysis-based detection is severely affected by obfuscation, whereas dynamic analysis-based effectiveness is not significantly affected (7.5% against 5.8%). For the static method, $\mathrm{FNR}_{O} \approx 90\%$, i.e., 9 out of 10 obfuscated malware apps are wrongly classified as trusted. This can be explained by the fact that the obfuscation techniques applied in this study heavily modify the frequency and order of the opcodes in an app, especially in the case of Call Indirections, Code Reordering, and Junk Code Insertion. This leads to a completely different distribution of ngrams that is no longer recognized by the static classifier. Instead, the execution traces of an obfuscated app are very similar to those of its non-obfuscated counterpart, therefore the dynamic classifier is not influenced by obfuscation. In essence, this experiment confirms the high-level intuition that dynamic analysis-based detection is much more robust to code obfuscation than static analysis-based detection.
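The effect on opcode ngrams can be seen on a toy example. The sketch below inserts a nop after every instruction of a hypothetical method, which is one possible (assumed) way of realising the nop insertion of the Junk Code Insertion technique, and compares the sets of 3-grams before and after the transformation.

```python
def ngram_set(opcodes, n=3):
    """Set of distinct ngrams of consecutive opcodes."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def insert_nops(opcodes):
    """Assumed junk insertion: one nop after every original instruction."""
    morphed = []
    for op in opcodes:
        morphed.extend([op, "nop"])
    return morphed


# Hypothetical opcode sequence of a single method
original = ["const/4", "invoke-static", "move-result", "if-eqz", "return"]
morphed = insert_nops(original)

# 3-grams that survive the transformation
shared = ngram_set(original) & ngram_set(morphed)
```

Every 3-gram of the morphed method contains at least one nop, so none of the original 3-grams survives: a classifier whose features are the original ngrams sees an entirely different app, even though the executed behaviour is unchanged.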

4.3.1 Learning on Obfuscated Malware

Based on the results of our first experiment, we decided to investigate whether the poor robustness to obfuscation of static analysis-based detection can be mitigated. In other words, we tried to address the high-level research question: are features based on ngrams of opcodes able to capture the essence of malware even in case of obfuscation? To answer this question experimentally, we modified the experimental procedure such that the learning set $A_M^L$ consists of an equal number of apps from the set $A_M$ of non-obfuscated malware apps and from the set $A_O$ of obfuscated malware apps; again, apps used for learning are never used for assessing classification effectiveness.

Table 1 presents, in the two bottom rows, the results in terms of FPR, $\mathrm{FNR}_{\neg O}$, and $\mathrm{FNR}_{O}$ of the two methods with the obfuscated malware apps in the learning set.

It can be seen that simply making obfuscated malware available to the learning process makes static analysis-based detection clearly robust to obfuscation: $\mathrm{FNR}_{\neg O}$ and $\mathrm{FNR}_{O}$ are both very low (0.6% and 0.1%, respectively), whereas FPR is only slightly higher than in the case of learning on non-obfuscated malware only (6.6% against 3.7%). In other words, features based on ngrams of opcodes are adequate for capturing the essence of malware regardless of obfuscation, but samples of obfuscated malware must be available for the learning phase.

Concerning the dynamic method, effectiveness indexes deviate only moderately, with lower FNR (3.2% and 4.5% against 5.8% and 7.5%) and higher FPR (10.7% against 9.9%).

5 CONCLUSION AND FUTURE WORK

In this work, we compared the robustness to code obfuscation of two different malware detection methods, based on Machine Learning techniques applied to features derived from static analysis (opcodes in machine-level app code) and dynamic analysis (system calls in the app execution trace). The underlying assumption is that obfuscating the code of an app should leave its execution trace almost unchanged, making a dynamic classifier robust to obfuscation, but should completely change the sequence of opcodes derived from its code, making a static classifier totally ineffective. We experimentally validated this assumption by applying two state-of-the-art methods to legitimate apps, malware apps, and malware apps subjected to 8 different code morphing techniques: results show that static analysis-based detection, when trained only on non-obfuscated samples, is essentially ineffective on obfuscated malware. We also showed that static detection may be made robust to obfuscation by making obfuscated malware apps available for learning. In the future, we plan to study if and to what degree static and dynamic detection are able to correctly classify apps subjected to new code morphing techniques, i.e., techniques for which no samples were available in the learning phase.

ACKNOWLEDGEMENTS

This work has been partially supported by H2020

EU-funded projects NeCS and C3ISP and EIT-Digital

Project HII.

REFERENCES

Arp, D., Spreitzenbarth, M., Hübner, M., Gascon, H., Rieck, K., and Siemens, C. (2014). Drebin: Effective and explainable detection of android malware in your pocket. In NDSS.

Backes, M. and Nauman, M. (2017). Luna: Quantifying and leveraging uncertainty in android malware analysis through bayesian machine learning. In Security and Privacy (EuroS&P), 2017 IEEE European Symposium on, pages 204–217. IEEE.

Borello, J.-M. and Mé, L. (2008). Code obfuscation techniques for metamorphic viruses. Journal in Computer Virology, 4(3):211–220.

Canfora, G., De Lorenzo, A., Medvet, E., Mercaldo, F., and Visaggio, C. A. (2015a). Effectiveness of opcode ngrams for detection of multi family android malware. In Availability, Reliability and Security (ARES), 2015 10th International Conference on, pages 333–340. IEEE.

Canfora, G., Di Sorbo, A., Mercaldo, F., and Visaggio, C. A. (2015b). Obfuscation techniques against signature-based detection: a case study. In Mobile Systems Technologies Workshop (MST), 2015, pages 21–26. IEEE.

Canfora, G., Medvet, E., Mercaldo, F., and Visaggio, C. A. (2015c). Detecting android malware using sequences of system calls. In Proceedings of the 3rd International Workshop on Software Development Lifecycle for Mobile, pages 13–20. ACM.

Canfora, G., Mercaldo, F., Visaggio, C. A., and Di Notte, P. (2014). Metamorphic malware detection using code metrics. Information Security Journal: A Global Perspective, 23(3):57–67.

Cimitile, A., Martinelli, F., Mercaldo, F., Nardone, V., and Santone, A. (2017). Formal methods meet mobile code obfuscation identification of code reordering technique. In Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2017 IEEE 26th International Conference on, pages 263–268. IEEE.

Collberg, C. S., Thomborson, C. D., and Low, D. W. K. (2003). Obfuscation techniques for enhancing software security. US Patent 6,668,325.

Demontis, A., Melis, M., Biggio, B., Maiorca, D., Arp, D., Rieck, K., Corona, I., Giacinto, G., and Roli, F. (2017). Yes, machine learning can be more secure! a case study on android malware detection. IEEE Transactions on Dependable and Secure Computing.

Ferrante, A., Medvet, E., Mercaldo, F., Milosevic, J., and Visaggio, C. A. (2016). Spotting the malicious moment: Characterizing malware behavior using dynamic features. In Availability, Reliability and Security (ARES), 2016 11th International Conference on, pages 372–381. IEEE.

Garcia, J., Hammad, M., Pedrood, B., Bagheri-Khaligh, A., and Malek, S. (2015). Obfuscation-resilient, efficient, and accurate detection and family identification of android malware. Department of Computer Science, George Mason University, Tech. Rep.

Martinelli, F., Marulli, F., and Mercaldo, F. (2017). Evaluating convolutional neural network for effective mobile malware detection. Procedia Computer Science, 112(C):2372–2381.

Medvet, E. and Mercaldo, F. (2016). Exploring the usage of topic modeling for android malware static analysis. In Availability, Reliability and Security (ARES), 2016 11th International Conference on, pages 609–617. IEEE.

O'Kane, P., Sezer, S., and McLaughlin, K. (2016). Detecting obfuscated malware using reduced opcode set and optimised runtime trace. Security Informatics, 5(1):1–12.

Ramachandran, R., Oh, T., and Stackpole, W. (2012). Android anti-virus analysis. In Annual Symposium on Information Assurance & Secure Knowledge Management, pages 35–40.

Rastogi, V., Chen, Y., and Jiang, X. (2013a). Droidchameleon: evaluating android anti-malware against transformation attacks. In Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security, pages 329–334. ACM.

Rastogi, V., Chen, Y., and Jiang, X. (2013b). Droidchameleon: evaluating android anti-malware against transformation attacks. In ACM Symposium on Information, Computer and Communications Security, pages 329–334.

Rastogi, V., Chen, Y., and Jiang, X. (2014). Catch me if you can: Evaluating android anti-malware against transformation attacks. IEEE Transactions on Information Forensics and Security, 9(1):99–108.

Suarez-Tangil, G., Dash, S. K., Ahmadi, M., Kinder, J., Giacinto, G., Cavallaro, L., Dash, S. K., Suarez-Tangil, G., Khan, S., Tam, K., et al. (2016a). Droidsieve: Fast and accurate classification of obfuscated android malware. In Proc. 5th ACM Conf. Data and Application Security, volume 7148, pages 43–50. IEEE.

Suarez-Tangil, G., Tapiador, J. E., Lombardi, F., and Di Pietro, R. (2016b). Alterdroid: differential fault analysis of obfuscated smartphone malware. IEEE Transactions on Mobile Computing, 15(4):789–802.

Tam, K., Feizollah, A., Anuar, N. B., Salleh, R., and Cavallaro, L. (2017). The evolution of android malware and android analysis techniques. ACM Comput. Surv., 49(4):76:1–76:41.

Xue, Y., Meng, G., Liu, Y., Tan, T. H., Chen, H., Sun, J., and Zhang, J. (2017). Auditing anti-malware tools by evolving android malware and dynamic loading technique. IEEE Transactions on Information Forensics and Security.

You, I. and Yim, K. (2010). Malware obfuscation techniques: A brief survey. In Broadband, Wireless Computing, Communication and Applications (BWCCA), 2010 International Conference on, pages 297–300. IEEE.

Zheng, M., Lee, P. P. C., and Lui, J. C. S. (2013). Adam: An automatic and extensible platform to stress test android anti-virus systems. In Proceedings of the 9th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, DIMVA'12, pages 82–101.