How I Met Your Mother? - An Empirical Study about Android Malware Phylogenesis

Gerardo Canfora, Francesco Mercaldo, Antonio Pirozzi, Corrado Aaron Visaggio


Android malware is becoming increasingly aggressive, both in its impact on the victim's device and in its ability to evade detection. Attackers target not only smartphones and the sensitive information they hold, but also devices such as watches, glasses, and anything else that can be connected to the Internet of Things. Current signature-based antimalware and anomaly-based detection are unable to detect zero-day attacks: even trivial code transformations can defeat detection. New malware is often not really new: malware writers typically add functionality to existing malware or merge different pieces of existing malware code. This gives rise to families of Android malware, i.e., malware programs that share some essential features or behaviors and differ in other parts. Recognizing the family a malware sample belongs to is useful for malware analysis, fast infection response, and quick incident resolution. In this paper we introduce DescentDroid, a tool that traces a malware sample back to the family it descends from. We evaluate our technique on an extended dataset comprising malware and trusted applications, obtaining high precision in recognizing malware family membership.



Paper Citation

in Harvard Style

Canfora G., Mercaldo F., Pirozzi A. and Visaggio C. (2016). How I Met Your Mother? - An Empirical Study about Android Malware Phylogenesis. In Proceedings of the 13th International Joint Conference on e-Business and Telecommunications - Volume 4: SECRYPT, (ICETE 2016) ISBN 978-989-758-196-0, pages 310-317. DOI: 10.5220/0005968103100317

in Bibtex Style

author={Gerardo Canfora and Francesco Mercaldo and Antonio Pirozzi and Corrado Aaron Visaggio},
title={How I Met Your Mother? - An Empirical Study about Android Malware Phylogenesis},
booktitle={Proceedings of the 13th International Joint Conference on e-Business and Telecommunications - Volume 4: SECRYPT, (ICETE 2016)},

in EndNote Style

JO - Proceedings of the 13th International Joint Conference on e-Business and Telecommunications - Volume 4: SECRYPT, (ICETE 2016)
TI - How I Met Your Mother? - An Empirical Study about Android Malware Phylogenesis
SN - 978-989-758-196-0
AU - Canfora G.
AU - Mercaldo F.
AU - Pirozzi A.
AU - Visaggio C.
PY - 2016
SP - 310
EP - 317
DO - 10.5220/0005968103100317