Regular Paper

Feature Selection for Malware Detection on the Android Platform Based on Differences of IDF Values

Department of Computer Engineering, Yaşar University, İzmir 35100, Turkey
Department of Computer Science, Dokuz Eylül University, İzmir 35390, Turkey
Department of Software Engineering, Yaşar University, İzmir 35100, Turkey

Abstract

Android is the mobile operating system most frequently targeted by malware in the smartphone ecosystem; its market share is significantly higher than that of its competitors, and its total number of applications is much larger. Detecting malware before it is published on official or unofficial application markets is critically important because inadequate security practices are widespread among typical end users. In this paper, a novel feature selection method is proposed along with an Android malware detection approach. The method uses permissions, API calls, and strings as features, all of which can be statically extracted from Android executables (APK files), and it can be used in a machine learning process with different algorithms to detect malware on the Android platform. A novel document frequency-based approach, namely Delta IDF, was designed and implemented for feature selection. Delta IDF was tested on three universal benchmark datasets that contain Android malware samples, and highly promising results were obtained with several binary classification algorithms.
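The abstract does not give the Delta IDF formula, so the following is only an illustrative sketch of what a feature score based on differences of IDF values might look like. It assumes that each app is represented as a set of statically extracted feature names, that IDF is computed separately over the benign and the malware collections with add-one smoothing, and that a feature's score is the absolute difference of those two values. The function names (`idf`, `delta_idf_scores`, `select_top_k`), the smoothing, and the toy feature sets are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Set


def idf(feature: str, apps: List[Set[str]]) -> float:
    """Smoothed inverse document frequency of a feature over a list of apps.

    Assumption: add-one smoothing so that a feature absent from a class
    still gets a finite value. The paper may define IDF differently.
    """
    df = sum(1 for features in apps if feature in features)
    return math.log((1 + len(apps)) / (1 + df))


def delta_idf_scores(benign: List[Set[str]],
                     malware: List[Set[str]]) -> Dict[str, float]:
    """Score each feature by |IDF_benign - IDF_malware| (assumed form)."""
    vocabulary = set().union(*benign, *malware)
    return {f: abs(idf(f, benign) - idf(f, malware)) for f in vocabulary}


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring features, ties broken alphabetically."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


# Toy data: each set stands for the features extracted from one APK.
benign_apps = [{"INTERNET"}, {"INTERNET", "CAMERA"}, {"INTERNET"}]
malware_apps = [{"INTERNET", "SEND_SMS"},
                {"INTERNET", "SEND_SMS"},
                {"INTERNET", "SEND_SMS", "CAMERA"}]

scores = delta_idf_scores(benign_apps, malware_apps)
selected = select_top_k(scores, 1)
```

In this toy data, `INTERNET` appears in every app of both classes and `CAMERA` appears equally often in each, so both score zero, while `SEND_SMS` appears only in the malware set and is the one feature selected. The selected features could then be passed to any binary classifier, which matches the abstract's statement that the method works with different algorithms.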


Journal of Computer Science and Technology
Pages 946-962
Cite this article:
Peynirci G, Eminağaoğlu M, Karabulut K. Feature Selection for Malware Detection on the Android Platform Based on Differences of IDF Values. Journal of Computer Science and Technology, 2020, 35(4): 946-962. https://doi.org/10.1007/s11390-020-9323-x


Received: 21 December 2018
Revised: 13 March 2020
Published: 27 July 2020
©Institute of Computing Technology, Chinese Academy of Sciences 2020