Applying Big Data Based Deep Learning System to Intrusion Detection

Wei Zhong; Ning Yu; Chunyu Ai

doi:10.26599/BDMA.2020.9020003

Open Access

∙ Division of Math and Computer Science, University of South Carolina Upstate, Spartanburg, SC 29303, USA.

∙ Department of Computing Sciences, State University of New York College at Brockport, Brockport, NY 14420, USA.

Abstract

With vast amounts of data generated daily and the ever-increasing interconnectivity of the world’s internet infrastructure, machine learning based Intrusion Detection Systems (IDSs) have become a vital component in protecting our economic and national security. Previous shallow learning and deep learning strategies adopt a single learning model for intrusion detection. A single learning model may struggle to capture the increasingly complicated data distributions of intrusion patterns. In particular, a single deep learning model may not effectively capture the unique patterns of intrusive attacks that have only a small number of samples. To further enhance the performance of machine learning based IDSs, we propose the Big Data based Hierarchical Deep Learning System (BDHDLS). BDHDLS uses behavioral features and content features to understand both network traffic characteristics and the information stored in the payload. Each deep learning model in BDHDLS concentrates on learning the unique data distribution of one cluster. This strategy can increase the detection rate of intrusive attacks compared with previous single learning model approaches. Based on a parallel training strategy and big data techniques, the model construction time of BDHDLS is reduced substantially when multiple machines are deployed.
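The idea of training one model per cluster can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single level of k-means clustering with a deterministic farthest-point initialization, and it swaps each per-cluster deep learning model for a simple nearest-class-mean classifier so that the example stays self-contained. All function names, class names, and the toy data are hypothetical.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))


def kmeans(points, k, iters=10):
    """Plain k-means with farthest-point initialization (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(groups[i]) if groups[i] else centroids[i]
                     for i in range(k)]
    return centroids


class NearestClassMean:
    """Stand-in for one per-cluster deep model (hypothetical substitute)."""

    def fit(self, X, y):
        by_label = defaultdict(list)
        for x, label in zip(X, y):
            by_label[label].append(x)
        self.means = {label: mean(rows) for label, rows in by_label.items()}
        return self

    def predict(self, x):
        return min(self.means, key=lambda label: sq_dist(x, self.means[label]))


def train_system(X, y, k):
    """Cluster the data, then fit one model on the samples of each cluster."""
    centroids = kmeans(X, k)
    members = defaultdict(list)
    for x, label in zip(X, y):
        members[nearest(x, centroids)].append((x, label))
    models = {}
    for cid, rows in members.items():
        xs, labels = zip(*rows)
        models[cid] = NearestClassMean().fit(xs, labels)
    return centroids, models


def predict(system, x):
    """Route x to its nearest cluster and ask that cluster's model."""
    centroids, models = system
    cid = min(models, key=lambda i: sq_dist(x, centroids[i]))
    return models[cid].predict(x)


# Toy data: two traffic regimes, each with its own benign/attack pattern.
X = [(0, 0), (0, 1), (1, 0), (0, 3), (1, 3),
     (10, 10), (11, 10), (10, 13), (11, 13)]
y = ["benign", "benign", "benign", "attack", "attack",
     "benign", "benign", "attack", "attack"]

system = train_system(X, y, k=2)
print(predict(system, (0.5, 2.9)), predict(system, (10.6, 10.1)))
```

Because each model only sees the samples routed to its cluster, it can fit the local class boundary of that cluster rather than one global boundary, which is the intuition the abstract gives for the per-cluster design. The per-cluster fits are independent of one another, so in principle they could be distributed across machines.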

Keywords

intrusion detection; deep learning; convolution neural network; fully connected feedforward neural network; multi-level clustering algorithm

Big Data Mining and Analytics

Volume 3, Issue 3, September 2020

Pages 181-195

DOI: 10.26599/BDMA.2020.9020003

Cite this article:

Zhong W, Yu N, Ai C. Applying Big Data Based Deep Learning System to Intrusion Detection. Big Data Mining and Analytics, 2020, 3(3): 181-195. https://doi.org/10.26599/BDMA.2020.9020003