With vast amounts of data generated daily and the ever-increasing interconnectivity of the world’s internet infrastructure, machine learning based Intrusion Detection Systems (IDS) have become a vital component in protecting our economic and national security. Previous shallow learning and deep learning strategies adopt a single learning model approach to intrusion detection. This approach may struggle to capture the increasingly complicated data distributions of intrusion patterns. In particular, a single deep learning model may not be effective at capturing the unique patterns of intrusive attacks that have only a small number of samples. To further enhance the performance of machine learning based IDS, we propose the Big Data based Hierarchical Deep Learning System (BDHDLS). BDHDLS utilizes behavioral features and content features to understand both network traffic characteristics and the information stored in the payload. Each deep learning model in BDHDLS concentrates on learning the unique data distribution of one cluster. This strategy can increase the detection rate of intrusive attacks compared with previous single learning model approaches. Based on a parallel training strategy and big data techniques, the model construction time of BDHDLS is reduced substantially when multiple machines are deployed.
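The one-model-per-cluster idea can be illustrated with a minimal sketch. It is not the BDHDLS implementation: the class names `ClusterModel` and `ClusteredDetector` are hypothetical, the cluster centroids are assumed to be given, and a 1-nearest-neighbour rule stands in for each deep learning model so that the example stays self-contained.

```python
from math import dist


def nearest(point, candidates):
    """Return the index of the candidate closest to point (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: dist(point, candidates[i]))


class ClusterModel:
    """Stand-in for one per-cluster learner (assumption: 1-nearest-neighbour,
    not a deep network)."""

    def __init__(self):
        self.samples, self.labels = [], []

    def fit(self, samples, labels):
        self.samples, self.labels = list(samples), list(labels)

    def predict(self, x):
        if not self.samples:  # cluster received no training data
            return None
        return self.labels[nearest(x, self.samples)]


class ClusteredDetector:
    """Routes each sample to the model that owns its nearest centroid."""

    def __init__(self, centroids):
        self.centroids = centroids  # assumed to be precomputed
        self.models = [ClusterModel() for _ in centroids]

    def fit(self, samples, labels):
        # Partition the training data by nearest centroid.
        buckets = [([], []) for _ in self.centroids]
        for x, y in zip(samples, labels):
            xs, ys = buckets[nearest(x, self.centroids)]
            xs.append(x)
            ys.append(y)
        # Each model sees only its own cluster; these fits are independent.
        for model, (xs, ys) in zip(self.models, buckets):
            model.fit(xs, ys)

    def predict(self, x):
        return self.models[nearest(x, self.centroids)].predict(x)
```

Because each per-cluster `fit` call touches only its own bucket, the calls do not depend on one another, which is the property that makes training on multiple machines possible.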
Biological network alignment is an important research topic in the field of bioinformatics. Almost every existing alignment method is designed to solve the deterministic biological network alignment problem. However, interactions in biological networks, like many other processes in the biological realm, are probabilistic events. More accurate results can therefore be obtained if biological networks are characterized by probabilistic graphs. This probabilistic information, however, makes networks harder to analyze, and only a few methods can handle it. In this paper, an improved Probabilistic Biological Network Alignment (PBNA) method is proposed. Based on IsoRank, PBNA is able to use the probabilistic information. Furthermore, PBNA takes advantage of Contributor and the Probability Generating Function (PGF) to improve the accuracy of the node similarity values and to reduce the computational complexity of handling random variables in the similarity matrix. Experimental results on the dataset of Protein-Protein Interaction (PPI) networks provided by Todor demonstrate that PBNA can produce some alignment results that are ignored by the deterministic methods, and that in most cases it produces more biologically meaningful alignment results than IsoRank according to the Gene Ontology Consistency (GOC) measure. Compared with the Prob method, which is designed specifically to solve the probabilistic alignment problem, PBNA can obtain more biologically meaningful mappings in less time.
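The abstract does not spell out how PBNA applies the PGF, so the following is only a general illustration of why a PGF lowers the cost of working with sums of random variables. Assuming each edge is present independently with its own probability, the number of edges present has the PGF $\prod_i (1 - p_i + p_i z)$, and its coefficients give the full distribution. Multiplying the factors one at a time costs a quadratic number of operations in the number of edges, instead of enumerating all $2^n$ presence patterns. The function name `pgf_coefficients` is hypothetical.

```python
def pgf_coefficients(probabilities):
    """Expand prod_i (1 - p_i + p_i * z) for independent edge probabilities.

    Returns a list whose k-th entry is the probability that exactly k of
    the edges are present.
    """
    coeffs = [1.0]  # PGF of an empty sum: the constant polynomial 1
    for p in probabilities:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1.0 - p)  # this edge is absent: count stays at k
            nxt[k + 1] += c * p      # this edge is present: count becomes k + 1
        coeffs = nxt
    return coeffs


def expected_count(coeffs):
    """Mean of the distribution described by the PGF coefficients."""
    return sum(k * c for k, c in enumerate(coeffs))
```

For two edges that are each present with probability 0.5, `pgf_coefficients([0.5, 0.5])` returns `[0.25, 0.5, 0.25]`, the probabilities of zero, one, and two edges being present.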