GRU-based deep learning approach for network intrusion alert prediction

https://doi.org/10.1016/j.future.2021.09.040

Highlights

  • Prediction of alerts from malicious sources, as opposed to attacks against a target.

  • Prediction of both categorical and non-categorical parameters in the alert.

  • Existing works either predict category or the probability of future event(s).

  • Proposal tested on a large dataset with approximately 260 million alerts.

Abstract

The exponential growth in the number of cyber attacks in the recent past has necessitated active research on network intrusion detection, prediction and mitigation systems. While there are numerous solutions available for intrusion detection, the prediction of future network intrusions still remains an open research problem. Existing approaches employ statistical and/or shallow machine learning methods for the task, and therefore suffer from the need for feature selection and engineering. This paper presents a deep learning based approach for prediction of network intrusion alerts. A Gated Recurrent Unit (GRU) based deep learning model is proposed which is shown to be capable of learning dependencies in security alert sequences, and to output likely future alerts given a past history of alerts from an attacking source. The performance of the model is evaluated on intrusion alert sequences obtained from the Warden alert sharing platform.

Introduction

In light of ever-increasing network attacks (both in number and in intensity), it is necessary to continuously monitor and analyze the enormous volumes of data transmitted around the globe. This has led to the development of a variety of network monitoring and analysis systems, including intrusion detection systems (IDS), honeypots, network flow monitors, etc. Such systems are widely deployed in today's networks, where they help administrators deal with various attacks and intrusions. Sometimes, data from these systems are also shared among multiple organizations using various data-sharing platforms [1], [2], [3], which allows for proactive solutions (e.g. blocking the most dangerous attackers seen in other networks) rather than reactive ones. In recent years, the availability of large amounts of information about detected cyber attacks, together with recent advances in machine learning, has allowed researchers to focus not only on detection, but also on prediction of cyber attacks [4], [5], [6].

Several predictive methods in the area of cybersecurity have been proposed in recent years [7], [8], [9], [10]. While they are good proofs of concept showing that predicting future attacks is possible, they still have very limited capabilities and, therefore, limited use in practice. For example, they only predict the expected number of detected attacks in a future time interval [7], the most probable next step of an already ongoing multi-stage attack [8], or just the probability that there will be some attack originating from a given source [9].

In this paper we propose a novel deep-learning-based method for prediction of network attacks coming from a known malicious source. It is capable of predicting not only the probability of an attack observation, but also concrete parameters of the expected attack (e.g. its type, intensity and target), which enables better defense measures to be applied.

Most of the previous works focus on prediction of future attacks (or rather future steps of a complex attack) against a single target. However, as works on predictive blacklisting [11], [12] as well as a recent work by Bartos et al. [9] showed, it is also useful to predict the future behavior of previously identified malicious sources. Such a view can be especially useful in connection with the various alert sharing platforms that have been increasingly used in recent years [1], [13], [14], since it allows information about attacks against different targets to be leveraged to predict future ones. This is also backed by our own experience as an operator of a nation-level network and a multi-organization alert sharing community (more in Section 3). Our task here is not to protect a single network (which is a common task of most other practitioners), but to gather information on attacks and attackers observed in multiple networks, analyze them, and warn end-networks about imminent threats. Prediction of future actions of an attacker is one of the most important goals here.

For example, it is often not possible to blacklist all IP addresses previously reported as a source of some attack, either for technical reasons (e.g. a limited number of firewall rules) or due to a high risk of false positives. Information about the expected behavior of each potentially malicious source, like the probability of further attacks, their type, severity or target, can help to set up better blacklists or other defensive measures (e.g. rate limiting, limit on number of login attempts). A score derived from predicted future behavior of individual IP addresses can also be used to filter traffic during DDoS attacks [15]. Furthermore, information about the predicted continuation or repetition of detected attacks can be very helpful in alert prioritization algorithms which help human operators to decide which alerts to solve first.

The model presented in [9] only predicts the probability of observing future attacks coming from a malicious source in a certain time interval, using shallow learning and manually engineered features. In this work, we use a more advanced deep-learning approach: a neural network with Gated Recurrent Units (GRU) [16], a model able to effectively learn long-term dependencies in sequence data. Unlike shallow learners, the learning process for deep learners does not depend on human-crafted (derived) features [17]. This can be attributed to the fact that deep learning has the potential to extract better high-level representations from basic features, which is made possible by the inherently complex architecture of the network as well as the inclusion of non-linear transformations [18], [19]. As a result, our GRU-based model is able to take a sequence of previous alerts with just a little preprocessing and predict concrete properties of the next few alerts, such as their category, volume, target IPs or approximate time. This allows defensive measures to be applied very precisely.

The contributions of the paper can be listed as follows:

  • Prediction of alerts from malicious sources, as opposed to the alert prediction against a given target usually performed in the available literature.

  • Prediction of both categorical (e.g. protocol, attack category) and non-categorical (e.g. time of attack, volume) fields in the alert, while most previous works predict either the category of a future event or the probability of a certain event occurring in the future.

This paper is organized as follows. Section 2 discusses pertinent works available in the technical literature. Section 3 contains a description of the data used in this work. Section 4 presents the objectives and goal definition. Section 5 contains the details of the data preparation (pre-processing) that makes the data amenable to the deep learning model. Section 6 presents the details of the proposed deep learning network for alert prediction, with a subsection dealing with the DL model parameter selection and tuning. Section 7 describes the error metric we used to evaluate the prediction model. Results of the evaluation are presented in Section 8. Section 9 highlights some comparisons between the proposed solution and similar existing works. Section 10 mentions the limitations and avenues for future improvements. Section 11 contains concluding remarks.

Section snippets

Existing works

Predictive methods in cyber security can be divided into three main areas by their use case [5]: (i) attack projection and intention recognition, (ii) intrusion prediction and (iii) network security situation forecasting. The task of the methods in the first area is to predict what an attacker (in an already observed attack) is going to do next and what their ultimate goal is. A survey of earlier attack projection methods can be found in [20]. Later examples include [10], [21], [22], which use

Alert data

Alert data used in the paper is obtained from the threat sharing platform Warden (also known as the SABU platform), an alert sharing tool and community run by CESNET, the Czech National Research and Education Network (NREN) [52]. It is an open-source platform designed for automatic sharing of detected security events amongst Cyber Security Incident Response Teams (CSIRT). Every month, Warden receives (and redistributes) millions of alerts

Objective

The proposed method aims to predict parameters of future alerts related to a given attack source based on information about previous alerts. Since the alert trail from each malicious source is in the form of a sequence, the problem is modeled as a sequence prediction problem.
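To make the sequence-prediction framing concrete, the following minimal sketch groups alerts by attack source and cuts each source's trail into (history, future) pairs with a sliding window. The function name `build_windows`, the field names `source`, `time` and `category`, and the window lengths are illustrative assumptions; the excerpt does not state the values used by the authors.

```python
from collections import defaultdict


def build_windows(alerts, history_len, horizon):
    """Cut each source's alert trail into (source, history, future) samples.

    history_len and horizon are hypothetical parameters chosen for
    illustration, not values taken from the paper.
    """
    # Group alerts by source, keeping them in chronological order.
    by_source = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["time"]):
        by_source[alert["source"]].append(alert)

    samples = []
    for source, trail in by_source.items():
        # Each window position yields one (past alerts, next alerts) pair.
        for start in range(len(trail) - history_len - horizon + 1):
            split = start + history_len
            history = trail[start:split]
            future = trail[split:split + horizon]
            samples.append((source, history, future))
    return samples


# Toy trail from a single (documentation-range) source address.
alerts = [
    {"source": "192.0.2.1", "time": t, "category": c}
    for t, c in enumerate(["scan", "scan", "bruteforce", "scan", "bruteforce"])
]
samples = build_windows(alerts, history_len=3, horizon=1)
for source, history, future in samples:
    print(source, [a["category"] for a in history], "->",
          [a["category"] for a in future])
```

Sources with fewer alerts than `history_len + horizon` produce no samples in this sketch, which is one simple way to handle short trails.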

Recurrent Neural Network architectures have been demonstrated to be suitable for such tasks [54]. Although theoretically capable, conventional RNNs struggle when there are long-term dependencies in the data. LSTMs and GRUs

Data pre-processing

Although it is certainly desirable to have an end-to-end learning system, it is infeasible to directly feed the complex IDEA-formatted alerts from the Warden to the deep learning model. Appropriate data transformation and pre-processing are therefore imperative for the model to be effectively employed. These steps are shown in Fig. 2. First, only the entries corresponding to the chosen alert tuple are retained in the dataset and all other columns are discarded. Thereafter, the issue of missing

Proposed GRU-based deep learning network for alert prediction

The proposed deep neural network for alert prediction is shown in Fig. 3(a). Training, validation and hyperparameter tuning operations are shown enclosed in the top (red) dashed box, and the test (prediction) operations are shown enclosed in the bottom (blue) dashed box. The details of the model are shown in Fig. 3(b) from where it can be observed that a 3-layer sequential model with GRU layers stacked on the top of a Dense layer is used. Details of the process of parameter selection for the
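For readers unfamiliar with the recurrent unit used in the model, a single GRU update can be sketched in plain Python as below. This follows the standard GRU formulation [16] and is not the authors' implementation. The function `gru_step`, the parameter dictionary layout (`W_*`, `U_*`, `b_*`) and the tiny dimensions are hypothetical. The state-mixing convention (update gate close to 1 keeps the old state) is also an assumption, since libraries differ on it.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x, h, p):
    """Compute the next hidden state from input x and previous state h.

    p holds input weights W_*, recurrent weights U_* and biases b_*
    for the update gate (z), reset gate (r) and candidate state (h).
    """
    def affine(gate, hidden):
        wx = matvec(p["W_" + gate], x)
        uh = matvec(p["U_" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, p["b_" + gate])]

    z = [sigmoid(v) for v in affine("z", h)]   # update gate
    r = [sigmoid(v) for v in affine("r", h)]   # reset gate
    # The reset gate scales how much of the old state feeds the candidate.
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(v) for v in affine("h", gated_h)]
    # Assumed convention: z near 1 keeps the old state, z near 0 replaces it.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, candidate)]


# All-zero parameters: input size 3, hidden size 2 (illustrative only).
params = {}
for gate in ("z", "r", "h"):
    params["W_" + gate] = [[0.0] * 3 for _ in range(2)]
    params["U_" + gate] = [[0.0] * 2 for _ in range(2)]
    params["b_" + gate] = [0.0, 0.0]

new_h = gru_step([1.0, 2.0, 3.0], [1.0, -2.0], params)
print(new_h)
```

With all-zero parameters both gates evaluate to 0.5 and the candidate state is zero, so the new state is simply half of the old one. The gates are what allow a trained unit to carry information across many alerts in a sequence.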

Estimation of prediction accuracy

The quality of predictions is estimated by comparing predicted alerts for all the test vectors with the corresponding ground truth alerts. The dissimilarity metric between actual and predicted alerts (error value) is computed as a weighted sum of dissimilarities of individual fields. The comparison method for each field is explained below, and is also depicted in Fig. 5.

Categorical Fields: The categorical fields viz. Category, Port, and Protocol are directly compared for the predicted and
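The weighted-sum error described in this section can be illustrated with a minimal sketch. Only the exact-match comparison of the categorical fields (Category, Port, Protocol) comes from the text. The comparison used for numeric fields (a relative absolute difference capped at 1), the `volume` field name, and the weights are assumptions made for illustration, because the excerpt does not specify them.

```python
# Fields the text describes as directly compared between prediction and truth.
CATEGORICAL = ("category", "port", "protocol")


def field_dissimilarity(name, actual, predicted):
    """Return a dissimilarity in [0, 1] for one alert field."""
    if name in CATEGORICAL:
        # Direct comparison: 0 for a match, 1 for a mismatch.
        return 0.0 if actual == predicted else 1.0
    # Assumption: numeric fields use a relative absolute difference.
    scale = max(abs(actual), abs(predicted))
    if scale == 0:
        return 0.0
    return min(1.0, abs(actual - predicted) / scale)


def alert_error(actual, predicted, weights):
    """Weighted sum of per-field dissimilarities between two alerts."""
    return sum(
        weight * field_dissimilarity(name, actual[name], predicted[name])
        for name, weight in weights.items()
    )


# Hypothetical weights; the paper's actual weights are not in this excerpt.
weights = {"category": 0.4, "port": 0.2, "protocol": 0.2, "volume": 0.2}
actual = {"category": "scan", "port": 22, "protocol": "tcp", "volume": 100}
predicted = {"category": "scan", "port": 23, "protocol": "tcp", "volume": 50}

error = alert_error(actual, predicted, weights)
print(round(error, 3))
```

In this toy case the port mismatch contributes its full weight and the volume estimate contributes half of its weight, while the correctly predicted category and protocol contribute nothing.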

Results

The prediction performance of the proposed GRU-based deep neural network was evaluated for the following scenario: Training on May–July data, testing on August data. As explained in Section 6.1, the following set of parameters was identified as the best suited for the task: Batch Size: 2048; No. of GRU Units: 512; Epochs: 500; Loss parameter: Mean Absolute Error (MAE); Optimizer: Adam with AMSGrad = True; Patience (Keras parameter, for Early Stopping): 10.
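The interaction between the epoch limit and the patience setting can be sketched as follows. The dictionary repeats the values listed above. The stopping rule is a plain-Python approximation of the usual early-stopping behaviour, assuming the monitored quantity is a validation loss, and it is not the Keras code used by the authors. The names `TRAINING_CONFIG` and `stopping_epoch` are hypothetical.

```python
# Hyperparameters as reported in the text above.
TRAINING_CONFIG = {
    "batch_size": 2048,
    "gru_units": 512,
    "max_epochs": 500,
    "loss": "mean_absolute_error",
    "optimizer": "adam",
    "amsgrad": True,
    "patience": 10,
}


def stopping_epoch(val_losses, patience, max_epochs):
    """Return the 1-based epoch at which training would end.

    Training stops once the monitored loss has failed to improve for
    `patience` consecutive epochs, or when max_epochs is reached.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


# Made-up loss curve: improves for three epochs, then plateaus.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
stopped_at = stopping_epoch(
    losses,
    patience=TRAINING_CONFIG["patience"],
    max_epochs=TRAINING_CONFIG["max_epochs"],
)
print(stopped_at)
```

With a patience of 10, the limit of 500 epochs acts as an upper bound: training on a curve that plateaus early ends well before that limit.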

Comparison with similar works

This section presents a discussion on comparisons between the proposed approach and related existing works. Early research efforts for prediction of security events mostly culminated in the assignment of probability scores to the next possible security events or alerts, and then identifying the most probable attack scenario [48], [49], [50], [51]. The DL method presented in this work however predicts the entire alert, and therefore cannot be compared directly with the methods in [48], [49], [50]

Limitations and future work

As with any research attempt, the proposed method is not without its limitations and avenues for further improvement. The identified shortcomings may be attributed to different factors and are discussed below.

Limitations of data: Since the proposed approach is tested on the alert dataset from Warden only, it would be interesting to test its performance on security event data from other similar sources. Further, the present study considered data from SourceIPs which had at least 40 alerts

Conclusion

This paper presented a GRU-based deep learning approach for alert prediction. Network intrusion alerts provided by multiple detection systems via a sharing system called Warden were used to train the deep learning model, and subsequently predict future alerts originating from malicious sources. The proposed approach is different from the existing works in the literature in the sense that most of the available works either perform classification on the incoming alerts and assign a label: benign

CRediT authorship contribution statement

Mohammad Samar Ansari: Conceptualization, Implementation. Václav Bartoš: Implementation, Data collection. Brian Lee: Conceptualization, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by (i) European Union’s Horizon 2020 Research and Innovation Program, PROTECTIVE, under Grant Agreement No. 700071, and (ii) European Union’s Horizon 2020 research and innovation program under grant agreement No. 833418.

References (59)

  • Vasilomanolakis E. et al., Taxonomy and survey of collaborative intrusion detection, ACM Comput. Surv. (2015)
  • Sun N. et al., Data-driven cybersecurity incident prediction: A survey, IEEE Commun. Surv. Tutor. (2019)
  • Husák M. et al., Survey of attack projection, prediction, and forecasting in cyber security, IEEE Commun. Surv. Tutor. (2019)
  • Husák M. et al., Predictive methods in cyber defense: Current experience and research challenges, Future Gener. Comput. Syst. (2021)
  • Sokol P. et al., Prediction of attacks against honeynet based on time series modeling
  • Shen Y. et al., Tiresias: Predicting security events through deep learning
  • Husák M. et al., Predictive cyber situational awareness and personalized blacklisting: A sequential rule mining approach, ACM Trans. Manage. Inf. Syst. (2020)
  • Soldo F. et al., Blacklisting recommendation system: Using spatio-temporal patterns to predict future attacks, IEEE J. Sel. Areas Commun. (2011)
  • C. Sauerwein, C. Sillaber, A. Mussmann, R. Breu, Threat intelligence sharing platforms: An exploratory study of...
  • Third Annual Study on Exchanging Cyber Threat Intelligence: There Has to Be a Better Way, Research Report (2018)
  • Jansky T. et al., Augmented DDoS mitigation with reputation scores
  • Cho K. et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation (2014)
  • Lee B. et al., Comparative study of deep learning models for network intrusion detection, SMU Data Sci. Rev. (2018)
  • Kim G. et al., LSTM-based system-call language modeling and robust ensemble method for designing host-based intrusion detection systems (2016)
  • Yin C. et al., A deep learning approach for intrusion detection using recurrent neural networks, IEEE Access (2017)
  • Yang S.J. et al., Attack projection
  • Husák M. et al., AIDA framework: Real-time correlation and prediction of intrusion detection alerts
  • Zhang K. et al., Online mining intrusion patterns from IDS alerts, Appl. Sci. (2020)
  • Katipally R. et al., Attacker behavior analysis in multi-stage attack detection system

    Mohammad Samar Ansari received B. Tech., M. Tech. and Ph.D. degrees in Electronics Engineering from Aligarh Muslim University, Aligarh, India, in 2001, 2007 and 2012, respectively. He is presently working as an Associate Professor in the Department of Electronics Engineering, Aligarh Muslim University, India. Prior to this, he was a Post-Doctoral Research Fellow in the Software Research Institute, Athlone Institute of Technology, Ireland, while being on leave from the position of Assistant Professor in the Department of Electronics Engineering, AMU, Aligarh. Prior to joining Aligarh Muslim University as a Lecturer, he has been associated with Siemens, Defence Research and Development Organization (DRDO) and Malaviya National Institute of Technology, Jaipur. His research interests include neural networks, machine learning, and data privacy. He has published around 110 research papers in reputed international journals & conferences and authored 2 books and contributed 3 book chapters. He is also the recipient of the prestigious Young Faculty Research Fellowship by Department of Electronics and IT, Ministry of Communications and Information Technology, Govt. of India.

    Václav Bartoš is a network security researcher and data analyst at CESNET, the operator of the Czech academic network. He received his Ph.D. from Brno University of Technology in 2019, where he also received his M.Sc. in 2011. His research interests are network traffic analysis, detection of security incidents and post-processing of the detected events. He has participated in several national and European projects, such as PROTECTIVE and GÉANT GN-4.

    Brian Lee received his Ph.D. from Trinity College Dublin. He worked in the telecommunications industry for many years in network management research and development. He is currently the Director of the Software Research Institute at Athlone IT. His research interests are centered on the broad theme of ‘responsive infrastructures’ across the areas of computer security and networking.
