Elsevier

Future Generation Computer Systems

Volume 115, February 2021, Pages 641-658
Redundancy Coefficient Gradual Up-weighting-based Mutual Information Feature Selection technique for Crypto-ransomware early detection

https://doi.org/10.1016/j.future.2020.10.002

Highlights

  • Insufficient data is a challenge for feature selection in ransomware early detection.

  • The RCGU technique is proposed to overcome this challenge.

  • RCGU makes a redundancy–relevancy trade-off to improve feature significance estimation.

  • RCGU is incorporated into the redundancy term of the EMIFS technique.

  • EMIFS is improved by integrating RCGU with MaxMin to prevent redundancy overestimation.

Abstract

Crypto-ransomware is a type of malware whose effect is irreversible even after detection and removal. Thus, early detection is crucial to protect user files from being encrypted and held to ransom. Several studies have proposed early detection solutions based on the data acquired during the pre-encryption phase of the attacks. However, the lack of sufficient data in the early phases of the attack adversely affects the ability of feature selection techniques in these models to perceive the common characteristics of the attack features. This makes it difficult to eliminate redundant features and consequently decreases detection accuracy. Therefore, this study proposes a novel Redundancy Coefficient Gradual Upweighting (RCGU) technique that makes better redundancy–relevancy trade-offs during feature selection. Unlike existing feature significance estimation techniques, which compare the candidate feature with the common characteristics of the already-selected features, RCGU computes the mutual information between the candidate feature and each feature in the selected set individually. Accordingly, RCGU increases the weight of the redundancy term in proportion to the number of already-selected features. Integrating RCGU into the Mutual Information Feature Selection (MIFS) technique yielded the Enhanced MIFS (EMIFS). A further improvement, MM-EMIFS, incorporates the MaxMin approximation into EMIFS to prevent the redundancy overestimation that RCGU could cause as the number of already-selected features grows. The experimental evaluation shows that the proposed techniques achieved higher accuracy than related works, which confirms the ability of RCGU to make better redundancy–relevancy trade-offs and to select more discriminative pre-encryption attack features than existing solutions.

Introduction

Since malware first appeared in the early 1970s, several types of malicious software have been witnessed in the wild, such as viruses, worms, Trojans, spyware and ransomware [1], [2], [3]. Ransomware is a type of malware whose purpose is to hold user data and files to ransom by denying access to them [4], [5], [6], [7], [8], [9]. Although the history of ransomware dates back to the late 1980s, it did not gain much popularity among attackers until recently, when enabling technologies such as Ransomware-as-a-Service (RaaS), the Internet, cryptography and difficult-to-trace digital currencies emerged [10]. These technologies make it easy for even novice attackers to develop and disseminate their own ransomware and get paid without the fear of being caught by the authorities [10], [11]. Consequently, the rate of ransomware attacks has increased dramatically in recent years [12], [13], [14].

According to Kaspersky, ransomware attacks are now moving towards businesses, and 30% of infections in 2019 were among corporate users rather than individuals [15]. The report also concluded that WannaCry attacks caused around $4 billion of financial loss. This adds to previous statistics, which show that worldwide losses due to ransomware attacks reached $3 million in 2014 and $352 million in 2015 [16], [17]. In 2016, Indiana county alone incurred around $220K to recover from ransomware attacks [17]. In 2017, the estimated global loss due to the NotPetya and WannaCry ransomware attacks was $8 billion [18]. Denying access to data is not the only loss that ransomware victims incur; the damage could also include downtime costs and loss of money and reputation [6]. Based on severity, ransomware is categorized into locker-ransomware and crypto-ransomware [19]. In contrast to locker-ransomware attacks, whose effect can easily be mitigated, crypto-ransomware attacks are not reversible even after removing the malware. In many cases, the victim has no choice other than paying the ransom to get the decryption key [10]. Therefore, to effectively protect users’ digital assets, it is imperative to detect crypto-ransomware attacks early, i.e. before the encryption takes place [10], [16], [20], [21]. Early detection of crypto-ransomware attacks can be achieved by observing the ransomware process(es) running on the victim’s machine and analysing the runtime data generated during the pre-encryption phase, i.e. the phase in the crypto-ransomware lifecycle that precedes encryption. However, detecting crypto-ransomware in the early phases of its attack is challenging, due to the insufficient data and attack patterns available at that point [19], [22].

The small amount of data captured during the initial phases of the attack is one of the challenges for early detection and causes low detection accuracy [23], [24]. Even when many ransomware samples are available, the runtime data acquired during the pre-encryption phase of the attack is small compared to the entire runtime data that could be collected from each sample by waiting until the end of the attack. This small amount of data contains only a few attack patterns, if any, which are not enough for the model to decide whether a process is normal or malicious. The same data insufficiency prevents the feature selection technique from identifying the important features that distinguish ransomware behaviour from normal behaviour, because the features’ significance cannot be estimated accurately. This challenge is exacerbated by the high-dimensional feature sets generated by feature extraction methods such as n-gram, which most detection solutions adopt [20], [25], [26], [27], [28]. That is, the number of features extracted by n-gram increases exponentially with the size of n, which renders the detection models prone to overfitting [20], [25], [26], [27], [29], [30], [31]. Many of those features are either too common or too specific, which makes the information they carry about the attacks of little use [16]. In addition, many of those features are redundant and highly correlated due to the dependency between the API calls used by the ransomware’s running process, which makes these APIs always appear together [32], [33], [34]. The redundant features degrade detection accuracy, as they add no relevant information about the ransomware attack [19]. More importantly, including these redundant features in the selected set comes at the cost of discarding other informative features, which the feature selection technique excludes once the pre-defined number of required features is reached.
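The growth of the n-gram feature space can be illustrated with a minimal sketch. The function names, the toy trace and the vocabulary size of 100 API calls are assumptions made for illustration only; they are not taken from the paper's dataset or implementation.

```python
def ngrams(calls, n):
    """Return the contiguous n-grams (as tuples) of an API-call sequence."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def max_distinct_ngrams(vocab_size, n):
    """Upper bound on the number of distinct n-gram features: |V| ** n."""
    return vocab_size ** n


# Hypothetical pre-encryption trace: the API names are real Windows CNG
# functions, but the ordering is invented for illustration.
trace = [
    "BCryptGenerateSymmetricKey",
    "BCryptEncrypt",
    "BCryptGenerateSymmetricKey",
    "BCryptEncrypt",
]

bigrams = ngrams(trace, 2)
distinct_bigrams = set(bigrams)

# With an assumed vocabulary of 100 API calls, the bound grows as 100 ** n.
bounds = {n: max_distinct_ngrams(100, n) for n in (1, 2, 3)}
```

Under these assumptions, a short trace yields only a handful of observed n-grams, while the number of possible features grows by a factor of the vocabulary size with every increment of n. This gap between possible and observed features is the sparsity that the paragraph above describes.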

Several ransomware and malware detection solutions, as well as many other solutions, incorporate feature selection techniques to reduce data dimensionality and remove redundant features [20], [25], [35], [36]. Feature redundancy and relevancy are the main factors that govern the performance of any feature selection technique [37]. These techniques try to filter out the redundant and irrelevant features and keep only the informative ones. However, redundancy and relevancy are not always orthogonal. The two criteria can conflict, as some relevant features might also be redundant [37]. For example, BCryptDeriveKey, which derives a key from a secret agreement value, is always accompanied by BCryptSecretAgreement, which creates the hSharedSecret handle passed as a parameter to BCryptDeriveKey. Similarly, BCryptEncrypt, which encrypts a block of data, usually comes with BCryptGenerateSymmetricKey, BCryptGenerateKeyPair, or BCryptImportKey, which provide the hKey handle used as an input parameter for BCryptEncrypt. Therefore, the feature selection technique must be able to make the redundancy–relevancy trade-off effectively during the selection process.
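The conflict can be made concrete with a small worked example. It assumes two hypothetical binary indicator features, X_1 (BCryptSecretAgreement observed) and X_2 (BCryptDeriveKey observed), that always take the same value, with P(X_1 = 1) = 1/2 chosen purely for illustration. The symbols X_1, X_2 and C (the class label) are introduced here and are not the paper's notation; the calculation uses the standard definition of mutual information with base-2 logarithms.

```latex
\begin{align*}
I(X_1;X_2) &= \sum_{x_1,x_2} p(x_1,x_2)\,\log_2\frac{p(x_1,x_2)}{p(x_1)\,p(x_2)} \\
           &= 2\cdot\tfrac{1}{2}\,\log_2\frac{1/2}{(1/2)(1/2)} = 1\ \text{bit} = H(X_1), \\
I(X_2;C)   &= I(X_1;C).
\end{align*}
```

In this idealized case the two features are exactly as relevant as each other, yet once one of them is selected the other shares all of its information with it, so it adds nothing new about the class.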

The information theory-based feature selection techniques are superior when it comes to the trade-off between redundancy and relevancy, as they make no assumptions about the distribution of the underlying data [29], [38]. This is important for ransomware early detection, as it relies on sparse and incomplete attack patterns whose clear distribution has yet to be observed [19]. The redundancy–relevancy trade-off is carried out by adjusting the values of redundancy coefficients, which changes the belief in the redundancy term at each iteration in a way that is inversely proportional to the current size of the selected features set [38]. Although this approach works well for data with full observations about the attacks, it generates a suboptimal feature set when dealing with data that lack sufficient attack patterns [39], [40]. This is due to the reliance on the calculation of mutual information between the candidate feature and the common characteristics of all already-selected features in the selected set [29]. Such common characteristics are difficult to perceive from incomplete data acquired during the pre-encryption phase of crypto-ransomware attacks. Consequently, the selected set could include redundant and irrelevant features, given the limited amount of attack patterns, as is the case in the early detection where the entire characteristics of ransomware attack have not yet been observed [39], [41]. Therefore, an improvement to the mutual information technique is needed that overcomes the challenge of pre-encryption data insufficiency and estimates features’ significance more accurately.
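As an illustration of a redundancy coefficient that shrinks as the selected set grows, the well-known mRMR criterion can be written as below. It is shown as one representative of this family, not as the paper's own general formula, which is not reproduced in this excerpt. The notation (candidate feature X_k, class label C, selected set S) is assumed here.

```latex
J_{\mathrm{mRMR}}(X_k) \;=\; I(X_k;C) \;-\; \frac{1}{|S|}\sum_{X_j \in S} I(X_k;X_j)
```

Here the first term is the relevancy of the candidate feature, the sum is its pairwise redundancy with the already-selected features, and the coefficient 1/|S| is the part that decreases as more features are selected.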

To this end, this paper proposes a Redundancy Coefficient Gradual Upweighting (RCGU) technique that estimates feature significance accurately even with insufficient attack patterns, as is the case in the early (pre-encryption) phase of crypto-ransomware attacks. By incorporating the proposed RCGU into the feature selection technique, the redundancy between the candidate feature and each feature in the selected set is calculated individually at every iteration of the feature selection process. Unlike existing feature significance techniques, which decrease the weight of the redundancy term in the goal function as the number of already-selected features increases, the proposed RCGU increases the weight of the redundancy term in proportion to that number.

The key idea is that, instead of comparing the candidate feature with the common characteristics of all features in the selected set (which are very difficult to perceive from the limited pre-encryption data collected at the beginning of a ransomware attack), RCGU compares the candidate feature with each feature in the already-selected set individually. This individual comparison helps to discover redundancy even with the insufficient runtime data collected during the early phases of ransomware attacks. The intuition is that the chance that the candidate feature is redundant with one or more of the selected features increases as the selected set grows [40]. With this approach, there is no need to extract the difficult-to-perceive common characteristics of the features in the already-selected set. Consequently, the belief in the redundancy term increases as more features are added to the selected set. As such, the proposed RCGU can make a better redundancy–relevancy trade-off when dealing with a limited amount of data, as is the case for the data collected during the pre-encryption phase of the crypto-ransomware attack lifecycle. The contribution of this paper is four-fold.

  1. A Redundancy Coefficient Gradual Up-weighting (RCGU) technique is proposed and incorporated into the redundancy term of the goal function of the mutual information feature selection technique to improve the calculation of the relevancy–redundancy trade-off, which in turn helps in selecting a more informative feature set.

  2. RCGU is combined with the Maximum of Minimum (MaxMin) approximation technique to prevent the redundancy overestimation that RCGU could cause when the size of the selected set increases.

  3. It is shown that the redundancy term plays a major role in the accuracy of the selected features and that relying on it performs better than involving conditional redundancy in the calculation.

  4. An extensive experimental evaluation was conducted to show the efficacy and significance of the improvement achieved by the proposed techniques.
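The selection procedure described above can be sketched as a greedy forward search whose redundancy weight is a function of the selected-set size. This is a minimal sketch under stated assumptions, not the paper's implementation: the exact RCGU coefficient and the MaxMin step are not given in this excerpt, so the linear schedule `upweight`, the helper names and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def greedy_select(features, labels, k, coeff):
    """Greedy forward selection with a size-dependent redundancy weight.

    features: dict mapping feature name -> list of discrete values
    coeff:    function mapping the current selected-set size to the weight
              of the redundancy term (the schedule is an assumption here)
    """
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        # No redundancy penalty applies while the selected set is empty.
        weight = coeff(len(selected)) if selected else 0.0

        def score(name):
            relevance = mutual_information(features[name], labels)
            # Redundancy is accumulated pairwise, one selected feature at a time.
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance - weight * redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def downweight(size):
    """Conventional schedule: weight inversely proportional to |S|."""
    return 1.0 / size


def upweight(size, scale=1.0):
    """Assumed RCGU-like schedule: weight grows linearly with |S|."""
    return scale * size


# Toy data (hypothetical): "a_copy" duplicates "a"; "b" is weakly relevant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 0],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 0],
    "b": [0, 1, 0, 1, 0, 1, 1, 1],
}

chosen = greedy_select(features, labels, k=2, coeff=upweight)
```

On this toy data the duplicate feature is skipped in favour of the weakly relevant but non-redundant one. The two schedules coincide when a single feature has been selected and diverge afterwards: `downweight` shrinks the penalty as the set grows, whereas `upweight` enlarges it, which is the direction of change the text attributes to RCGU.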

For the purpose of this study, crypto-ransomware and ransomware are used interchangeably unless stated otherwise. The rest of this paper is organized as follows. Section 2 gives an overview of the related work. Section 3 provides preliminaries about the mutual information-based feature selection techniques. Section 4 details the methodology followed to design and develop the proposed techniques. Section 5 presents, analyses and discusses the experimental results and compares them with related works. Section 6 concludes the paper with a summary of the methods and results as well as suggestions for future work.

Section snippets

Related works

Like other cyberattacks, ransomware attacks target a variety of systems and networks, including Personal Computers (PCs), mobile devices, Wireless Sensor Networks (WSN), Vehicular Ad-Hoc Networks (VANETs), and the Internet of Things (IoT) [42], [43], [44], [45], [46]. Several studies have been conducted to detect crypto-ransomware attacks. These studies can be categorized into data-centric and process-centric approaches. The data-centric approach monitors the user data and files subject to

Preliminaries

For two discrete variables, the mutual information (MI) criterion is the amount of information that these variables share about each other [39]. This criterion is calculated according to Eq. (1) as follows.

$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \tag{1}$$

where I(X;Y) denotes the mutual information between X and Y, p(x) and p(y) are the marginal distributions of x and y, and p(x,y) is the joint distribution. According to Li et al. [29] and Brown et al. [38], Eq. (2) represents the general formula of the

The methodology

In this section, the design of the proposed RCGU technique that accurately estimates the features’ significance based on the data collected during the pre-encryption phase of crypto-ransomware is elaborated. In addition, the integration of RCGU with the Mutual Information Feature Selection (MIFS) technique is detailed. This integration takes place in the redundancy term of the goal function. Moreover, the performance of the proposed RCGU is improved by incorporating the MaxMin approximation

Results analysis and discussion

This section describes the implementation and experimental evaluation of the proposed EMIFS and MM-EMIFS techniques. It starts with an explanation of the dataset used by this study. The experimental results of each technique and the comparison with the related works are then presented and discussed.

Conclusion

In this paper, the Redundancy Coefficient Gradual Upweighting (RCGU) technique was proposed for better estimation of the significance of a crypto-ransomware feature when the amount of data is insufficient during the early phases of the attack lifecycle. The proposed RCGU was integrated into the goal function of the Mutual Information Feature Selection technique, and two improved feature selection techniques for early detection of crypto-ransomware were proposed: EMIFS and MM-EMIFS. The integration of the

CRediT authorship contribution statement

Bander Ali Saleh Al-rimy: Conceptualization, Methodology, Data curation, Writing - original draft, Software, Visualization, Investigation, Writing - review & editing, Formal analysis. Mohd Aizaini Maarof: Supervision. Mamoun Alazab: Writing - review & editing, Visualization. Syed Zainudeen Mohd Shaid: Supervision. Fuad A. Ghaleb: Validation, Conceptualization, Investigation. Abdulmohsen Almalawi: Project administration. Abdullah Marish Ali: Funding acquisition. Tawfik Al-Hadhrami: Writing -

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This project was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia, under grant No. (DF-274-611-1441). The authors, therefore, gratefully acknowledge DSR technical and financial support.

References (79)

  • Stiborek J. et al., Multiple instance learning for malware classification, Expert Syst. Appl. (2018)
  • Fallahpour S. et al., Using an ensemble classifier based on sequential floating forward selection for financial distress prediction problem, J. Retailing Consum. Serv. (2017)
  • Reineking T., Active classification using belief functions and information gain maximization, Internat. J. Approx. Reason. (2016)
  • Nissim N. et al., Trusted system-calls analysis methodology aimed at detection of compromised virtual machines using sequential mining, Knowl. Based Syst. (2018)
  • Wang Y. et al., An efficient semi-supervised representatives feature selection algorithm based on information theory, Pattern Recognit. (2017)
  • Liu H. et al., Feature selection with dynamic mutual information, Pattern Recognit. (2009)
  • Che J. et al., Maximum relevance minimum common redundancy feature selection for nonlinear data, Inform. Sci. (2017)
  • Bennasar M. et al., Feature selection using joint mutual information maximisation, Expert Syst. Appl. (2015)
  • Benzaid C. et al., Fast authentication in wireless sensor networks, Future Gener. Comput. Syst. (2016)
  • Morato D. et al., Ransomware early detection by the analysis of file sharing traffic, J. Netw. Comput. Appl. (2018)
  • Ahmed Y.A. et al., A system call refinement-based enhanced minimum redundancy maximum relevance method for ransomware early detection, J. Netw. Comput. Appl. (2020)
  • Bidoki S.M. et al., Pbmmd: A novel policy based multi-process malware detection, Eng. Appl. Artif. Intell. (2017)
  • Hampton N. et al., Ransomware behavioural analysis on windows platforms, J. Inf. Secur. Appl. (2018)
  • Vasan D. et al., Image-based malware classification using ensemble of CNN architectures (IMCEC), Comput. Secur. (2020)
  • Zhang H.Q. et al., Classification of ransomware families with machine learning based on N-gram of opcodes, Future Gener. Comput. Syst. Int. J. Esci. (2019)
  • Zimba A. et al., Multi-stage crypto ransomware attacks: A new emerging cyber threat to critical infrastructure and industrial control systems, ICT Express (2018)
  • Homayoun S., DRTHIS: Deep ransomware threat hunting and intelligence system at the fog layer, Future Gener. Comput. Syst. (2019)
  • Azab A. et al., Mining malware to detect variants
  • Chen J. et al., Uncovering the face of android ransomware: Characterization and real-time detection, IEEE Trans. Inf. Forensics Secur. (2018)
  • Azmoodeh A. et al., Detecting crypto-ransomware in IoT networks based on energy consumption footprint, J. Ambient Intell. Humaniz. Comput. (2017)
  • Yalew S.D. et al., Hail to the thief: Protecting data from mobile ransomware with ransomsafedroid
  • Etaher N. et al., From ZeuS to zitmo: Trends in banking malware
  • R. Moussaileb, B. Bouget, A. Palisse, H. Le Bouder, N. Cuppens, J.L. Lanet, Ransomware’s early mitigation mechanisms,...
  • A. Kharraz, S. Arshad, C. Mulliner, W. Robertson, E. Kirda, UNVEIL: A large-scale automated approach to detecting...
  • A. Kharraz, W. Robertson, D. Balzarotti, L. Bilge, E. Kirda, Cutting the gordian knot: A look under the hood of...
  • Kaspersky C., Ransomware 2018–2020 (2018)
  • Homayoun S. et al., Know abnormal find evil: Frequent pattern mining for ransomware threat hunting and intelligence, IEEE Trans. Emerg. Top. Comput. (2017)
  • Berrueta E. et al., A survey on detection techniques for cryptographic ransomware, IEEE Access (2019)
  • Sgandurra D. et al., Automated dynamic analysis of ransomware: Benefits, limitations and use for detection (2016)
    Bander Ali Saleh Al-rimy is a senior lecturer at UNITAR International University. He received the B.Sc. degree in computer engineering from the Faculty of Engineering, Sana’a University, Yemen, in 2003, the M.Sc. degree in information technology from OUM, Malaysia, in 2013, and the Ph.D. degree in computer science (information security) from the Faculty of Engineering, Universiti Teknologi Malaysia (UTM), in 2019. His research interests include, but are not limited to, malware, IDS, IoT, network security, and routing technologies. Dr. Al-Rimy was a recipient of several academic awards and recognitions including UTM Alumni Award, UTM Best Student Award, UTM Merit Award, UTM Excellence Award, OUM Distinction Award, and the Best Research Paper Award.

    Mohd Aizaini Maarof received the B.Sc. degree in computer science from Western Michigan University, Kalamazoo, MI, USA, the M.Sc. degree in computer science from Central Michigan University, Mount Pleasant, MI, USA, and the Ph.D. degree in IT security from Aston University, Birmingham, U.K. He is currently a Professor with the School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia (UTM). He is also the head of UTM-CSM Cyber Threat Intelligence Lab (CTIL) and a member of the Information Assurance and Security Research Group (IASRG), UTM. His research interest includes information system security.

    Mamoun Alazab is an Associate Professor in the College of Engineering, IT and Environment, Charles Darwin University, Australia. He is a cyber security researcher and practitioner with industry and academic experience. His multidisciplinary research focuses on cyber security and digital forensics of computer systems, including current and emerging issues in the cyber environment such as cyber–physical systems and the internet of things, taking into consideration the unique challenges present in these environments, with a focus on cybercrime detection and prevention. A/Prof Alazab received his Ph.D. degree in Computer Science and has more than 100 research papers. He has given many invited keynote talks and taken part in panels at conferences and venues nationally and internationally (22 events in 2018 alone). He is a Senior Member of the IEEE. He is an editor on multiple editorial boards, including Associate Editor of IEEE Access and Editor of the Security and Communication Networks Journal.

    Syed Zainudeen Mohd Shaid is a lecturer at Universiti Teknologi Malaysia (UTM), where he teaches subjects such as Penetration Testing, Security Programming, Exploitation and other security-related subjects. He received his B.Sc, M.Sc, and Ph.D. (Computer Science) from Universiti Teknologi Malaysia (UTM). A member of the Information Assurance & Security Research Group (IASRG), he is active in malware research. He also provides training and consultancy on Web Security, Secure Coding, Android, and embedded systems. He loves gadgets and enjoys exploring new things related to security.

    Fuad A. Ghaleb received the B.Sc. degree in computer engineering from the Faculty of Engineering, Sana’a University, Yemen, in 2003, and the M.Sc. and Ph.D. degrees in computer science (information security) from the School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), Johor, Malaysia, in 2014 and 2018, respectively. From 2004 to 2012, he was a Lecturer of network and computer engineering with the Sana’a Community College, Yemen. He is involved in different projects with industries related to network and information security. His research interests include vehicular network security, cyber security, intrusion detection, data science, data mining, and artificial intelligence.

    Dr. Ghaleb was a recipient of many awards and recognitions, such as the Postdoctoral Fellowship Award from UTM, the Best Postgraduate Student Award from UTM, the Excellence Awards from UTM, and the Best Presenter Award from the School of Computing, Faculty of Engineering, UTM, as well as Best Paper Awards from many international conferences.

    Abdulmohsen Almalawi received the BS degree in computer science from King Abdul Aziz University, Jeddah, Saudi Arabia, in 2003. He received the MS and PhD degrees in computer science from RMIT University, Melbourne, Australia, in 2009 and 2014, respectively. He is an assistant professor in the School of Computer Science and IT, King Abdul Aziz University, Jeddah, Saudi Arabia. His research interests are intrusion detection and cybersecurity of industrial SCADA systems with emphasis on data mining, machine learning, and fast algorithms.

    Abdullah Marish Ali is an assistant professor at King Abdul Aziz University, Jeddah, Saudi Arabia. He received his Ph.D. degree in Computer Science from Universiti Teknologi Malaysia, Malaysia, in 2018. His research interests are in the areas of machine learning, data mining, and IoT.

    Tawfik Al-Hadhrami received the M.Sc. degree in IT/applied system engineering from Heriot-Watt University, Edinburgh, U.K., and the Ph.D. degree in wireless mesh communication from the University of the West of Scotland, Glasgow, U.K., in 2015. He was involved in research at the University of the West of Scotland, Networking Group. He is currently a Senior Lecturer with Nottingham Trent University (NTU), U.K., where he is also a member of the Network Infrastructure and Cyber Security (NICS) Group. His research interests include the Internet of Things (IoT) and applications, network infrastructures and emerging technologies, artificial intelligence, computational intelligence, and 5G wireless communications. He is involved in different projects with industries. He is an Associate Editor of IEEE Access and the IEEE Sensors Journal.
