Abstract
Ransomware is self-propagating malware that encrypts the file systems of compromised computers to extort victims for financial gain. It has disrupted hundreds of schools, hospitals, and local governments, causing 12.1 days of system downtime on average (Siegel 2019). This study develops DeepRan, a deep learning-based detector for early ransomware detection and classification that aims to prevent network-wide data encryption. DeepRan applies an attention-based bi-directional Long Short-Term Memory (BiLSTM) network with a fully connected (FC) layer to model the normal behavior of hosts in an operational enterprise system, and it detects abnormal activity in a large volume of ambient host logging data collected from bare metal servers. DeepRan also classifies abnormal activity as one of the candidate ransomware attacks by extending the attention-based BiLSTM with a Conditional Random Fields (CRF) model. The Term Frequency-Inverse Document Frequency (TF-IDF) method is applied to extract semantic information from high-dimensional host logging data. An incremental learning technique extends the model’s existing knowledge so that DeepRan’s quality does not degrade over time. We build a testbed of bare metal servers and collect normal host logs from two users for 63 days (IRB-approved). Seventeen ransomware attacks are executed on the victim hosts, and the logging data from the infected hosts is used to validate DeepRan. Experimental results show that DeepRan achieves 99.87% detection accuracy (F1-score of 99.02%) for early ransomware detection. The detector also achieves 96.5% accuracy when classifying abnormal events as one of 17 candidate ransomware families. Incremental learning is validated as an efficient technique for enhancing model quality over time.
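The TF-IDF step mentioned in the abstract can be illustrated with a minimal sketch. It assumes host logs have already been grouped into windows of event names; the function name `tfidf`, the event names, and the unsmoothed weighting (term frequency times log(N / document frequency)) are illustrative assumptions, not DeepRan's actual feature pipeline.

```python
import math
from collections import Counter


def tfidf(windows):
    """Return one {event: weight} dict per window of host log events.

    Assumption: weight = (count / window length) * log(N / df), with no smoothing.
    """
    n = len(windows)
    # Document frequency: number of windows in which each event type occurs.
    df = Counter(event for window in windows for event in set(window))
    vectors = []
    for window in windows:
        counts = Counter(window)
        total = len(window)
        vectors.append({
            event: (count / total) * math.log(n / df[event])
            for event, count in counts.items()
        })
    return vectors


# Hypothetical event names, chosen only for illustration.
windows = [
    ["process_start", "file_read", "file_read", "registry_read"],
    ["process_start", "file_read", "network_connect"],
    ["process_start", "file_write", "file_rename", "file_write"],
]
vectors = tfidf(windows)
```

In this toy input, `process_start` occurs in every window and therefore receives a weight of zero, while the events that appear only in the third window (`file_write`, `file_rename`) receive the largest weights there. This is the sense in which TF-IDF highlights events that are unusual for a given window.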
References
Siegel, B. (2019). Ransomware payments rise as public sector is targeted, new variants enter the market, 11. https://securityboulevard.com/2019/11/ransomware-payments-rise-as-public-sector-is-targeted-new-variants-enter-the-market/.
Challita, A. (2018). The four most popular methods hackers use to spread ransomware, 08. https://www.itproportal.com/features/the-four-most-popular-methods-hackers-use-to-spread-ransomware/.
Dobran, B. (2019). 27 terrifying ransomware statistics & facts you need to read, 01. https://phoenixnap.com/blog/ransomware-statistics-facts.
Davis, J. (2019). 71% of ransomware attacks targeted small businesses in 2018, 03. https://healthitsecurity.com/news/71-of-ransomware-attacks-targeted-small-businesses-in-2018.
Shi, F. (2019). Threat spotlight: Government ransomware attacks, 08. https://blog.barracuda.com/2019/08/28/threat-spotlight-government-ransomware-attacks/.
Cook, S. (2019). 2017-2019 ransomware statistics and facts, 06. https://www.comparitech.com/antivirus/ransomware-statistics/.
Andone, D. (2019). 3 alabama hospitals are accepting patients again after a ransomware attack on its computers, 10. https://www.cnn.com/2019/10/11/us/alabama-hospital-ransomware-attack/index.html.
Cimpanu, C. (2019). Second florida city pays giant ransom to ransomware gang in a week, 06. https://www.zdnet.com/article/second-florida-city-pays-giant-ransom-to-ransomware-gang-in-a-week/.
Bridges, R.A., Iannacone, M.D., Goodall, J.R., & Beaver, J.M. (2018). How do information security workers use host data? a summary of interviews with security analysts, arXiv:1812.02867.
Turcotte, M.J., Kent, A.D., & Hash, C. (2017). Unified host and network data set, ArXiv e-prints.
Windows Logging Service. (2020). https://www.federallabs.org/successes/success-stories/windows-logging-service.
Kansas City National Security Campus. (2017). Windows logging service, 08. https://kcnsc.doe.gov/docs/default-source/kcnsc-software/windows-logging-service-summary_073117.pdf?sfvrsn=26b745c4_2.
Olbrich, C. (2016). A close look at ransomware by the example of vipasana, 10. https://www.boxcryptor.com/en/blog/post/a-close-look-at-ransomware-vipasana-part-i/.
Meskauskas, T. (2019). Xorist ransomware removal instructions, 4. https://www.pcrisk.com/removal-guides/9905-xorist-ransomware.
Detailed technical analysis of xorist ransomware (ransomware report), 05. (2018). https://www.howtoremoveit.info/technical-analysis-report-xorist-ransomware/.
Teslacrypt ransomware attacks. (2020). https://usa.kaspersky.com/resource-center/threats/teslacrypt.
Dell SecureWorks Counter Threat Unit Threat Intelligence. (2015). Teslacrypt ransomware, 05. https://www.secureworks.com/research/teslacrypt-ransomware-threat-analysis.
Meskauskas, T. (2017). Fantom ransomware, 06. https://www.pcrisk.com/removal-guides/10418-fantom-ransomware.
Morparia, J. (2016). Ransom.jigsaw, 08. https://www.symantec.com/security-center/writeup/2016-041123-3256-99.
Kiguolis, L. (2019). Remove jigsaw ransomware / virus (removal instructions), 11. https://www.2-spyware.com/remove-jigsaw-ransomware-virus.html.
Perlroth, N. (2018). Boeing possibly hit by ’wannacry’ malware attack. [Online] Available: https://www.nytimes.com/2018/03/28/technology/boeing-wannacry-malware.html.
Team, S.R. (2017). Petya ransomware outbreak: Here’s what you need to know, 10. https://www.symantec.com/blogs/threat-intelligence/petya-ransomware-wiper.
Meskauskas, T. (2018). Goldeneye ransomware removal instructions, 07. https://www.pcrisk.com/removal-guides/10733-goldeneye-ransomware.
Sponchioni, R. (2016). Ransom.goldeneye, 07. https://www.symantec.com/security-center/writeup/2016-120715-1834-99.
Labs, M. (2017). Goldeneye ransomware – the petya/mischa combo rebranded, 07. https://blog.malwarebytes.com/threat-analysis/2016/12/goldeneye-ransomware-the-petyamischa-combo-rebranded/.
Anand Ajjan, D.P. (2018). Btcware ransomware, 04. https://www.sophos.com/en-us/medialibrary/PDFs/factsheets/sophos-btcware-ransomware-wpna.pdf.
Threat Spotlight. (2017). Defray Ransomware Hits Healthcare and Education. (accessed 16 Aug 2018). [Online]. Available: https://threatvector.cylance.com/en_us/home/threat-spotlight-defray-ransomware-hits-healthcare-and-education.html.
Crowe, J. (2017). Alert: Defray ransomware launching extremely personalized attacks, 08. https://blog.barkly.com/defray-ransomware-highly-targeted-campaigns.
This Ransomware Demands Nudes instead of Bitcoin - Motherboard. (2017). (accessed 24 Aug 2018). [Online]. Available: https://motherboard.vice.com/en_us/article/yw3w47/this-ransomware-demands-nudes-instead-of-bitcoin.
Arghire, I. (2018). ’redeye’ ransomware destroys files, rewrites mbr, 6. https://www.securityweek.com/redeye-ransomware-destroys-files-rewrites-mbr.
Abrams, L. (2018). New saturn ransomware actively infecting victims, 2. https://www.bleepingcomputer.com/news/security/new-saturn-ransomware-actively-infecting-victims/.
McAfee. (2019). Threat landscape dashboard scarab - ransomware, 06. https://www.mcafee.com/enterprise/en-us/threat-center/threat-landscape-dashboard/ransomware-details.scarab-ransomware.html.
Hioureas, V. (2018). Scarab ransomware: new variant changes tactics, 1. https://blog.malwarebytes.com/threat-analysis/2018/01/scarab-ransomware-new-variant-changes-tactics/.
Salvio, J. (2018). Gandcrab v4.0 analysis: New shell, same old menace. https://www.fortinet.com/blog/threat-research/gandcrab-v4-0-analysis--new-shell--same-old-menace.html.
Mundo, A. (2018). Gandcrab ransomware puts the pinch on victims, 07. https://securingtomorrow.mcafee.com/mcafee-labs/gandcrab-ransomware-puts-the-pinch-on-victims/.
Itay Cohen, B.H. (2018). Ryuk ransomware: A targeted campaign break-down, 08. https://research.checkpoint.com/ryuk-ransomware-targeted-campaign-break/.
Infoblox. (2019). Ryuk ransomware cyber report. https://www.infoblox.com/wp-content/uploads/threat-intelligence-report-ryuk-ransomeware-cyber-report.pdf.
Fakterman, T. (2019). Sodinokibi: The crown prince of ransomware, 08. https://www.cybereason.com/blog/the-sodinokibi-ransomware-attack.
Nocturnus, C. (2019). Sodinokibi: The crown prince of ransomware, 08. https://www.cybereason.com/blog/the-sodinokibi-ransomware-attacks.
Brown, A., Tuor, A., Hutchinson, B., & Nichols, N. (2018). Recurrent neural network attention mechanisms for interpretable system log anomaly detection. In Proceedings of the First Workshop on Machine Learning for Computing Systems. ACM, pp. 1.
Chen, T., Xu, R., He, Y., & Wang, X. (2017). Improving sentiment analysis via sentence type classification using bilstm-crf and cnn. Expert Systems with Applications, 72, 221–230.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735–1780.
Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional lstm-crf models for sequence tagging, arXiv:1508.01991.
Zhang, X., Xu, Y., Lin, Q., Qiao, B., Zhang, H., Dang, Y., Xie, C., Yang, X., Cheng, Q., Li, Z., et al. (2019). Robust log-based anomaly detection on unstable log data. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, pp. 807–817.
Ma, X., & Hovy, E. (2016). End-to-end sequence labeling via bi-directional lstm-cnns-crf, arXiv:1603.01354.
Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information processing & management, 24(5), 513–523.
Chen, Q., & Bridges, R.A. (2017). Automated behavioral analysis of malware: A case study of wannacry ransomware. In 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, pp. 454–460.
Chen, Q., Islam, S.R., Haswell, H., & Bridges, R.A. (2019). Automated ransomware behavior analysis: Pattern extraction and early detection. In International Conference on Science of Cyber Security, pp. 199–214. Berlin: Springer.
FOG Project. (2020). A free open-source network computer cloning and management solution, https://fogproject.org/.
Pytorch: From research to production. (2020). https://pytorch.org/.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
Schalkoff, R.J. (1997). Artificial neural networks. McGraw-Hill Higher Education.
Fernandez Maimo, L., Huertas Celdran, A., Perales Gomez, A.L., Clemente, G., Félix, J., Weimer, J., & Lee, I. (2019). Intelligent and dynamic ransomware spread detection and mitigation in integrated clinical environments. Sensors, 19(5), 1114.
Homayoun, S., Dehghantanha, A., Ahmadzadeh, M., Hashemi, S., & Khayami, R. (2017). Know abnormal, find evil: frequent pattern mining for ransomware threat hunting and intelligence. IEEE Transactions on Emerging Topics in Computing.
Takeuchi, Y., Sakai, K., & Fukumoto, S. (2018). Detecting ransomware using support vector machines. In Proceedings of the 47th International Conference on Parallel Processing Companion. ACM, pp. 1.
Bridges, R.A., Glass-Vanderlan, T.R., Iannacone, M.D., Vincent, M.S., & Chen, Q. (2019). A survey of intrusion detection systems leveraging host data. ACM Computing Surveys (CSUR), 52(6), 1–35.
Lou, J.G., Fu, Q., Yang, S., Xu, Y., & Li, J. (2010). Mining invariants from console logs for system problem detection. In USENIX Annual Technical Conference. pp. 23–25.
Xu, W., Huang, L., Fox, A., Patterson, D., & Jordan, M.I. (2009). Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles. ACM, pp. 117–132.
Liang, Y., Zhang, Y., Xiong, H., & Sahoo, R. (2007). Failure prediction in ibm bluegene/l event logs. In Seventh IEEE International Conference on Data Mining (ICDM 2007). IEEE, pp. 583–588.
Zhang, K., Xu, J., Min, M.R., Jiang, G., Pelechrinis, K., & Zhang, H. (2016). Automated it system failure prediction: A deep learning approach. In 2016 IEEE International Conference on Big Data (Big Data). IEEE, pp. 1291–1300.
Ahmadian, M. M., & Shahriari, H. R. (2016). 2entfox: A framework for high survivable ransomwares detection. In 2016 13th International Iranian Society of Cryptology Conference on Information Security and Cryptology (ISCISC). IEEE, pp. 79–84.
Kharaz, A., Arshad, S., Mulliner, C., Robertson, W., & Kirda, E. (2016). UNVEIL: A large-scale, automated approach to detecting ransomware. In 25th USENIX Security Symposium (USENIX Security 16), pp. 757–772.
Lee, J.K., Moon, S.Y., & Park, J.H. (2017). Cloudrps: a cloud analysis based enhanced ransomware prevention system. The Journal of Supercomputing, 73(7), 3065–3084.
Verma, M.E., & Bridges, R.A. (2018). Defining a metric space of host logs and operational use cases. In 2018 IEEE International Conference on Big Data (Big Data), pp. 5068–5077.
Morato, D., Berrueta, E., Magaña, E., & Izal, M. (2018). Ransomware early detection by the analysis of file sharing traffic. Journal of Network and Computer Applications, 124, 14–32.
Hardy, W., Chen, L., Hou, S., Ye, Y., & Li, X. (2016). Dl4md: a deep learning framework for intelligent malware detection. In Proceedings of the International Conference on Data Mining (DMIN). The Steering Committee of The World Congress in Computer Science, Computer, p. 61.
Homayoun, S., Dehghantanha, A., Ahmadzadeh, M., Hashemi, S., Khayami, R., Choo, K.K.R., & Newton, D.E. (2019). Drthis: Deep ransomware threat hunting and intelligence system at the fog layer. Future Generation Computer Systems, 90, 94–104.
Rhode, M., Burnap, P., & Jones, K. (2018). Early-stage malware prediction using recurrent neural networks. Computers & Security, 77, 578–594.
Acknowledgments
This research is sponsored by the National Science Foundation under Grant No. 1812599. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Cite this article
Roy, K.C., Chen, Q. DeepRan: Attention-based BiLSTM and CRF for Ransomware Early Detection and Classification. Inf Syst Front 23, 299–315 (2021). https://doi.org/10.1007/s10796-020-10017-4
DOI: https://doi.org/10.1007/s10796-020-10017-4