Extracting rules for vulnerabilities detection with static metrics using machine learning

  • Original Article
  • Published:
International Journal of System Assurance Engineering and Management

Abstract

Software quality is a primary concern in software engineering, and vulnerabilities are one of the major threats to it. A vulnerability compromises the security of the software and also impairs its quality. In this paper, we conduct experimental research to evaluate the utility of machine learning algorithms for detecting vulnerabilities. For this experiment, machine learning was applied to a set of software metrics to extract detection rules in an easily interpretable form. We considered 32 supervised machine learning algorithms for the 3 most frequently occurring vulnerabilities in a software system, namely Lawofdemeter, BeanMemberShouldSerialize, and LocalVariablecouldBeFinal. Using the J48 machine learning algorithm, an accuracy of up to 96% in vulnerability detection was achieved. The results are validated using tenfold cross-validation, and statistical measures such as the ROC curve, Kappa statistic, Recall, and Precision are used to analyze them.
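The evaluation measures named in the abstract (accuracy, Precision, Recall, the Kappa statistic, and tenfold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' code or tooling. The names `BinaryCounts`, `cohen_kappa`, and `tenfold_indices` are hypothetical helpers introduced only for illustration, and the sketch assumes a binary labeling of code units as "vulnerable" or "clean", which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for the positive class (assumed: 'vulnerable')."""
    tp: int  # vulnerable units predicted vulnerable
    fp: int  # clean units predicted vulnerable
    fn: int  # vulnerable units predicted clean
    tn: int  # clean units predicted clean

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(c: BinaryCounts) -> float:
    """Fraction of all units that were classified correctly."""
    return (c.tp + c.tn) / c.total if c.total else 0.0


def precision(c: BinaryCounts) -> float:
    """Fraction of predicted-vulnerable units that really are vulnerable."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: BinaryCounts) -> float:
    """Fraction of truly vulnerable units that were detected."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def cohen_kappa(c: BinaryCounts) -> float:
    """Agreement between prediction and truth, corrected for chance agreement."""
    if c.total == 0:
        return 0.0
    observed = accuracy(c)
    # Chance agreement: product of marginal rates, summed over both classes.
    p_pos = ((c.tp + c.fp) / c.total) * ((c.tp + c.fn) / c.total)
    p_neg = ((c.tn + c.fn) / c.total) * ((c.tn + c.fp) / c.total)
    expected = p_pos + p_neg
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def tenfold_indices(n: int, k: int = 10):
    """Yield (train, test) index lists so each sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

In a tenfold setup, a classifier would be trained on each `train` split and scored on the matching `test` split, with the confusion counts accumulated across all ten folds before the measures are computed. The split here is a simple unshuffled, unstratified one chosen for brevity.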

Author information

Corresponding author

Correspondence to Vijay Kumar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gupta, A., Suri, B., Kumar, V. et al. Extracting rules for vulnerabilities detection with static metrics using machine learning. Int J Syst Assur Eng Manag 12, 65–76 (2021). https://doi.org/10.1007/s13198-020-01036-0

  • DOI: https://doi.org/10.1007/s13198-020-01036-0
