
The effect of Bellwether analysis on software vulnerability severity prediction models

Software Quality Journal

Abstract

Vulnerability severity prediction (VSP) models provide useful insight for vulnerability prioritization and software maintenance. Previous studies have proposed a variety of machine learning algorithms as an important paradigm for VSP. However, to the best of our knowledge, no existing study has investigated how a subset of exemplary data can be used to improve VSP. To address this gap, this paper presents a general framework for VSP based on Bellwether analysis (i.e., the identification of exemplary data). First, we apply natural language processing techniques to the textual descriptions of software vulnerabilities. Next, we develop an algorithm, termed Bellvul, that identifies and selects an exemplary subset of the data (the Bellwether) to serve as the training set, yielding improved prediction accuracy over the growing-portfolio, within-project, and k-fold cross-validation baselines. Finally, we assess the performance of four machine learning algorithms, namely deep neural network, logistic regression, k-nearest neighbor, and random forest, on the sampled instances. The predictions of the proposed models and the benchmark techniques are evaluated using standard classification metrics: precision, recall, and F-measure. The experimental results show that the Bellwether approach achieves an F-measure ranging from 14.3% to 97.8%, an improvement over the benchmark techniques. In conclusion, the proposed approach is a promising research direction for helping software engineers identify vulnerability records that demand attention prior to software release.
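To make the pipeline described above concrete, the sketch below illustrates the bellwether idea under stated assumptions: vulnerability descriptions are vectorized with TF-IDF, a learner is trained on each candidate group of records, and the group whose model best predicts all the others is kept as the exemplary training set. The function name `find_bellwether`, the grouping of records, and the random-forest learner are illustrative choices, not the authors' Bellvul algorithm.

```python
# A minimal sketch (not the authors' Bellvul implementation) of the
# bellwether idea from the abstract: vectorize vulnerability
# descriptions with TF-IDF, train a learner on each candidate group,
# and keep the group whose model best predicts all the others.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def find_bellwether(groups):
    """Return the id of the exemplary (bellwether) group.

    `groups` maps a group id (e.g., a vendor or product) to a pair
    (descriptions, severity_labels); both the grouping and the
    random-forest learner are illustrative assumptions here.
    """
    # Fit one shared vocabulary so models transfer across groups.
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    vectorizer.fit([t for texts, _ in groups.values() for t in texts])

    best_id, best_score = None, -1.0
    for cand_id, (cand_texts, cand_labels) in groups.items():
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(vectorizer.transform(cand_texts), cand_labels)
        # Score the candidate by its median macro F-measure over
        # every other group's vulnerabilities.
        scores = sorted(
            f1_score(labels, model.predict(vectorizer.transform(texts)),
                     average="macro")
            for gid, (texts, labels) in groups.items() if gid != cand_id
        )
        score = scores[len(scores) // 2]
        if score > best_score:
            best_id, best_score = cand_id, score
    return best_id, best_score
```

Ranking candidates by their median cross-group F-measure follows the spirit of bellwether analysis: the exemplary subset is the one whose model generalizes best to the rest of the data.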


Notes

  1. http://www.netmarketshare.com

  2. tiny.cc/o1cbez

  3. Bad guys: entities with special programming skills who use their technical abilities to gain unauthorized access to computer systems.


Funding

This study was funded by the National Natural Science Foundation of China (NSFC grant numbers U1836116, 61502205, 61762040, and 61872167), the Jiangsu Provincial Six Talent Peaks Project (grant number XYDXXJS-016), the Graduate Research Innovation Project of Jiangsu Province (grant number KYCX17_1807), and the Postdoctoral Science Foundation of China (grant numbers 2015M571687 and 2015M581739).

Author information


Corresponding author

Correspondence to Jinfu Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kudjo, P.K., Chen, J., Mensah, S. et al. The effect of Bellwether analysis on software vulnerability severity prediction models. Software Qual J 28, 1413–1446 (2020). https://doi.org/10.1007/s11219-019-09490-1

