Disk storage failure prediction in datacenter using machine learning models

Ramanathan, Manikandan; Narayanan, Kumar

doi:10.1007/s13204-021-02039-4

Original Article
Published: 21 September 2021

Volume 13, pages 1569–1590, (2023)

Applied Nanoscience

Manikandan Ramanathan¹ &
Kumar Narayanan¹

Abstract

Data centers are centralized facilities in which interconnected network devices perform computation and access large volumes of data. Servers are stacked and house the data storage, with data backups and server redundancy serving as the recovery mechanisms. A data center computes, stores, processes, and distributes data, and controls the interconnected network equipment of the distributed network. At present, RAID systems are deployed to avoid service disruptions caused by disk failures; this model achieves system and service availability, but at considerable expense. Even so, availability is still lost and service disruptions still occur because of disk failures, so machine learning models should be used to predict disk failures well in advance. As data centers see heavier system usage and growing data storage, a disk failure brings the system down and increases downtime. The main goal of this work is to analyze disk failure problems and disk availability methods, and to predict disk failure. Various machine learning models are identified and discussed, along with the SMART parameters used to measure disk failure. Improved methods based on ensembles of trees, random forests, and boosting techniques are also discussed.
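To make the idea of a tree ensemble over SMART parameters concrete, the following is a minimal sketch and not the method of the article. It bags one-level decision trees (stumps) with majority voting, which is a simplified stand-in for the random forest and boosting models the abstract mentions. The feature names, value ranges, and every function name are assumptions chosen for illustration, and the toy data is deliberately separable, which real SMART logs are not.

```python
import random

# Hypothetical SMART-style features per drive (illustrative, not from the article):
# [reallocated_sectors, pending_sectors, temperature_c]
def make_drives(rng, n_healthy=40, n_failing=10):
    """Generate toy, cleanly separable data; real SMART logs are far noisier."""
    healthy = [[rng.randint(0, 5), rng.randint(0, 2), rng.randint(25, 45)]
               for _ in range(n_healthy)]
    failing = [[rng.randint(50, 200), rng.randint(0, 40), rng.randint(25, 45)]
               for _ in range(n_failing)]
    return healthy, failing

def train_stump(rows, labels):
    """Pick the (feature, threshold) pair with the fewest training errors.

    The stump predicts 'failing' (1) when row[feature] > threshold.
    """
    best = None  # (errors, feature, threshold)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            errors = sum((r[f] > thr) != bool(y) for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, thr)
    return best[1], best[2]

def train_bagged_stumps(healthy, failing, n_stumps=15, rng=None):
    """Train stumps on bootstrap samples (bagging)."""
    rng = rng or random.Random(0)
    stumps = []
    for _ in range(n_stumps):
        # Stratified bootstrap: resample each class with replacement so the
        # rare 'failing' class is never missing from a sample.
        h = rng.choices(healthy, k=len(healthy))
        f = rng.choices(failing, k=len(failing))
        stumps.append(train_stump(h + f, [0] * len(h) + [1] * len(f)))
    return stumps

def predict(stumps, row):
    """Majority vote over all stumps: 1 = predicted failing, 0 = healthy."""
    votes = sum(row[f] > thr for f, thr in stumps)
    return int(votes * 2 > len(stumps))

rng = random.Random(42)
healthy, failing = make_drives(rng)
model = train_bagged_stumps(healthy, failing, rng=rng)
print(predict(model, [0, 0, 35]), predict(model, [500, 12, 35]))
```

The stratified resampling is a design choice made here because failed drives are typically a small minority of a fleet; a full random forest would additionally grow deeper trees and subsample features at each split.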

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Vels Institute of Science Technology and Advanced Studies (VISTAS), Chennai, 600043, Tamil Nadu, India
Manikandan Ramanathan & Kumar Narayanan

Corresponding author

Correspondence to Manikandan Ramanathan.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ramanathan, M., Narayanan, K. Disk storage failure prediction in datacenter using machine learning models. Appl Nanosci 13, 1569–1590 (2023). https://doi.org/10.1007/s13204-021-02039-4

Received: 21 July 2021
Accepted: 18 August 2021
Published: 21 September 2021
Issue Date: February 2023
DOI: https://doi.org/10.1007/s13204-021-02039-4
