Abstract
Cloud monitoring is the key technology for tracking the status and availability of the resources and services in a cloud infrastructure. However, it faces significant challenges: monitoring capability is often inefficient, and monitoring itself consumes substantial resources. We study adaptive monitoring for cloud computing platforms, focusing on the problem of balancing monitoring capability against resource consumption. We propose HSACMA, a hierarchical scalable adaptive monitoring architecture that (1) monitors the physical and virtual infrastructure at the infrastructure layer, the middleware running at the platform layer, and the application services at the application layer; (2) achieves monitoring scalability through microservices; and (3) adaptively adjusts the monitoring interval and data transmission strategy according to the running state of the cloud computing system. Moreover, we present a case study of a real production system deployed and running on the CloudStack cloud computing platform to verify the effectiveness of applying our architecture in practice. The results show that HSACMA can guarantee the accuracy and real-time performance of monitoring while reducing resource consumption.
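To make the idea of adaptive interval adjustment concrete, the following is a minimal Python sketch of one possible policy: sample more often when a metric changes quickly, and less often when it is stable. It is an illustration under stated assumptions, not the algorithm used by HSACMA; the function names (`relative_change`, `next_interval`, `simulate`), the halve/double rule, the 10% threshold, and the 1 to 60 second bounds are all hypothetical.

```python
def relative_change(prev: float, curr: float) -> float:
    """Relative change between two consecutive samples of a metric."""
    if prev == 0:
        # Avoid division by zero: treat any move away from zero as a full change.
        return 0.0 if curr == 0 else 1.0
    return abs(curr - prev) / abs(prev)


def next_interval(
    current_s: float,
    change_ratio: float,
    threshold: float = 0.1,  # assumed: 10% change counts as "volatile"
    min_s: float = 1.0,      # assumed lower bound on the interval (seconds)
    max_s: float = 60.0,     # assumed upper bound on the interval (seconds)
) -> float:
    """Halve the interval when the metric is volatile, double it when stable.

    The result is clamped to [min_s, max_s] so that monitoring never stops
    entirely and never exceeds a fixed sampling cost.
    """
    if change_ratio > threshold:
        proposed = current_s / 2
    else:
        proposed = current_s * 2
    return max(min_s, min(max_s, proposed))


def simulate(samples: list[float], start_s: float) -> list[float]:
    """Return the interval chosen after each consecutive pair of samples."""
    intervals = []
    interval = start_s
    for prev, curr in zip(samples, samples[1:]):
        interval = next_interval(interval, relative_change(prev, curr))
        intervals.append(interval)
    return intervals


if __name__ == "__main__":
    # Hypothetical CPU utilisation readings (percent) with one sudden jump.
    cpu = [40.0, 41.0, 42.0, 80.0, 82.0]
    print(simulate(cpu, start_s=10.0))
```

In this sketch the interval grows while the readings are steady and shrinks as soon as the jump appears, which is the trade-off between monitoring accuracy and resource consumption that the abstract describes.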
Funding
This work was supported in part by grants from the National Natural Science Foundation of China (Grant Nos. 61672392 and 61373038) and the National Key Research and Development Program of China (No. 2016YFC1202204).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
About this article
Cite this article
Wang, R., Ying, S., Li, M. et al. HSACMA: a hierarchical scalable adaptive cloud monitoring architecture. Software Qual J 28, 1379–1410 (2020). https://doi.org/10.1007/s11219-020-09524-z
DOI: https://doi.org/10.1007/s11219-020-09524-z