HSACMA: a hierarchical scalable adaptive cloud monitoring architecture

Software Quality Journal

Abstract

Cloud monitoring is a key technology for tracking the status and availability of the resources and services in an infrastructure. However, it faces many challenges arising from inefficient monitoring capability and high resource consumption. We study adaptive monitoring for cloud computing platforms and focus on the problem of balancing monitoring capability against resource consumption. We propose HSACMA, a hierarchical scalable adaptive monitoring architecture, that (1) monitors the physical and virtual infrastructure at the infrastructure layer, the middleware running at the platform layer, and the application services at the application layer; (2) achieves scalable monitoring based on microservices; and (3) adaptively adjusts the monitoring interval and data transmission strategy according to the running state of the cloud computing system. We also present a case study of a real production system deployed and running on the CloudStack cloud computing platform to verify the effectiveness of applying the architecture in practice. The results show that HSACMA can guarantee the accuracy and real-time performance of monitoring while reducing resource consumption.
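
The abstract does not say how the monitoring interval is adjusted, so the following is only an illustrative sketch of the general idea in point (3), not the algorithm used in HSACMA. It assumes a simple rule: sample more often when a metric is changing quickly and less often when it is stable. The function name `next_interval`, the thresholds, and the halving/doubling rule are all hypothetical.

```python
def next_interval(current, previous_value, new_value,
                  min_interval=1.0, max_interval=60.0,
                  change_threshold=0.10):
    """Return the next sampling interval in seconds.

    Hypothetical rule (an assumption, not taken from the paper): halve the
    interval when the metric's relative change exceeds change_threshold,
    otherwise double it. The result is clamped to
    [min_interval, max_interval].
    """
    # Guard against division by zero when the previous reading is 0.
    baseline = max(abs(previous_value), 1e-9)
    relative_change = abs(new_value - previous_value) / baseline

    if relative_change > change_threshold:
        proposed = current / 2   # volatile: monitor more frequently
    else:
        proposed = current * 2   # stable: back off to save resources

    return min(max(proposed, min_interval), max_interval)


if __name__ == "__main__":
    # Made-up CPU utilisation readings (percent), for illustration only.
    readings = [50.0, 51.0, 50.5, 80.0, 95.0, 94.0]
    interval = 10.0
    for previous, new in zip(readings, readings[1:]):
        interval = next_interval(interval, previous, new)
        print(f"{previous:5.1f} -> {new:5.1f}: next sample in {interval:.1f} s")
```

A rule of this kind trades monitoring accuracy against overhead: a stable system is sampled rarely, which lowers resource consumption, while a volatile one is sampled often, which keeps the monitoring data timely.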

Funding

This work is supported in part by the grants of the National Natural Science Foundation of China (Grant Nos. 61672392 and 61373038) and National Key Research and Development Program of China (No. 2016YFC1202204).

Author information

Corresponding author

Correspondence to Shi Ying.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 7 Abbreviations

About this article

Cite this article

Wang, R., Ying, S., Li, M. et al. HSACMA: a hierarchical scalable adaptive cloud monitoring architecture. Software Qual J 28, 1379–1410 (2020). https://doi.org/10.1007/s11219-020-09524-z

  • DOI: https://doi.org/10.1007/s11219-020-09524-z
