Abstract
This study aimed to develop a dynamic benchmark automation suite that measures a range of benchmark performance and evaluates various high-performance computing (HPC) systems. Our suite supports automated scaling tests and collects profiling data from hardware performance counters to analyze system characteristics. We selected four HPC benchmarks (STREAM, High-Performance Linpack, High-Performance Conjugate Gradient, and the NAS Parallel Benchmarks) for the experiments and configured testbeds based on five different systems and an Intel Knights Landing (KNL) cluster with 16 nodes. The Intel KNL system showed unstable memory behavior yet high benchmark performance for a specific input range. The modern Intel systems also exhibited favorable characteristics on compute-intensive workloads, whereas the up-to-date AMD system showed high efficiency and favorable characteristics on memory-intensive and real-application workloads. We also verified that each system has an optimal environment and characteristic for various combinations of experimental variables and profiling data.
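The paper's suite itself is not reproduced here, but the automated scaling test it describes can be sketched as a driver that enumerates every combination of experimental variables and treats each combination as one benchmark run to launch and profile. The function name, the variable names, and the example values (benchmark list, power-of-two node counts, problem sizes) below are illustrative assumptions, not the authors' actual implementation:

```python
from itertools import product

def scaling_test_plan(benchmarks, node_counts, problem_sizes):
    """Enumerate every combination of experimental variables.

    Each returned entry represents one benchmark run that an automation
    suite would launch (e.g. via an MPI job script) and profile with
    hardware performance counters.
    """
    return [
        {"benchmark": b, "nodes": n, "size": s}
        for b, n, s in product(benchmarks, node_counts, problem_sizes)
    ]

# Hypothetical sweep: 4 benchmarks, power-of-two node counts up to 16,
# and two problem sizes -> 4 * 5 * 2 = 40 runs in total.
plan = scaling_test_plan(
    ["STREAM", "HPL", "HPCG", "NPB"],
    [1, 2, 4, 8, 16],
    ["small", "large"],
)
print(len(plan))  # 40
```

Separating the plan generation from job launching keeps the sweep reproducible: the same plan can be replayed on each testbed so results differ only in the system under test.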
Acknowledgements
This work was partly supported by the Creative Allied Project (CAP) grant funded by the Korean government (MSIT) (No. G-19-GT-CU01) and by a Korea Institute of Science and Technology Information (KISTI) grant (No. K-19-L02-C06-S01).
Cite this article
Rho, S., Park, G., Choi, J.E. et al. Development of benchmark automation suite and evaluation of various high-performance computing systems. Cluster Comput 24, 159–179 (2021). https://doi.org/10.1007/s10586-020-03167-2