Optimizing UPC Programs for Multi-Core Systems
Scientific Programming Pub Date : 2010 , DOI: 10.3233/spr-2010-0310
Yili Zheng

The Partitioned Global Address Space (PGAS) model of Unified Parallel C (UPC) can help users express and manage application data locality on non-uniform memory access (NUMA) multi-core shared-memory systems, which is key to good performance. First, we describe several UPC program optimization techniques that are important for achieving good performance on NUMA multi-core computers, illustrating them with examples and quantitative performance results. Second, we use two numerical computing kernels, parallel matrix–matrix multiplication and parallel 3-D FFT, to demonstrate end-to-end development and optimization of UPC applications. Our results show that the optimized UPC programs achieve good, scalable performance on current multi-core systems and can even outperform vendor-optimized libraries in some cases.
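To make the data-locality idea concrete, the sketch below emulates an owner-computes row partitioning for matrix–matrix multiplication in plain C. It is not UPC code and not the paper's implementation: the function names (`rows_per_thread`, `owner_of_row`, `matmul_owned_rows`) and the blocked-by-rows layout are assumptions chosen for illustration. The pattern is roughly analogous to giving each UPC thread a contiguous block of a shared array and using the affinity expression of `upc_forall` so that each thread updates only the rows it owns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rows assigned to each thread under a blocked
   layout, using ceiling division so every row has an owner. */
size_t rows_per_thread(size_t n, size_t threads) {
    assert(n > 0 && threads > 0);
    return (n + threads - 1) / threads;
}

/* Hypothetical helper: index of the thread that owns a given row. */
size_t owner_of_row(size_t row, size_t n, size_t threads) {
    return row / rows_per_thread(n, threads);
}

/* Compute only the rows of C = A * B owned by thread `me`.
   All matrices are n x n, stored row-major. Rows owned by other
   threads are left untouched. */
void matmul_owned_rows(size_t me, size_t threads, size_t n,
                       const double *a, const double *b, double *c) {
    size_t blk = rows_per_thread(n, threads);
    size_t lo = me * blk;
    size_t hi = lo + blk;
    if (lo > n) lo = n;   /* trailing threads may own no rows */
    if (hi > n) hi = n;   /* last owning thread may get a short block */

    for (size_t i = lo; i < hi; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Calling `matmul_owned_rows` once per thread index, from 0 up to `threads - 1`, fills the whole result. In a real PGAS program each call would run on a different thread, and the point of the partitioning is that the rows of `c` a thread writes would reside in memory local to that thread.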

Updated: 2020-09-25