Implementation and Performance Modeling of Deterministic Particle Transport (Sweep3D) on the IBM Cell/B.E.
Scientific Programming Pub Date : 2009 , DOI: 10.3233/spr-2009-0266
Olaf Lubeck, Michael Lang, Ram Srinivasan, Greg Johnson

The IBM Cell Broadband Engine (BE) is a novel multi-core chip with the potential to deliver the demanding floating-point performance required for high-fidelity scientific simulations. However, data movement within the chip can be a major obstacle to realizing the benefits of the peak floating-point rates. In this paper, we present the results of implementing Sweep3D on the Cell/B.E. using an intra-chip message-passing model that minimizes data movement. We compare the advantages and disadvantages of this programming model with those of a previous implementation that used a master–worker threading strategy. We apply a previously validated micro-architecture performance model (based on our previous work in Monte Carlo performance models) to the application executing on the Cell/B.E.; the model predicts overall CPI (cycles per instruction) and gives a detailed breakdown of processor stalls. Finally, we use the micro-architecture model to assess the performance impact of future design parameters for the Cell/B.E. micro-architecture. The methodologies and results have broader implications that extend to multi-core architectures in general.
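
As a rough illustration of what a CPI prediction with a stall breakdown expresses, the sketch below divides cycle counts by the instruction count so that each category's contribution sums to the overall CPI. It is not the authors' model: the function name `cpi_breakdown`, the stall category names, and all counts are hypothetical and chosen only to show the arithmetic.

```python
def cpi_breakdown(instructions, issue_cycles, stall_cycles):
    """Return (overall CPI, per-category CPI contributions).

    instructions  -- number of instructions executed
    issue_cycles  -- cycles spent issuing instructions without stalling
    stall_cycles  -- dict mapping a stall category name to its cycle count
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    # Each category contributes (its cycles) / (instructions) to the CPI.
    contributions = {"issue": issue_cycles / instructions}
    for name, cycles in stall_cycles.items():
        contributions[name] = cycles / instructions
    return sum(contributions.values()), contributions


# Hypothetical counts, not measurements from the paper.
total, parts = cpi_breakdown(
    instructions=1000,
    issue_cycles=600,
    stall_cycles={"dependency": 300, "branch": 100, "local_store": 200},
)
for name, value in parts.items():
    print(f"{name:12s} {value:.2f} cycles/instruction ({value / total:.0%} of CPI)")
print(f"overall CPI  {total:.2f}")
```

Expressing every stall source in cycles per instruction makes it straightforward to ask what-if questions, such as how the overall CPI would change if one category were reduced by a design change.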

Updated: 2020-09-25