Distributed Interactive Visualization Using GPU-Optimized Spark
IEEE Transactions on Visualization and Computer Graphics (IF 5.2), Pub Date: 2020-04-27, DOI: 10.1109/tvcg.2020.2990894
Sumin Hong, Junyoung Choi, Won-Ki Jeong

With advances in imaging and computing technologies, large-scale data acquisition and processing have become commonplace in many science and engineering disciplines. Conventional workflows for large-scale data processing usually rely on in-house or commercial software that is designed for domain-specific computing tasks. Recent advances in MapReduce, which was originally developed for batch processing of textual data via a simplified programming model of map and reduce functions, have expanded its applications to more general tasks in big-data processing, such as scientific computing and biomedical image processing. However, as shown in previous work, volume rendering and visualization using MapReduce is still considered challenging and impractical owing to the disk-based, batch-processing nature of its computing model. In this article, contrary to this common belief, we show that the MapReduce computing model can be effectively used for interactive visualization. Our proposed system is a novel extension of Spark, one of the most popular open-source MapReduce frameworks, which offers GPU-accelerated MapReduce computing. To minimize CPU-GPU communication and overcome slow, disk-based shuffle performance, the proposed system supports GPU in-memory caching and MPI-based direct communication between compute nodes. To allow for GPU-accelerated in-situ visualization using raster graphics in Spark, we leveraged CUDA-OpenGL interoperability, resulting in processing speeds several orders of magnitude faster than those of conventional MapReduce systems. We demonstrate the performance of our system via several volume processing and visualization tasks, such as direct volume rendering, iso-surface extraction, and numerical simulations with in-situ visualization.
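
To make the map and reduce programming model concrete for volume rendering, the following is a minimal sketch in plain Python. It is not the authors' system and uses no Spark, GPU, or MPI calls. It assumes a single ray through a volume split into depth-ordered bricks; the names `over`, `render_brick`, and `composite` are hypothetical and chosen only for illustration. The map step composites the samples inside each brick, and the reduce step merges the per-brick results with the associative front-to-back "over" operator.

```python
from functools import reduce


def over(front, back):
    """Front-to-back 'over' operator on (premultiplied_color, alpha) pairs."""
    fc, fa = front
    bc, ba = back
    return (fc + (1.0 - fa) * bc, fa + (1.0 - fa) * ba)


def render_brick(brick):
    """Map step: composite one brick's samples along a single ray.

    `brick` is (depth_index, [(color, alpha), ...]) with samples ordered
    front to back; the input colors are not premultiplied by alpha.
    """
    depth, samples = brick
    premultiplied = [(c * a, a) for c, a in samples]
    return depth, reduce(over, premultiplied, (0.0, 0.0))


def composite(partials):
    """Reduce step: merge per-brick results in increasing depth order."""
    ordered = [p for _, p in sorted(partials, key=lambda kv: kv[0])]
    return reduce(over, ordered, (0.0, 0.0))


# Illustrative data only: two bricks listed out of depth order.
bricks = [
    (1, [(0.2, 0.5), (0.9, 1.0)]),
    (0, [(1.0, 0.25), (0.5, 0.5)]),
]
partials = list(map(render_brick, bricks))
pixel = composite(partials)  # (premultiplied color, accumulated alpha)
```

Because the "over" operator is associative, each brick can be rendered independently and the partial results merged afterwards, as long as the merge respects depth order. That property is what lets this kind of rendering fit a map and reduce decomposition.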

Updated: 2020-04-27