Parameterized Specification, Configuration and Execution of Data-Intensive Scientific Workflows.
Cluster Computing (IF 4.4), Pub Date: 2010-09-01, DOI: 10.1007/s10586-010-0133-8
Vijay S Kumar, Tahsin Kurc, Varun Ratnakar, Jihie Kim, Gaurang Mehta, Karan Vahi, Yoonju Lee Nelson, P Sadayappan, Ewa Deelman, Yolanda Gil, Mary Hall, Joel Saltz

Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multidimensional parameter space consisting of input performance parameters to the applications that are known to affect their execution times. While some performance parameters such as grouping of workflow components and their mapping to machines do not affect the accuracy of the analysis, others may dictate trading the output quality of individual components (and of the whole workflow) for performance. This paper describes an integrated framework which is capable of supporting performance optimizations along multiple such parameters. Using two real-world applications in the spatial, multidimensional data analysis domain, we present an experimental evaluation of the proposed framework.

Updated: 2019-11-01