Scalable and fault tolerant orthogonalization based on randomized distributed data aggregation.
Journal of Computational Science (IF 3.3), Pub Date: 2013-02-15, DOI: 10.1016/j.jocs.2013.01.006
Wilfried N. Gansterer, Gerhard Niederbrucker, Hana Straková, Stefan Schulze Grotthoff

This paper investigates the construction of distributed algorithms for matrix computations on top of distributed data aggregation algorithms with randomized communication schedules. For this purpose, a new aggregation algorithm for summing or averaging distributed values, the push-flow algorithm, is developed; it is more resilient to failures than existing aggregation methods. It is illustrated that, on a hypercube topology, the algorithm asymptotically requires the same number of iterations as the optimal all-to-all reduction operation and that it scales well with the number of nodes. Orthogonalization is studied as a prototypical matrix computation task. A new fault-tolerant distributed orthogonalization method, rdmGS, is built on top of distributed data aggregation algorithms and can produce accurate results even in the presence of node failures.
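To make the idea of averaging distributed values with a randomized communication schedule concrete, the following is a minimal sketch of push-sum gossip on a hypercube. It is not the push-flow algorithm from the paper, whose details the abstract does not give; push-sum is a well-known existing aggregation method used here only for illustration. The function names `hypercube_neighbors` and `push_sum_average`, the synchronous round model, and the fixed seed are all assumptions of this sketch.

```python
import random


def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Neighbors of `node` in a dim-dimensional hypercube (flip one bit each)."""
    return [node ^ (1 << bit) for bit in range(dim)]


def push_sum_average(values, dim, rounds, seed=0):
    """Estimate the global average at every node using push-sum gossip.

    Illustrative only: synchronous rounds, no failures, hypothetical API.
    """
    rng = random.Random(seed)
    n = 1 << dim
    assert len(values) == n, "need exactly 2**dim initial values"
    s = [float(v) for v in values]  # value mass held by each node
    w = [1.0] * n                   # weight mass held by each node
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Randomized schedule: pick one hypercube neighbor per round.
            target = rng.choice(hypercube_neighbors(i, dim))
            half_s, half_w = s[i] / 2, w[i] / 2
            # Keep half of the local mass and push the other half.
            new_s[i] += half_s
            new_w[i] += half_w
            new_s[target] += half_s
            new_w[target] += half_w
        s, w = new_s, new_w
    # Each node's local estimate of the average is value mass / weight mass.
    return [si / wi for si, wi in zip(s, w)]


if __name__ == "__main__":
    estimates = push_sum_average(list(range(16)), dim=4, rounds=300)
    print(min(estimates), max(estimates))
```

Plain push-sum relies on the total value and weight mass being conserved, so a lost message permanently biases every node's estimate. The failure resilience that the abstract attributes to push-flow is not modeled in this sketch.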




Updated: 2013-02-15