Hybrid-DCA: A double asynchronous approach for stochastic dual coordinate ascent.
Journal of Parallel and Distributed Computing (IF 3.8), Pub Date: 2020-04-13, DOI: 10.1016/j.jpdc.2020.04.002
Soumitra Pal, Tingyang Xu, Tianbao Yang, Sanguthevar Rajasekaran, Jinbo Bi

In prior work, stochastic dual coordinate ascent (SDCA) has been parallelized either in multi-core environments, where the cores communicate through shared memory, or in multi-processor distributed-memory environments, where the processors communicate through message passing. In this paper, we propose a hybrid SDCA framework for multi-core clusters, the most common high-performance computing environment, which consists of multiple nodes, each with multiple cores and its own shared memory. We distribute the data across the nodes; each node solves a local problem in an asynchronous parallel fashion on its cores, and the local updates are then aggregated via an asynchronous across-node update scheme. The proposed double asynchronous method converges to a global solution for L-Lipschitz continuous loss functions, and at a linear rate if a smooth convex loss function is used. Extensive empirical comparison shows that our algorithm scales better than the best known shared-memory methods and runs faster than previous distributed-memory methods. Big datasets, such as a 280 GB dataset from the LIBSVM repository, cannot be accommodated on a single node and hence cannot be solved by a single-node parallel algorithm. For such a dataset, our hybrid algorithm takes less than 30 s to reach a duality gap of 10⁻⁵ on 16 nodes with 12 cores each, which is significantly faster than the best known distributed algorithms, such as CoCoA+, which take more than 160 s on the same 16 nodes.
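For context on the stopping criterion: the 10⁻⁵ figure refers to the duality gap, the standard optimality certificate for SDCA-type methods. A sketch of the usual regularized ERM formulation, under assumed notation that may differ from the paper's in details:

```latex
% Standard SDCA setup (assumed notation): primal P, dual D, and the
% duality gap used as the stopping criterion in the experiments.
P(w) = \frac{1}{n}\sum_{i=1}^{n} \phi_i(x_i^{\top} w) + \frac{\lambda}{2}\lVert w \rVert^{2},
\qquad
D(\alpha) = \frac{1}{n}\sum_{i=1}^{n} -\phi_i^{*}(-\alpha_i)
          - \frac{\lambda}{2}\Bigl\lVert \frac{1}{\lambda n}\sum_{i=1}^{n} \alpha_i x_i \Bigr\rVert^{2},
% with the primal iterate recovered from the dual variables as
w(\alpha) = \frac{1}{\lambda n}\sum_{i=1}^{n} \alpha_i x_i,
\qquad
\mathrm{gap}(\alpha) = P\bigl(w(\alpha)\bigr) - D(\alpha)
                     \ge P\bigl(w(\alpha)\bigr) - P(w^{\star}).
```

By weak duality the gap upper-bounds the primal suboptimality, so a gap below 10⁻⁵ certifies a near-optimal solution regardless of how asynchronously the updates were applied.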
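The sketch below illustrates the double-asynchronous structure described above in a single Python process: threads stand in for the cores of one node performing lock-free shared-memory SDCA updates, and Node objects stand in for cluster nodes whose accumulated updates are merged into the global iterate as they arrive rather than at a barrier. The hinge loss, the closed-form coordinate step, and all names (Node, sdca_step, local_solve) are illustrative assumptions; the authors' actual message-passing implementation is not reproduced here.

```python
import threading
import numpy as np

def sdca_step(i, X, y, alpha, w, lam, n):
    """Closed-form SDCA coordinate step for the L2-regularized hinge loss
    (an assumed instance; the paper covers general Lipschitz/smooth losses).
    Maintains w = (1/(lam*n)) * sum_i alpha_i * y_i * x_i as alpha changes."""
    xi, yi = X[i], y[i]
    resid = 1.0 - yi * xi.dot(w)        # hinge residual at the (possibly stale) w
    new_ai = np.clip(alpha[i] + lam * n * resid / (xi.dot(xi) + 1e-12), 0.0, 1.0)
    delta = new_ai - alpha[i]
    alpha[i] = new_ai
    w += (delta * yi / (lam * n)) * xi  # in-place, deliberately unsynchronized

class Node:
    """Stands in for one cluster node holding a partition of the data."""
    def __init__(self, X, y, n_total, lam):
        self.X, self.y = X, y
        self.alpha = np.zeros(len(y))   # dual variables for the local partition
        self.lam, self.n = lam, n_total

    def local_solve(self, w_global, n_cores=4, epochs=1):
        """Cores (threads) run lock-free, Hogwild-style SDCA updates on a
        node-local copy of w; returns the accumulated change for aggregation."""
        w_local = w_global.copy()
        m = len(self.y)
        def worker(seed):
            rng = np.random.default_rng(seed)
            for _ in range(max(1, epochs * m // n_cores)):
                i = int(rng.integers(m))
                sdca_step(i, self.X, self.y, self.alpha, w_local, self.lam, self.n)
        threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_cores)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return w_local - w_global

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, K, lam = 2000, 50, 4, 1e-3
    X = rng.standard_normal((n, d))
    y = np.sign(X @ rng.standard_normal(d))
    parts = np.array_split(np.arange(n), K)  # distribute data across K simulated nodes
    nodes = [Node(X[p], y[p], n, lam) for p in parts]
    w = np.zeros(d)
    for r in range(20):
        # Real nodes run concurrently and their deltas arrive asynchronously;
        # here we visit them in random order and merge each delta on arrival.
        # (w is linear in alpha, so summing deltas keeps w consistent with the
        # dual variables even though each was computed against a stale w.)
        for k in rng.permutation(K):
            w += nodes[k].local_solve(w)
        print(f"round {r:2d}  train accuracy {np.mean(np.sign(X @ w) == y):.3f}")
```

The random visiting order and the staleness of w_local only mimic out-of-order arrival of node updates; the paper's actual scheme additionally controls staleness and aggregation to obtain the stated convergence guarantees.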




Updated: 2020-04-13