Communication-efficient distributed large-scale sparse multinomial logistic regression
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2020-12-13, DOI: 10.1002/cpe.6148
Dajiang Lei 1, Jie Huang 1, Hao Chen 1, Jie Li 1, Yu Wu 2

Sparse multinomial logistic regression (SMLR) is widely used in image classification and text classification because it performs feature selection and produces probabilistic output. However, the traditional SMLR algorithm cannot meet the memory and running-time demands of big data, so a new distributed solution algorithm is needed. The existing distributed SMLR algorithm has shortcomings in its network strategy and cannot make full use of the computing resources of current high-performance clusters. We therefore propose communication-efficient sparse multinomial logistic regression (CESMLR), which adopts an efficient network strategy in which each node solves an SMLR subproblem and which supports a large number of data partitions, taking full advantage of the cluster's computing resources to solve SMLR efficiently. Experimental results on big data show that our algorithm outperforms state-of-the-art algorithms. CESMLR is suitable for tasks with high-dimensional features and consumes less running time while maintaining high classification accuracy.
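
The abstract does not spell out the SMLR objective, so the following is only a minimal single-machine sketch of the idea behind "feature selection and probabilistic output". It assumes the common formulation of SMLR as softmax regression with an L1 penalty, solved here by a plain proximal-gradient step. This is not the CESMLR algorithm and may not be the solver the authors use; the names `softmax`, `soft_threshold`, and `prox_grad_step` are hypothetical and chosen for illustration.

```python
import math

def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero by t, clipping at zero."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def prox_grad_step(W, X, y, lam, step):
    """One proximal-gradient step on mean softmax loss + lam * ||W||_1.

    W: list of K weight vectors (one per class), each of length d.
    X: list of n feature vectors of length d; y: list of n labels in 0..K-1.
    (Assumed layout for this sketch only.)
    """
    K, d, n = len(W), len(W[0]), len(X)
    grad = [[0.0] * d for _ in range(K)]
    for x, label in zip(X, y):
        p = softmax([sum(wj * xj for wj, xj in zip(W[k], x)) for k in range(K)])
        for k in range(K):
            err = p[k] - (1.0 if k == label else 0.0)
            for j in range(d):
                grad[k][j] += err * x[j] / n
    # Gradient step on the smooth loss, then shrinkage for the L1 term.
    return [[soft_threshold(W[k][j] - step * grad[k][j], step * lam)
             for j in range(d)] for k in range(K)]

# Toy data: feature 0 separates the two classes, feature 1 is always zero.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [0, 1]
W1 = prox_grad_step([[0.0, 0.0], [0.0, 0.0]], X, y, lam=0.1, step=1.0)
```

In this toy run the weights on the uninformative second feature stay exactly zero, which is the feature-selection effect of the L1 term, while `softmax` supplies the per-class probabilities. A distributed method would split `X` and `y` across nodes; how CESMLR coordinates those nodes is not described in the abstract and is not modeled here.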

Updated: 2020-12-13