Collaborative Filtering Recommendation Using Nonnegative Matrix Factorization in GPU-Accelerated Spark Platform
Scientific Programming ( IF 1.672 ) Pub Date : 2021-01-05 , DOI: 10.1155/2021/8841133
Bing Tang 1 , Linyao Kang 1 , Li Zhang 1 , Feiyan Guo 1 , Haiwu He 2

Nonnegative matrix factorization (NMF) has been introduced as an efficient way to reduce the complexity of data compression, valued for its capability of extracting highly interpretable parts from data sets, and it has been applied in fields such as recommendation, image analysis, and text clustering. However, NMF becomes very slow as the size of the matrix increases. To solve this problem, this paper proposes a GPU-based parallel algorithm for NMF on the Spark platform, which makes full use of in-memory computation and GPU acceleration. The new GPU-accelerated NMF on Spark is evaluated on a 4-node heterogeneous Spark cluster on Google Compute Engine, with each node configured with an NVIDIA K80 CUDA device, and experimental results indicate that it is competitive in computational time with existing solutions across a variety of matrix orders. Furthermore, a GPU-accelerated NMF-based parallel collaborative filtering (CF) algorithm is proposed, combining the dimensionality reduction and feature extraction of NMF with the multicore parallel computing mode of CUDA. On real MovieLens data sets, experimental results show that the parallelized NMF-based collaborative filtering on the Spark platform outperforms traditional user-based and item-based CF in both processing speed and recommendation accuracy.
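
As a rough illustration of the NMF-based collaborative filtering idea described in the abstract, the following single-machine NumPy sketch factors a small rating matrix V into nonnegative factors W and H and uses W @ H as predicted ratings. It is not the paper's GPU/Spark implementation: the multiplicative update rule (the commonly used Lee-Seung form), the function name `nmf`, the toy `ratings` matrix, and all parameter values are assumptions chosen for illustration, and zeros are treated as actual ratings rather than as missing entries, which is a simplification.

```python
import numpy as np


def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (m x n) as W @ H with W, H >= 0.

    Uses multiplicative updates that minimize the Frobenius error
    ||V - W @ H||. This update rule is an assumption for illustration;
    the abstract does not state which rule the paper uses.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Strictly positive random initialization keeps the factors nonnegative.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical 4-user x 4-item rating matrix (toy data, not MovieLens).
ratings = np.array(
    [
        [5.0, 4.0, 0.0, 1.0],
        [4.0, 5.0, 1.0, 0.0],
        [0.0, 1.0, 5.0, 4.0],
        [1.0, 0.0, 4.0, 5.0],
    ]
)

W, H = nmf(ratings, rank=2)
predicted = W @ H  # dense matrix of predicted ratings
error = np.linalg.norm(ratings - predicted)
```

Each update is dominated by dense matrix products, which is the kind of work that a GPU can accelerate; how the paper partitions these products across Spark nodes and CUDA devices is not described in the abstract.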

Updated: 2021-01-05