Stochastic gradient descent-based support vector machines training optimization on Big Data and HPC frameworks
Concurrency and Computation: Practice and Experience (IF 2) Pub Date: 2021-03-30, DOI: 10.1002/cpe.6292
Vibhatha Abeykoon 1, Geoffrey Fox 1, Minje Kim 1, Saliya Ekanayake 2, Supun Kamburugamuve 1, Kannan Govindarajan 1, Pulasthi Wickramasinghe 1, Niranda Perera 1, Chathura Widanage 1, Ahmet Uyar 1, Gurhan Gunduz 1, Selahatin Akkas 1
The support vector machine (SVM) is a widely used machine learning algorithm. With the ever-growing volume of research data, understanding how to train efficiently is more important than ever. This article discusses the performance optimizations and benchmarks involved in providing high-performance support for SVM training. In this research, we have focused on a highly scalable gradient descent-based approach to implementing the core SVM algorithm. To provide a scalable solution, we have designed optimized high-performance computing and dataflow-oriented SVM implementations; by a high-performance computing approach, we mean that the algorithm is implemented with the bulk synchronous parallel (BSP) model. In addition, we analyzed language-level and math-kernel optimizations in a prominent HPC programming language (C++) and a dataflow-oriented programming language (Java). In the experiments, we compared the performance of classic HPC models, classic dataflow models, and hybrid models built on classic HPC and dataflow programming models. Our research illustrates a scientific approach to designing the SVM algorithm at scale on classic HPC, dataflow, and hybrid systems.
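To make the gradient descent-based core concrete, the sketch below shows sequential stochastic subgradient descent on the L2-regularized hinge loss for a linear SVM, written in C++ (the HPC language the abstract names). This is a minimal illustration, not the authors' implementation: the toy dataset and the hyperparameters (eta, lambda, epochs) are assumptions chosen for readability.

```cpp
// Minimal sketch: sequential SGD on the hinge loss for a linear SVM.
// Objective per sample: lambda/2 * ||w||^2 + max(0, 1 - y * (w.x + b)).
// Dataset and hyperparameters here are illustrative assumptions.
#include <cstddef>
#include <iostream>
#include <vector>

// One labeled sample: feature vector x and label y in {-1, +1}.
struct Sample {
    std::vector<double> x;
    int y;
};

int main() {
    // Tiny 2-D, linearly separable toy dataset (assumed).
    std::vector<Sample> data = {
        {{2.0, 1.0}, +1}, {{1.5, 2.0}, +1},
        {{-1.0, -1.5}, -1}, {{-2.0, -0.5}, -1},
    };

    const std::size_t dim = 2;
    const double eta = 0.1;     // learning rate (assumed)
    const double lambda = 0.01; // L2 regularization strength (assumed)
    const int epochs = 100;

    std::vector<double> w(dim, 0.0); // weight vector
    double b = 0.0;                  // bias term

    for (int epoch = 0; epoch < epochs; ++epoch) {
        for (const Sample& s : data) {
            // Margin of the current model on this sample.
            double margin = 0.0;
            for (std::size_t j = 0; j < dim; ++j) margin += w[j] * s.x[j];
            margin = s.y * (margin + b);

            // Subgradient step: the hinge term is active only when
            // the sample lies inside the margin (margin < 1).
            if (margin < 1.0) {
                for (std::size_t j = 0; j < dim; ++j)
                    w[j] -= eta * (lambda * w[j] - s.y * s.x[j]);
                b += eta * s.y;
            } else {
                for (std::size_t j = 0; j < dim; ++j)
                    w[j] -= eta * lambda * w[j];
            }
        }
    }

    std::cout << "w = (" << w[0] << ", " << w[1] << "), b = " << b << "\n";
    return 0;
}
```

Under a BSP-style parallelization of the kind the abstract describes, each worker would run updates like these on its own data partition, with a collective synchronization step (for example, an allreduce that combines the model vectors) closing each superstep; the dataflow and hybrid variants compared in the paper differ mainly in how that synchronization is expressed.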

Updated: 2021-03-30