On the distributed software architecture of a data analysis workflow: A case study
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2021-07-26, DOI: 10.1002/cpe.6522
Nail Tasgetiren 1, Umit Tigrak 2, Erdal Bozan 2, Guven Gul 2, Emir Demirci 2, Hakan Saribiyik 2, Mehmet S. Aktas 1

Hybrid distributed computing software architectures are gaining importance in data analysis workflows as the number of available underlying machine learning libraries and data storage systems increases. We argue that novel software architecture designs are needed to enable machine learning data analysis workflows to run on top of different subsystem libraries. To address this need, we propose a hybrid distributed software architecture in this manuscript. The proposed architecture manages machine learning models for both supervised and unsupervised machine learning data analysis workflows. To show the usability of the proposed architecture, we implemented a prototype for the banking sector as a case study. The prototype application includes two data analysis workflows: one for predicting customers' loan usage tendency, and one for clustering customers based on their usage patterns of banking loans. The prototype was tested on a large-scale banking dataset. Performance tests were carried out to investigate the responsiveness and scalability of the system. The results demonstrate the usability of the proposed architecture.
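
The idea of running supervised and unsupervised workflows on top of interchangeable subsystem libraries can be illustrated with a small sketch. This is not the authors' implementation: the names `Backend`, `NearestMeanClassifier`, `MeanSplitClusterer`, and `WorkflowManager` are hypothetical, and the two toy backends merely stand in for real machine learning libraries. It only shows, under those assumptions, how a common interface might let one manager dispatch both kinds of workflow.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Sequence

Row = Sequence[float]


def _sq_dist(a: Row, b: Row) -> float:
    """Squared Euclidean distance between two feature rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class Backend(ABC):
    """Hypothetical adapter hiding one subsystem library behind a common API."""

    @abstractmethod
    def fit(self, rows: Sequence[Row], labels: Optional[Sequence[int]] = None) -> None:
        ...

    @abstractmethod
    def predict(self, rows: Sequence[Row]) -> List[int]:
        ...


class NearestMeanClassifier(Backend):
    """Toy supervised backend: predicts the label with the closest class mean."""

    def __init__(self) -> None:
        self._means: Dict[int, List[float]] = {}

    def fit(self, rows: Sequence[Row], labels: Optional[Sequence[int]] = None) -> None:
        if labels is None:
            raise ValueError("a supervised backend needs labels")
        grouped: Dict[int, List[Row]] = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append(row)
        # Column-wise mean of the rows belonging to each label.
        self._means = {
            label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in grouped.items()
        }

    def predict(self, rows: Sequence[Row]) -> List[int]:
        return [
            min(self._means, key=lambda lab: _sq_dist(row, self._means[lab]))
            for row in rows
        ]


class MeanSplitClusterer(Backend):
    """Toy unsupervised backend: two clusters split at the mean row sum."""

    def __init__(self) -> None:
        self._threshold = 0.0

    def fit(self, rows: Sequence[Row], labels: Optional[Sequence[int]] = None) -> None:
        # Labels are ignored: clustering uses the features only.
        self._threshold = sum(sum(row) for row in rows) / len(rows)

    def predict(self, rows: Sequence[Row]) -> List[int]:
        return [1 if sum(row) > self._threshold else 0 for row in rows]


class WorkflowManager:
    """Hypothetical registry that runs a named workflow on its backend."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def run(
        self,
        name: str,
        train_rows: Sequence[Row],
        query_rows: Sequence[Row],
        labels: Optional[Sequence[int]] = None,
    ) -> List[int]:
        backend = self._backends[name]
        backend.fit(train_rows, labels)
        return backend.predict(query_rows)
```

In this sketch, a loan-tendency workflow and a customer-clustering workflow would be registered under separate names and invoked through the same `run` call, so swapping the underlying library would only mean registering a different `Backend` subclass.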

Updated: 2021-07-26