Data Science Methodologies: Current Challenges and Future Approaches
Big Data Research (IF 3.5), Pub Date: 2021-01-06, DOI: 10.1016/j.bdr.2020.100183
Iñigo Martinez, Elisabeth Viles, Igor G. Olaizola

Data science has devoted great research effort to developing advanced analytics, improving data models and cultivating new algorithms. However, few authors have addressed the organizational and socio-technical challenges that arise when executing a data science project: a lack of vision and clear objectives, a biased emphasis on technical issues, a low level of maturity for ad-hoc projects and the ambiguity of roles in data science are among these challenges. Few methodologies that tackle these types of challenges have been proposed in the literature; some of them date back to the mid-1990s and consequently have not been updated to the current paradigm and the latest developments in big data and machine learning technologies. In addition, even fewer methodologies offer a complete guideline across team, project, and data and information management. In this article we explore the need for a more holistic approach to carrying out data science projects. We first review the methodologies that have been presented in the literature for working on data science projects and classify them according to their focus: project, team, or data and information management. Finally, we propose a conceptual framework containing the general characteristics that a methodology for managing data science projects from a holistic point of view should have. This framework can be used by other researchers as a roadmap for designing new data science methodologies or updating existing ones.




Updated: 2021-01-16