Is data science a science? The essence of phenomenon and the role of theory in the emerging field
Kybernetes (IF 2.5), Pub Date: 2021-06-22, DOI: 10.1108/k-03-2021-0205
Pedro Jácome de Moura Jr

Purpose

Data science lacks a distinctive identity and a theory-informed approach, both for its own sake and for its proper application in conjunction with the social sciences. This paper has two purposes: (1) to provide data science with an illustration of theory adoption that addresses explanation and supports prediction and prescription, and (2) to provide a rationale for identifying the key phenomena and properties of data science, so that the data speak through a contextual understanding of reality that is broader than has been usual.

Design/methodology/approach

A literature review and a derived conceptual research model for a push–pull approach (adapted for a data science study in the management field) are presented. A real location–allocation problem is solved with a specific algorithm and explained in the light of the adapted push–pull theory, serving as an instance of a theory-informed data science application in the management field.

Findings

This study advances knowledge in two ways. First, it defines the key phenomena of data science as not just pure “data” but the interrelated properties of data and datasets. Second, it adapts the push–pull theory through a definition, a dimensionality and an interaction model, and illustrates how to apply the theory in theory-informed data science research. The proposed model contributes to the theoretical strengthening of data science, which is still an incipient area. The solution of the location–allocation problem suggests that the proposed approach is applicable to broad data science problems, which alleviates the criticism that data science practice and research lack explanation and focus on pattern recognition.

Research limitations/implications

The proposed algorithm requires a perimeter of interest to be defined in advance. This requirement should be characterised as an antecedent to the model, which is a strong assumption. As for prescription, in this specific case complementary actions are needed, since theory, model and algorithm cannot be detached from in loco visits, market research or interviews with potential stakeholders.

Practical implications

This study offers a conceptual model for the analysis of practical location–allocation problems, based on the push–pull theoretical components. It suggests a proper definition for each component (the object, the perspective, the forces, their degrees and the nature of the movement). The proposed model also has an algorithm for computational implementation, which visually describes and explains the interaction of the components and allows further simulation (with estimated degrees of force) for prediction.
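The abstract does not give the paper's algorithm, so the following is only a minimal sketch of how push–pull components might be encoded for a location–allocation problem. Everything in it is an assumption made for illustration: the names `Force`, `net_force` and `best_location`, the rectangular perimeter, the sign convention (positive degree for pull, negative for push) and the distance attenuation `1 / (1 + d)` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Force:
    """A hypothetical force acting on the object from a fixed point."""
    x: float
    y: float
    degree: float  # assumption: positive = pull (attracts), negative = push (repels)


def net_force(cx, cy, forces):
    """Sum the force degrees at (cx, cy), each attenuated by 1 / (1 + distance)."""
    return sum(f.degree / (1.0 + hypot(f.x - cx, f.y - cy)) for f in forces)


def best_location(candidates, forces, perimeter):
    """Return the candidate inside the perimeter with the highest net force.

    The perimeter is assumed to be a rectangle (xmin, ymin, xmax, ymax)
    and must be supplied in advance.
    """
    xmin, ymin, xmax, ymax = perimeter
    inside = [(x, y) for x, y in candidates
              if xmin <= x <= xmax and ymin <= y <= ymax]
    if not inside:
        raise ValueError("no candidate lies inside the perimeter of interest")
    return max(inside, key=lambda c: net_force(c[0], c[1], forces))


# Illustrative data only: one pull force at the origin, one push force at (10, 0).
forces = [Force(0.0, 0.0, 5.0), Force(10.0, 0.0, -5.0)]
candidates = [(1.0, 0.0), (9.0, 0.0), (50.0, 50.0)]
perimeter = (0.0, -5.0, 10.0, 5.0)
choice = best_location(candidates, forces, perimeter)
print(choice)
```

In this toy setup the candidate nearest the pull force and farthest from the push force wins, and the candidate outside the perimeter is never considered. Changing the estimated degrees and re-running the selection is one simple way to simulate alternative scenarios.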

Originality/value

First, this study identifies an overlap among push–pull theoretical approaches, which suggests that theory adoption may amount to mere common sense, weakening further theoretical development. Second, this study elaborates a definition for the push–pull theory, a dimensionality and a relationship between its components. Third, a typical location–allocation problem is analysed in the light of the refactored theory, showing its adequacy for that class of problems. Fourth, this study suggests that the essence of data science should be the study of contextual relationships among data, and that the context should be provided by spatial, temporal, political, economic and social analytical interests.




Updated: 2021-06-21