Turning biases into hypotheses through method: A logic of scientific discovery for machine learning
Big Data & Society (IF 6.5) | Pub Date: 2021-05-30 | DOI: 10.1177/20539517211020775
Simon Aagaard Enni, Maja Bak Herrie
Machine learning (ML) systems have shown great potential for performing or supporting inferential reasoning through the analysis of large data sets, thereby potentially facilitating more informed decision-making. However, one hindrance to such use of ML systems is that the predictive models created through ML are often complex, opaque, and poorly understood, even when the programs “learning” the models are simple, transparent, and well understood. ML models become difficult to trust because laypeople, specialists, and even researchers have difficulty gauging the reasonableness, correctness, and reliability of the inferences performed. In this article, we argue that bridging this gap in the understanding of ML models and their reasonableness requires a focus on developing an improved methodology for their creation. This process has been likened to “alchemy” and criticized for involving a large degree of “black art,” owing to its reliance on poorly understood “best practices.” We soften this critique and argue that the seeming arbitrariness is often the result of a lack of explicit hypothesizing, stemming from an empiricist and myopic focus on optimizing for predictive performance, rather than of an occult or mystical process. We present some of the problems that result from an excessive focus on optimizing generalization performance at the cost of hypothesizing about data selection and biases. We suggest embedding ML in a general logic of scientific discovery similar to the one presented by Charles Sanders Peirce, and present a recontextualized version of Peirce’s scientific hypothesis adjusted to ML.

