The promise of machine-learning-driven text analysis techniques for historical research: topic modeling and word embedding

Management & Organizational History (IF 1.303), Pub Date: 2023-03-01, DOI: 10.1080/17449359.2023.2181184
Marta Villamor Martin, David A. Kirsch, Fabian Prieto-Nañez

ABSTRACT

Building upon our experience implementing a mixed-method study combining historical and topic modeling techniques to explore how institutional voids are resolved and their relationship to formal/informal markets, we describe the promise of topic modeling techniques for historical studies. Recent advancements – particularly improvements in artificial intelligence and machine learning techniques – have enabled the use of off-the-shelf AI to analyze and process large quantities of data. These techniques reduce research biases and some of the costs previously associated with computational text analysis techniques (i.e., corpus processing time and computational power). We highlight the usefulness of three text analysis techniques – structural topic modeling (STM), dynamic topic modeling (DTM), and word embeddings – and demonstrate their ability to support the generation of novel interpretations. Finally, we emphasize the continuing importance of the author in every step of the research process, especially for abstracting from AI outputs, evaluating competing explanations, inferring meaning, and building theory.

Updated: 2023-03-01
