A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies
Library Hi Tech (IF 1.623), Pub Date: 2021-08-24, DOI: 10.1108/lht-01-2021-0051
Haoran Zhu, Lei Lei

Purpose

Previous research on the automatic extraction of research topics has mostly relied on rule-based or topic modeling methods, which have been criticized for their limited rule sets, interpretability problems and heavy dependence on human judgment. This study addresses these issues by proposing a new method that integrates machine learning models with linguistic features to identify research topics.

Design/methodology/approach

First, dependency relations were used to extract noun phrases from research article texts. Second, the extracted noun phrases were classified into topics and non-topics via machine learning models and linguistic and bibliometric features. Lastly, a trend analysis was performed to identify hot research topics, i.e. topics with increasing popularity.
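
The first and last steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a parser, a label set or a trend statistic, so the `Token` record, the `MODIFIER_DEPS` label set, the hand-written toy parse, the made-up counts and the least-squares slope used as a trend measure are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical token record: index, word form, POS tag, head index, dependency label.
Token = namedtuple("Token", "i text pos head dep")

# Dependency labels treated as noun-phrase-internal modifiers (an assumption).
MODIFIER_DEPS = {"amod", "compound"}


def noun_phrases(tokens):
    """Join each noun head with its direct amod/compound dependents."""
    phrases = []
    for tok in tokens:
        if tok.pos != "NOUN" or tok.dep in MODIFIER_DEPS:
            continue  # skip non-nouns and nouns that only modify another noun
        span = [tok.i] + [t.i for t in tokens
                          if t.head == tok.i and t.dep in MODIFIER_DEPS]
        phrases.append(" ".join(tokens[i].text for i in sorted(span)))
    return phrases


def trend_slope(counts):
    """Least-squares slope of counts over equally spaced periods (needs >= 2 points)."""
    n = len(counts)
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hand-written toy parse of "Severe respiratory infection affects lung function."
sentence = [
    Token(0, "Severe", "ADJ", 2, "amod"),
    Token(1, "respiratory", "ADJ", 2, "amod"),
    Token(2, "infection", "NOUN", 3, "nsubj"),
    Token(3, "affects", "VERB", 3, "ROOT"),
    Token(4, "lung", "NOUN", 5, "compound"),
    Token(5, "function", "NOUN", 3, "obj"),
]
phrases = noun_phrases(sentence)
print(phrases)

# Made-up per-period counts for two candidate topics (illustrative only).
toy_counts = {"vaccine development": [2, 5, 9, 14], "travel ban": [8, 6, 3, 1]}
hot = [topic for topic, counts in toy_counts.items() if trend_slope(counts) > 0]
print(hot)
```

In practice the parse would come from a dependency parser rather than being written by hand, and the middle step (classifying phrases as topics or non-topics) would sit between the two functions shown here.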

Findings

The new method was tested on a large dataset of COVID-19 research articles and achieved satisfactory results in terms of F-measure, accuracy and AUC. Hot topics in COVID-19 research were also detected from the classification results.
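
For readers unfamiliar with the three evaluation measures, the sketch below computes them from their standard definitions for a binary topic/non-topic classifier. The labels, scores and the 0.5 threshold are made-up toy values, not results from the study, and the function names are hypothetical.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = topic, 0 = non-topic.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(f_measure(y_true, y_pred), accuracy(y_true, y_pred), auc(y_true, scores))
```

F-measure and accuracy depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is one reason the three are often reported together.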

Originality/value

This study demonstrates that information retrieval methods can help researchers gain a better understanding of the latest trends in both COVID-19 and other research areas. The findings are significant to both researchers and policymakers.



Updated: 2021-08-24