Validation of the Astro dataset clustering solutions with external data
Scientometrics (IF 3.5), Pub Date: 2020-11-21, DOI: 10.1007/s11192-020-03780-3
Paul Donner

We conduct an independent cluster validation study on published clustering solutions of a research testbed corpus, the Astro dataset of publication records from astronomy and astrophysics. We extend the dataset by collecting external validation data that serve as proxies for the latent structure of the corpus. Specifically, we collect (1) grant funding information related to the publications, (2) data on topical special issues, (3) specific journals’ internal topic classifications, and (4) usage data from the main online bibliographic database of the discipline. The latter three types of data are newly introduced for the purpose of clustering validation, and the rationale for using them for this task is set out. We find that one solution based on the global citation network achieves better results than its competitors across three validation data sources, but that another solution based on bibliographic coupling performs best on the special issues data.

Updated: 2020-11-21