Predicting access mode of multidisciplinary and library and information sciences journals using machine learning, COLLNET Journal of Scientometrics and Information Management

Predicting access mode of multidisciplinary and library and information sciences journals using machine learning
COLLNET Journal of Scientometrics and Information Management, Pub Date: 2022-07-27
Hilary I. Okagbue, Chinyere A. Nzeadibe, Jaime A. Teixeira da Silva

Academics and librarians might want to identify whether a journal is open access (OA) or subscription-based. While indexes and digital libraries might provide such information for known collections, the access mode of a journal or body of journals might be unknown a priori. In this short analysis, a machine learning-based method is used to classify a journal’s access mode, OA or subscription, using its CiteScore and Journal Impact Factor (JIF). From an initial pool of 91 multidisciplinary journals with a CiteScore, 38 journals with both a JIF and a CiteScore were selected (24 OA; 14 subscription). Using a data mining tool (Orange), ten machine learning models were applied: k nearest neighbor (kNN), Tree, support vector machine (SVM), Random forest, Neural network, Naïve Bayes, Logistic regression, Adaptive boosting (Adaboost), Gradient Boosting (Scikit-learn) (GBS), and Gradient Boosting (catboost) (GBC). Adaboost, GBS, and GBC showed the highest (100%) precision, sensitivity, and specificity, classifying the access mode with zero error. These three optimum models were validated by using them to predict the access mode of 54 library and information science (LIS) journals (7 OA; 47 subscription); Adaboost and GBS gave perfect results with no misclassification. With these models, the access mode of multidisciplinary and LIS journals can be accurately predicted using only JIF-CiteScore data. Libraries in low-resource settings will benefit from the implementation of this research by designing a decision support system for the selection of journals.
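The abstract reports precision, sensitivity, and specificity for a two-class problem (OA versus subscription). As a minimal sketch of how these three metrics are computed, the Python below treats OA as the positive class, which is an assumption; the abstract does not say which class the authors treated as positive. The function name `binary_metrics` is hypothetical and is not part of Orange or any library. The class sizes (7 OA, 47 subscription) come from the LIS validation set described above; the single-error case is invented purely for illustration and is not a result from the paper.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="OA"):
    """Return precision, sensitivity (recall), and specificity.

    `positive` names the class treated as the positive label
    (assumed here to be "OA"; the source does not specify this).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),    # TP / (TP + FP)
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
    }


# Class sizes of the LIS validation set reported in the abstract.
y_true = ["OA"] * 7 + ["subscription"] * 47

# No misclassification, as reported for Adaboost and GBS.
perfect = binary_metrics(y_true, list(y_true))

# Hypothetical case (not from the paper): one subscription
# journal is wrongly predicted as OA.
y_one_error = list(y_true)
y_one_error[7] = "OA"
one_error = binary_metrics(y_true, y_one_error)

print(perfect)
print(one_error)
```

With no misclassification all three metrics equal 1.0. In the hypothetical single-error case, precision falls to 7/8 while specificity only falls to 46/47, which shows how strongly one mistake affects the minority class when the classes are as imbalanced as in the LIS set.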


Updated: 2022-07-27
