Modeling the co-citation dependence on semantic layers of co-cited documents
Online Information Review (IF 3.1), Pub Date: 2021-05-12, DOI: 10.1108/oir-04-2020-0126
Maryam Yaghtin, Hajar Sotudeh, Alireza Nikseresht, Mahdieh Mirzabeigi

Purpose

Co-citation frequency, defined as the number of documents co-citing two articles, is considered a quantitative, and thus efficient, proxy of the subject relatedness or prestige of the co-cited articles. Despite its purely quantitative nature, it has been found effective in retrieving and evaluating documents, which suggests a link to the contents of the co-cited documents. To better understand the dynamics of the citation network, the present study investigates the content features that give rise to the measure.
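The definition above can be made concrete with a minimal sketch. It assumes each citing document is available as a list of the works it cites; the function name `co_citation_counts` and the toy data are hypothetical and do not come from the study.

```python
from collections import Counter
from itertools import combinations


def co_citation_counts(reference_lists):
    """Count, for every unordered pair of cited works, how many citing documents cite both."""
    counts = Counter()
    for refs in reference_lists.values():
        # De-duplicate and sort so each pair has one canonical key
        # and is counted at most once per citing document.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: citing document -> works it cites.
citing = {
    "doc1": ["A", "B", "C"],
    "doc2": ["A", "B"],
    "doc3": ["B", "C", "B"],  # the repeated citation of B counts once
}
counts = co_citation_counts(citing)
print(counts[("A", "B")])  # 2: doc1 and doc2 both cite A and B
```

Pairs are stored as sorted tuples, so `("A", "B")` and `("B", "A")` refer to the same count.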

Design/methodology/approach

The present study examined the interaction of different co-citation features in explaining the co-citation frequency. The features include the co-cited works' similarities in their full texts, Medical Subject Headings (MeSH) terms, co-citation proximity, opinions and co-citances. A test collection was built using the CITREC dataset. The data were analyzed using natural language processing (NLP) and opinion mining techniques. A linear model was developed to regress the natural log of the co-citation frequency on the objective and subjective content-based co-citation measures.
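The modeling step can be illustrated with a dependency-free sketch of ordinary least squares on a log-transformed response. This is not the authors' code: the two feature columns (`text_sim`, `mesh_sim`), the coefficients and the synthetic frequencies are assumptions chosen so the fit can be checked by hand.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical similarity scores (text_sim, mesh_sim) for five co-cited pairs.
features = [(0.1, 0.2), (0.4, 0.1), (0.3, 0.9), (0.8, 0.5), (0.6, 0.7)]
# Synthetic, noise-free frequencies built from assumed coefficients 0.5, 2.0, 1.0.
freq = [math.exp(0.5 + 2.0 * t + 1.0 * s) for t, s in features]

log_freq = [math.log(f) for f in freq]  # the response is ln(co-citation frequency)
beta = fit_ols(features, log_freq)
print([round(b, 3) for b in beta])
```

Because the synthetic data contain no noise, the fit recovers the assumed coefficients. An interaction between two measures could be modeled by appending the product of their columns as an extra feature.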

Findings

Both the subjective and the objective dimensions of co-citation similarity play significant roles in predicting co-citation frequency. The model explains about half of the variance in co-citation frequency. The interaction of co-opinionatedness and non-co-opinionatedness is the strongest factor in the model.

Originality/value

This is the first study to reveal that both objective and subjective similarities can significantly predict co-citation frequency. The findings reconfirm the assumption in citation analysis that the cognitive layers of cited documents are connected to citation measures in general and to co-citation frequency in particular.

Peer review

The peer review history for this article is available at https://publons.com/publon/10.1108/OIR-04-2020-0126.




Updated: 2021-05-12