Interrater reliability for multilevel data: A generalizability theory approach.
Psychological Methods (IF 10.929), Pub Date: 2021-04-05, DOI: 10.1037/met0000391
Debby Ten Hove, Terrence D Jorgensen, L Andries van der Ark

Current interrater reliability (IRR) coefficients ignore the nested structure of multilevel observational data, resulting in biased estimates of both subject- and cluster-level IRR. We used generalizability theory to provide a conceptualization and estimation method for IRR of continuous multilevel observational data. We explain how generalizability theory decomposes the variance of multilevel observational data into subject-, cluster-, and rater-related components, which can be estimated using Markov chain Monte Carlo (MCMC) estimation. We explain how IRR coefficients for each level can be derived from these variance components, and how they can be estimated as intraclass correlation coefficients (ICC). We assessed the quality of MCMC point and interval estimates with a simulation study, and showed that small numbers of raters were the main source of bias and inefficiency of the ICCs. In a follow-up simulation, we showed that a planned missing data design can diminish most estimation difficulties in these conditions, yielding a useful approach to estimating multilevel interrater reliability for most social and behavioral research. We illustrated the method using data on student-teacher relationships. All software code and data used for this article is available on the Open Science Framework: https://osf.io/bwk5t/. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
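
As a rough illustration of how an ICC follows from variance components, the sketch below uses the generic generalizability-theory ratio: the variance of the object of measurement divided by that variance plus the error variance, with the error variance divided by the number of raters when ratings are averaged. It is not the article's estimator. The function name `icc`, the choice of which components count as error at each level, and all numeric values are hypothetical; the article defines the actual level-specific coefficients and estimates the components with MCMC.

```python
from typing import Iterable


def icc(object_var: float, error_vars: Iterable[float], n_raters: int = 1) -> float:
    """Generic ICC from variance components (illustrative, not the article's code).

    object_var : variance of the object of measurement (e.g., subjects or clusters)
    error_vars : rater-related variance components treated as error at that level
    n_raters   : number of raters whose ratings are averaged (1 = single rating)
    """
    error_vars = list(error_vars)
    if object_var < 0 or any(v < 0 for v in error_vars):
        raise ValueError("variance components must be non-negative")
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    # Averaging over raters shrinks the error variance by a factor of n_raters.
    error = sum(error_vars) / n_raters
    total = object_var + error
    if total == 0:
        raise ValueError("total variance is zero; ICC is undefined")
    return object_var / total


# Hypothetical variance components, made up for illustration only.
subject_level = {"object": 0.50, "errors": [0.10, 0.40]}
cluster_level = {"object": 0.30, "errors": [0.05, 0.15]}

for name, comps in [("subject", subject_level), ("cluster", cluster_level)]:
    single = icc(comps["object"], comps["errors"], n_raters=1)
    averaged = icc(comps["object"], comps["errors"], n_raters=4)
    print(f"{name}-level ICC: single rater = {single:.3f}, mean of 4 raters = {averaged:.3f}")
```

In a Bayesian workflow, applying such a function to every posterior draw of the variance components would give a posterior distribution for the ICC, from which point and interval estimates can be read off.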

Updated: 2021-04-05