Interrater reliability for multilevel data: A generalizability theory approach.
Psychological Methods (IF 10.929). Pub Date: 2021-04-05. DOI: 10.1037/met0000391
Debby Ten Hove, Terrence D. Jorgensen, L. Andries van der Ark

Current interrater reliability (IRR) coefficients ignore the nested structure of multilevel observational data, resulting in biased estimates of both subject- and cluster-level IRR. We used generalizability theory to provide a conceptualization and estimation method for IRR of continuous multilevel observational data. We explain how generalizability theory decomposes the variance of multilevel observational data into subject-, cluster-, and rater-related components, which can be estimated using Markov chain Monte Carlo (MCMC) estimation. We explain how IRR coefficients for each level can be derived from these variance components, and how they can be estimated as intraclass correlation coefficients (ICCs). We assessed the quality of MCMC point and interval estimates with a simulation study, and showed that small numbers of raters were the main source of bias and inefficiency of the ICCs. In a follow-up simulation, we showed that a planned missing data design can diminish most estimation difficulties in these conditions, yielding a useful approach to estimating multilevel interrater reliability for most social and behavioral research. We illustrated the method using data on student–teacher relationships. All software code and data used for this article are available on the Open Science Framework: https://osf.io/bwk5t/. (PsycInfo Database Record (c) 2021 APA, all rights reserved)
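The abstract says that level-specific IRR coefficients are derived from variance components and expressed as ICCs, but it does not give the formulas. The sketch below shows only the generic generalizability-theory pattern (variance of the objects being rated divided by that variance plus rater-related error variance), applied once per level. The function name, the argument names, and all numeric values are hypothetical illustrations, not the authors' estimator or results; their actual code is on the OSF page linked above.

```python
def icc_from_components(var_target, var_rater, var_residual, n_raters=1):
    """Generic ICC: target variance over target variance plus error variance.

    Assumption (not from the article): the error term is the sum of the
    rater variance and the residual variance, divided by the number of
    raters whose scores are averaged.
    """
    if min(var_target, var_rater, var_residual) < 0:
        raise ValueError("variance components must be non-negative")
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    error = (var_rater + var_residual) / n_raters
    total = var_target + error
    if total == 0:
        raise ValueError("all variance components are zero")
    return var_target / total


# Hypothetical variance components, chosen only to illustrate the arithmetic.
subject_level = {"var_target": 0.5, "var_rater": 0.1, "var_residual": 0.4}
cluster_level = {"var_target": 0.3, "var_rater": 0.05, "var_residual": 0.15}

icc_subject = icc_from_components(**subject_level)
icc_cluster = icc_from_components(**cluster_level)
icc_subject_two_raters = icc_from_components(**subject_level, n_raters=2)
```

With MCMC output, the same function could be evaluated on each posterior draw of the variance components, so that the resulting draws of the ICC yield a point estimate and an interval; whether this matches the article's procedure in detail is an assumption.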

Updated: 2021-04-05