Inter-rater reliability of sleep stage scoring: a meta-analysis, Journal of Clinical Sleep Medicine


Journal of Clinical Sleep Medicine (IF 3.5), Pub Date: 2021-07-26, DOI: 10.5664/jcsm.9538
Yun Ji Lee ₁, Jae Yong Lee ₁, Jae Hoon Cho ₂, Ji Ho Choi ₁

Study Objectives:

We evaluated the inter-rater reliability of manual polysomnography (PSG) sleep stage scoring. We included all studies that employed Rechtschaffen and Kales (R&K) rules or American Academy of Sleep Medicine (AASM) standards. We sought the overall degree of agreement and the degree of agreement for each sleep stage.

Methods:

The search keywords were PSG, sleep staging, R&K, AASM, inter-rater (interscorer) reliability, and Cohen’s kappa. We searched PubMed, OVID Medline, EMBASE, the Cochrane Library, KoreaMed, KISS, and MedRIC. Studies of automatic scoring and of pediatric patients were excluded. We collected data on scorer histories, scoring rules, numbers of epochs scored, and the underlying diseases of the subjects.
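Cohen’s kappa, the agreement statistic named above, corrects the raw proportion of matching epochs for the agreement expected by chance. The abstract does not describe how the included studies computed it, so the following is only a minimal sketch of the standard two-scorer formula, kappa = (p_o - p_e) / (1 - p_e). The function names, the one-vs-rest treatment of a single stage, and the ten-epoch hypnograms are all assumptions made for illustration, not study data.

```python
from collections import Counter


def cohen_kappa(scorer_a, scorer_b):
    """Cohen's kappa for two equal-length sequences of epoch labels."""
    if len(scorer_a) != len(scorer_b) or len(scorer_a) == 0:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(scorer_a)
    # Observed agreement: fraction of epochs given the same label by both scorers.
    p_o = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a, count_b = Counter(scorer_a), Counter(scorer_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


def stage_kappa(scorer_a, scorer_b, stage):
    """Per-stage kappa via one-vs-rest binarization (an assumed approach)."""
    return cohen_kappa(
        [label == stage for label in scorer_a],
        [label == stage for label in scorer_b],
    )


# Hypothetical 10-epoch hypnograms from two scorers (made-up labels).
scorer_a = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
scorer_b = ["W", "N1", "N1", "N2", "N2", "N3", "N3", "N3", "R", "R"]

overall = cohen_kappa(scorer_a, scorer_b)      # approximately 0.75
n1_only = stage_kappa(scorer_a, scorer_b, "N1")
print(f"overall kappa = {overall:.2f}, N1 kappa = {n1_only:.2f}")
```

In this toy example the two scorers agree on 8 of 10 epochs (p_o = 0.8) while chance agreement is 0.2, which gives a kappa of about 0.75.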

Results:

A total of 101 publications were retrieved, of which 11 satisfied the selection criteria. The overall Cohen’s kappa for manual sleep stage scoring was 0.76 (95% confidence interval 0.71 to 0.81, p < 0.001), indicating substantial agreement. By sleep stage, kappa was 0.70, 0.24, 0.57, 0.57, and 0.69 for stages W, N1, N2, N3, and R, respectively. The inter-rater reliabilities for stages N2 and N3 were moderate, and that for stage N1 was only fair.
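The labels "substantial", "moderate", and "fair" used above are consistent with the Landis and Koch benchmark bands for kappa, although the abstract does not name the scale it used. Assuming that convention, the sketch below maps the reported values to their bands. The function name `agreement_band` is hypothetical; the kappa values are copied from the Results.

```python
def agreement_band(kappa):
    """Map a kappa value to its Landis and Koch band (assumed scale)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "poor"
    # Upper bounds of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Kappa values as reported in the Results paragraph.
reported = {"overall": 0.76, "W": 0.70, "N1": 0.24, "N2": 0.57, "N3": 0.57, "R": 0.69}
interpreted = {name: agreement_band(value) for name, value in reported.items()}

for name, label in interpreted.items():
    print(f"{name}: kappa = {reported[name]:.2f} ({label})")
```

Under this assumed scale, stages W and R fall in the same "substantial" band as the overall figure, while N2 and N3 are "moderate" and N1 is "fair", matching the wording of the Results.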

Conclusions:

We conducted a meta-analysis to generalize the variation in manual scoring of PSG and provide reference data for automatic sleep stage scoring systems. The reliability of manual scorers of PSG sleep stages was substantial. However, for certain stages, the results were poor; validity requires improvement.
