Novel quality assessment methodology in focused cardiac ultrasound, Academic Emergency Medicine


Academic Emergency Medicine (IF 4.4), Pub Date: 2022-07-17, DOI: 10.1111/acem.14562
Stephanie DeMasi₁, Lindsay A Taylor₁, Adam Weltler₁, John C Wiggins₁, Jacob Wayman₁, Chen Wang₂, David P Evans₁, Jessica R Balderston₁


Cardiac point-of-care ultrasound (POCUS) has become a fundamental component of the evaluation of patients in the emergency department (ED) for the diagnosis of cardiac pathology.¹ Quality assessment (QA) is one of the six required elements of diagnostic POCUS examinations per the American College of Emergency Physicians (ACEP).² QA is routinely performed to ensure that the standard of patient care is met and to assess competency, particularly at the trainee level. A recent report found that 82% of emergency medicine (EM) residency programs use QA as an assessment tool.³ A reliable scoring system is therefore imperative to limit inconsistencies in how clinical skill in ultrasound is measured.

The grading scale currently endorsed by ACEP for QA was developed from a consensus report of emergency ultrasound leaders that identified the need for a systematic method to report and communicate POCUS findings.² The ACEP grading scale is a nonspecific classification that applies regardless of the type of study performed.² This contrasts with other QA grading systems that have been described in an organ-specific manner. Examples include the focused cardiac ultrasound assessment demonstrated by Kimura et al.⁴ and the focused gynecological emergency ultrasound examination described by Salomon.⁵ The goal of this study was to determine whether a similar, organ-specific grading scale would be a more reliable method of assessment with improved interobserver agreement. Furthermore, we sought to determine whether an organ-specific grading scale would produce more variance in the scores chosen and therefore provide more focused feedback to the sonographers performing the studies.

We conducted a prospective analysis of the first 200 cardiac POCUS studies performed in our ED that were submitted for QA to our image database in 2020. Four reviewers, each either emergency ultrasound fellowship trained or a current emergency ultrasound fellow with at least 9 months of QA experience, scored every study. Two reviewers used the current ACEP grading scale: 1 = no recognizable structures; 2 = minimally recognizable structures but insufficient for diagnosis; 3 = minimal criteria met for diagnosis, recognizable structures but with some technical or other flaws; 4 = minimal criteria met for diagnosis, all structures imaged well; and 5 = minimal criteria met for diagnosis, all structures imaged with excellent image quality.² The other two reviewers used a cardiac-specific grading scale as previously described by Kimura et al.: 0 = no image obtained; 1 = only cardiac motion detected; 2 = chambers and valves grossly resolved; 3 = endocardium and wall thickness seen but incomplete; and 4 = greater than 90% of endocardium and valve motion seen.⁴ The primary outcomes were the level of agreement between the reviewers, which indicates the reliability of the scoring system, and the variability of the scores given to the studies. This study was approved by the VCU institutional review board.
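As a rough illustration, the two scales quoted above could be encoded as lookup tables so that a numeric score always maps back to its written criterion. This is only a sketch: the names `ACEP_SCALE`, `CARDIAC_SCALE`, and `describe` are hypothetical and do not come from the study.

```python
# Hypothetical encoding of the two grading scales quoted in the text.
# The dictionary and function names are illustrative assumptions.
ACEP_SCALE = {
    1: "no recognizable structures",
    2: "minimally recognizable structures but insufficient for diagnosis",
    3: "minimal criteria met for diagnosis, recognizable structures "
       "but with some technical or other flaws",
    4: "minimal criteria met for diagnosis, all structures imaged well",
    5: "minimal criteria met for diagnosis, all structures imaged "
       "with excellent image quality",
}

CARDIAC_SCALE = {
    0: "no image obtained",
    1: "only cardiac motion detected",
    2: "chambers and valves grossly resolved",
    3: "endocardium and wall thickness seen but incomplete",
    4: "greater than 90% of endocardium and valve motion seen",
}


def describe(scale, score):
    """Return the written criterion for a score, rejecting out-of-range values."""
    if score not in scale:
        raise ValueError(f"score {score!r} is not defined on this scale")
    return scale[score]


print(describe(ACEP_SCALE, 5))
print(describe(CARDIAC_SCALE, 4))
```

Keeping the criterion text next to the number makes it easy to return the wording, and not just the score, to the person who performed the study.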

The ACEP score was on a scale of 1–5, while the organ-specific score was on a scale of 0–4. For equal comparison, we added 1 to each organ-specific grade. For the primary outcome, the intraclass correlation coefficient (ICC) based on a two-way random-effects model with a single rater was computed for each grading scale. Ten thousand bootstrapped ICCs were generated to construct 95% confidence intervals (CIs) for both grading systems, and a two-sided one-sample t-test was used to determine whether the bootstrapped ICCs differed between the two grading systems. The ICC between reviewers for the ACEP grading scale was 0.54 (95% CI 0.410–0.555), indicating moderate agreement, while the ICC between reviewers using the organ-specific grading scale was 0.75 (95% CI 0.600–0.769), indicating good agreement. This difference was statistically significant (one-sample t-test, p < 0.0001). The mean (±SD) score was 3.15 (±0.693) for the ACEP grading scale versus 4.16 (±0.967) for the organ-specific grading system on the shifted 1–5 scale. A 95% CI for the variance ratio was constructed to determine whether variability differed between the two grading systems. The variance of scores using the organ-specific grading scale was approximately 1.95 times the variance of scores using the ACEP grading scale. Because the 95% CI of the bootstrapped variance ratio (1.49–2.51) excludes 1, the organ-specific grading scale had significantly more variability than the ACEP scoring system. The score distributions for the ACEP and organ-specific grading scales are summarized in Figure 1.
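The analysis described above can be sketched in Python with NumPy. Several things here are assumptions rather than details from the study: the ICC is computed in the absolute-agreement ICC(2,1) form, the CIs use a percentile bootstrap over studies, the ratings are synthetic stand-ins for the real data, and all function names are hypothetical. The t-test on the bootstrapped ICCs is not reproduced.

```python
import numpy as np


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    `ratings` has one row per study and one column per reviewer.
    The absolute-agreement form is an assumption; the text does not specify it.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between studies
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between reviewers
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def bootstrap_ci(stat, data, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `stat`, resampling studies (rows)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        draws[b] = stat(data[rng.integers(0, n, size=n)])
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lo, hi


def variance_ratio(stacked):
    """Variance of organ-specific scores (cols 2-3) over ACEP scores (cols 0-1)."""
    return np.var(stacked[:, 2:], ddof=1) / np.var(stacked[:, :2], ddof=1)


# Synthetic ratings for 200 studies and two reviewers per scale (not study data).
rng = np.random.default_rng(42)
quality = rng.normal(3.0, 1.0, size=200)[:, None]
acep = np.clip(np.rint(quality + rng.normal(0, 0.8, (200, 2))), 1, 5)
organ_raw = np.clip(np.rint(quality - 1 + rng.normal(0, 0.5, (200, 2))), 0, 4)
organ = organ_raw + 1  # shift the 0-4 scale to 1-5, as in the text

# The demo uses fewer resamples than the 10,000 in the text to stay fast.
print("ACEP ICC:", icc2_1(acep), bootstrap_ci(icc2_1, acep, n_boot=2_000))
print("Organ ICC:", icc2_1(organ), bootstrap_ci(icc2_1, organ, n_boot=2_000))
stacked = np.hstack([acep, organ])
print("Variance ratio CI:", bootstrap_ci(variance_ratio, stacked, n_boot=2_000))

# Arithmetic check on the reported SDs: (0.967 / 0.693)^2 is about 1.947.
reported_ratio = (0.967 / 0.693) ** 2
print("Ratio implied by reported SDs:", round(reported_ratio, 3))
```

Resampling whole studies keeps each pair of reviewer scores together, which is why the variance ratio is bootstrapped on the stacked four-column array. Adding 1 to every organ-specific score changes neither its variance nor the agreement between the two reviewers who used that scale.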

**FIGURE 1** Scores given to studies by grading scale

This study is a quantitative prospective analysis comparing the current ACEP-recommended QA method to an organ-specific method. We found increased interobserver agreement between reviewers and increased variability of the scores given to the studies when using the organ-specific method compared to the ACEP method. This suggests that added detail in the guidance to reviewers led them to use more of the score range. This is particularly important for trainees; they are often the performers of the ultrasound and, as such, the recipients of the QA reports. More direct feedback may expand their knowledge base for use in future studies. For example, the top score on the ACEP grading scale states in a nonspecific manner, “all structures imaged with excellent image quality,” whereas the top score on the cardiac organ-specific grading scale states “greater than 90% of endocardium and valve motion seen.” This allows the learner to understand why the top score was given and which structures are needed to meet this requirement. The opposite is also true: for lower scores, the organ-specific grading scale gives a detailed response and thereby suggests a direction for improvement on subsequent studies. The use of a standardized focused scoring system may also improve the quality of the images themselves. Salomon⁵ demonstrated that the implementation of an organ-specific grading system for gynecological emergency ultrasound may improve the quality of the ultrasound examinations performed.

Our data highlight the importance of QA methodology that promotes objectivity, as demonstrated by high interobserver agreement. The number of POCUS examinations in the ED continues to rise, and with it the need for a standardized method to perform QA that has been thoroughly validated prior to implementation. This single-center study is limited in its generalizability because the images were obtained and reviewed at an academic ED by ultrasound-trained faculty.

This investigation demonstrates that organ-specific quality assessment of focused cardiac ultrasound studies performed in the ED may give the providers performing the studies a wider range of feedback, with increased inter-rater agreement among those doing the reviews. Given these findings, there may be a benefit to moving toward an organ-specific QA scale for POCUS studies obtained in the ED.


Updated: 2022-07-17
