Comparison of “Human” and Artificial Intelligence Hand-and-Wrist Skeletal Age Estimation in an Epiphysiodesis Cohort, The Journal of Bone & Joint Surgery

The Journal of Bone & Joint Surgery (IF 5.3), Pub Date: 2023-02-01, DOI: 10.2106/jbjs.22.00833
Dylan G Kluck, Marina R Makarov, Yassine Kanaan, Chan-Hee Jo, John G Birch

Background:

We previously demonstrated that the White-Menelaus arithmetic formula combined with skeletal age as estimated with the Greulich and Pyle (GP) atlas was the most accurate method for predicting leg lengths and residual leg-length discrepancy (LLD) at maturity in a cohort of patients treated with epiphysiodesis. We sought to determine if an online artificial intelligence (AI)-based hand-and-wrist skeletal age system provided consistent readings and to evaluate how these readings influenced the prediction of the outcome of epiphysiodesis in this cohort.

Methods:

JPEG images of perioperative hand radiographs for 76 subjects were independently submitted by 2 authors to an AI skeletal age web site (http://physis.16bit.ai/). We compared the accuracy of the predicted long-leg length (after epiphysiodesis), short-leg length, and residual LLD with use of the White-Menelaus formula and either human-estimated GP or AI-estimated skeletal age.
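To illustrate why the choice of skeletal age matters in this comparison, the following is a minimal sketch of the White-Menelaus arithmetic. The abstract does not state the formula's constants; the values used here are the commonly cited ones (roughly 3/8 inch per year at the distal femur, 1/4 inch per year at the proximal tibia, and growth ending at skeletal age 16 in boys and 14 in girls) and should be read as assumptions. The function names and the example skeletal age of 13.0 years are hypothetical; the 0.5-year offset mirrors the average difference reported for boys.

```python
# Sketch of White-Menelaus growth-remaining arithmetic.
# All constants are commonly cited values, NOT taken from this abstract.
CM_PER_INCH = 2.54
GROWTH_CM_PER_YEAR = {
    "distal_femur": 0.375 * CM_PER_INCH,   # assumed ~3/8 inch per year
    "proximal_tibia": 0.25 * CM_PER_INCH,  # assumed ~1/4 inch per year
}
MATURITY_AGE_YEARS = {"boy": 16.0, "girl": 14.0}  # assumed end of growth


def years_of_growth_remaining(skeletal_age: float, sex: str) -> float:
    """Years between the skeletal age and the assumed age at maturity."""
    return max(0.0, MATURITY_AGE_YEARS[sex] - skeletal_age)


def growth_remaining_cm(skeletal_age: float, sex: str, physes: list[str]) -> float:
    """Growth the listed physes would still contribute, in centimeters."""
    yearly_rate = sum(GROWTH_CM_PER_YEAR[p] for p in physes)
    return years_of_growth_remaining(skeletal_age, sex) * yearly_rate


# Hypothetical boy: human-read skeletal age 13.0 vs. an AI reading 0.5 year greater.
physes = ["distal_femur", "proximal_tibia"]
human_estimate = growth_remaining_cm(13.0, "boy", physes)
ai_estimate = growth_remaining_cm(13.5, "boy", physes)
print(f"human: {human_estimate:.2f} cm, AI: {ai_estimate:.2f} cm, "
      f"difference: {human_estimate - ai_estimate:.2f} cm")
```

Under these assumptions, a greater skeletal age means fewer years of growth remaining, so the predicted growth (and therefore the predicted leg lengths and residual LLD) shifts with the skeletal age reading. The sketch covers only the growth-remaining step, not the full leg-length prediction used in the study.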

Results:

The AI skeletal age readings had an intraclass correlation coefficient (ICC) of 0.99. AI-estimated skeletal age was generally greater than human-estimated GP skeletal age (average, 0.5 year greater in boys and 0.1 year greater in girls). Overall, the prediction accuracy was improved with AI readings; these differences reached significance for the short-leg and residual LLD prediction errors. Residual LLD was underestimated by ≥1.0 cm in 26 of 76 subjects when human-estimated GP skeletal age was used (range of underestimation, 1.0 to 3.2 cm), compared with only 10 of 76 subjects when AI skeletal age was used (range of underestimation, 1.1 cm to 2.2 cm) (p < 0.01). Residual LLD was overestimated by ≥1.0 cm in 3 of 76 subjects by both methods (range of overestimation, 1.0 to 1.3 cm for the human-estimated GP method and 1.0 to 1.6 cm for the AI method).
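The counts above can be restated as proportions of the cohort. This is only arithmetic on the reported figures; the variable names are illustrative, and the reported p value cannot be reproduced from these marginal counts alone, because the abstract does not say which test was used or how the two methods' errors overlapped within subjects.

```python
# Proportion of the 76 subjects with residual LLD mispredicted by >= 1.0 cm,
# using only the counts reported in the abstract.
COHORT_SIZE = 76

underestimated = {"human_gp": 26, "ai": 10}
overestimated = {"human_gp": 3, "ai": 3}

under_pct = {k: 100.0 * v / COHORT_SIZE for k, v in underestimated.items()}
over_pct = {k: 100.0 * v / COHORT_SIZE for k, v in overestimated.items()}

for method in ("human_gp", "ai"):
    print(f"{method}: underestimated {under_pct[method]:.1f}%, "
          f"overestimated {over_pct[method]:.1f}%")
```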

Conclusions:

The AI method of determining hand-and-wrist skeletal age was highly reproducible in this cohort and improved the accuracy of prediction of leg length and residual discrepancy when compared with traditional human interpretation of the GP atlas. This improvement could be explained by more accurate estimation of skeletal age via a machine-learning AI system calibrated with a large database.

Level of Evidence:

Prognostic Level III. See Instructions for Authors for a complete description of levels of evidence.
