Approval policies for modifications to machine learning‐based software as a medical device: a study of bio‐creep
Biometrics (IF 1.9), Pub Date: 2020-10-11, DOI: 10.1111/biom.13379
Jean Feng, Scott Emerson, Noah Simon

Successful deployment of machine learning algorithms in healthcare requires careful assessment of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning: the ability to learn from a growing dataset and improve over time. This paper frames the design of an approval policy, which we refer to as an automatic algorithmic change protocol (aACP), as an online hypothesis testing problem. As this process has an obvious analogy with noninferiority testing of new drugs, we investigate how repeated testing and adoption of modifications might lead to gradual deterioration in prediction accuracy, also known as "bio-creep" in the drug development literature. We examine simple policies that one might adopt but that do not necessarily offer any error-rate guarantees, as well as policies that do provide error-rate control. For the latter, we define two online error rates appropriate for this context: Bad Approval Count (BAC) and Bad Approval and Benchmark Ratios (BABR). We control these rates in the simple setting of a constant population and data source using policies aACP-BAC and aACP-BABR, which combine alpha-investing, group-sequential, and gate-keeping methods. In simulation studies, bio-creep regularly occurred when using policies with no error-rate guarantees, whereas aACP-BAC and aACP-BABR controlled the rate of bio-creep without substantially impacting our ability to approve beneficial modifications.
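
The bio-creep mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's simulation study and does not implement aACP-BAC or aACP-BABR; it only contrasts a naive policy that tests each modification for noninferiority against the most recently approved model with one that always tests against the original model. The function names, the margin of 0.05, the per-round accuracy loss of 0.01, the sample size, and the treatment of the reference accuracy as known are all assumptions chosen for illustration.

```python
import math
import random


def noninferior(successes_new, n, reference_acc, margin, z_crit=1.645):
    """One-sided z-test of H0: acc_new <= reference_acc - margin.

    The reference accuracy is treated as known, which is a
    simplifying assumption made for this illustration only.
    """
    p_hat = successes_new / n
    se = math.sqrt(max(p_hat * (1 - p_hat), 1e-12) / n)
    return (p_hat - (reference_acc - margin)) / se > z_crit


def simulate(rounds, compare_to_latest, seed=0, n=2000,
             margin=0.05, step=0.01, start_acc=0.90):
    """Return the true accuracy of the approved model after `rounds` proposals.

    Every proposed modification is truly `step` worse than the currently
    approved model (a hypothetical worst case for the approval policy).
    """
    rng = random.Random(seed)
    initial = start_acc
    approved = start_acc
    for _ in range(rounds):
        candidate = approved - step
        # Evaluate the candidate on n fresh test cases.
        successes = sum(rng.random() < candidate for _ in range(n))
        # Naive policy: compare to the latest approved model.
        # Anchored policy: compare to the originally approved model.
        reference = approved if compare_to_latest else initial
        if noninferior(successes, n, reference, margin):
            approved = candidate
    return approved


if __name__ == "__main__":
    print("reference = latest approved:", round(simulate(20, True), 2))
    print("reference = original model: ", round(simulate(20, False), 2))
```

Under these assumptions, the naive policy keeps approving slightly worse models because each one is within the margin of its immediate predecessor, so the losses accumulate. Anchoring the comparison to the original model bounds the total loss at roughly the margin, but it offers none of the online error-rate guarantees the abstract attributes to aACP-BAC and aACP-BABR.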

Updated: 2020-10-11