Active learning for efficiently training emulators of computationally expensive mathematical models.
Statistics in Medicine (IF 1.8), Pub Date: 2020-08-11, DOI: 10.1002/sim.8679
Alexandra G Ellis 1, 2 , Rowan Iskandar 1, 3 , Christopher H Schmid 1, 4 , John B Wong 5 , Thomas A Trikalinos 1

An emulator is a fast-to-evaluate statistical approximation of a detailed mathematical model (simulator). When used in lieu of simulators, emulators can expedite tasks that require many repeated evaluations, such as sensitivity analyses, policy optimization, model calibration, and value-of-information analyses. Emulators are developed using the output of simulators at specific input values (design points). Developing an emulator that closely approximates the simulator can require many design points, which becomes computationally expensive. We describe a self-terminating active learning algorithm to efficiently develop emulators tailored to a specific emulation task, and compare it with algorithms that optimize geometric criteria (random Latin hypercube sampling and maximum projection designs) and with other active learning algorithms (treed Gaussian Processes that optimize typical active learning criteria). We compared the algorithms' root mean square error (RMSE) and maximum absolute deviation from the simulator (MAX) on seven benchmark functions and on a prostate cancer screening model. In the empirical analyses, for simulators whose smoothness varies greatly over the input domain, active learning algorithms produced emulators with smaller RMSE and MAX for the same number of design points. In all other cases, all algorithms performed comparably. The proposed algorithm attained satisfactory performance in all analyses, had smaller variability than the treed Gaussian Processes, and, on average, performed similarly to or better than the treed Gaussian Processes in six of the seven benchmark functions and in the prostate cancer model.
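To make the terms in the abstract concrete, the following is a minimal sketch in standard-library Python of how an emulator might be scored against a simulator using RMSE and MAX. It is not the authors' algorithm. The `simulator` function is a hypothetical cheap stand-in for an expensive model, the piecewise-linear interpolant is only a placeholder for a statistical emulator, and the one-dimensional Latin hypercube sampler and all function names are assumptions made for illustration.

```python
import math
import random
from bisect import bisect_right


def latin_hypercube_1d(n, rng):
    """Return n sorted points in [0, 1]: one uniform draw per equal-width stratum."""
    return sorted((i + rng.random()) / n for i in range(n))


def simulator(x):
    """Hypothetical stand-in for an expensive model (not from the paper)."""
    return math.sin(2 * math.pi * x) + 0.5 * x


def fit_linear_emulator(xs, ys):
    """Placeholder emulator: piecewise-linear interpolant through the design points."""
    def emulator(x):
        # Pick the segment containing x; clamp so that inputs outside the
        # design range are extrapolated from the first or last segment.
        j = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        return ys[j] + t * (ys[j + 1] - ys[j])
    return emulator


def rmse_and_max(emulator, simulator, test_points):
    """Root mean square error and maximum absolute deviation on test_points."""
    errors = [emulator(x) - simulator(x) for x in test_points]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    max_abs = max(abs(e) for e in errors)
    return rmse, max_abs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
design = latin_hypercube_1d(20, rng)  # 20 design points
emulator = fit_linear_emulator(design, [simulator(x) for x in design])
test_points = [i / 200 for i in range(201)]  # dense evaluation grid on [0, 1]
rmse, max_abs = rmse_and_max(emulator, simulator, test_points)
print(f"RMSE={rmse:.4f}  MAX={max_abs:.4f}")
```

An active learning approach would, roughly speaking, choose additional design points where the current emulator appears least reliable instead of fixing the whole design in advance. The stopping rule and selection criterion of the proposed algorithm are not given in the abstract, so they are not sketched here.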

Updated: 2020-10-02