Artificial Intelligence in Medicine (IF 7.5), Pub Date: 2021-04-22, DOI: 10.1016/j.artmed.2021.102080
W Qi, A Abu-Hanna, T E M van Esch, D de Beurs, Y Liu, L E Flinterman, M C Schut
Objectives
Individuals may respond differently to the same treatment, and there is a need to understand such heterogeneity of causal individual treatment effects. We propose and evaluate a modelling approach to better understand this heterogeneity from observational studies by identifying patient subgroups whose response to treatment deviates markedly from the mean response. We illustrate this approach in a primary care case study of the effect of antibiotic (AB) prescription on recovery from acute rhino-sinusitis (ARS).
Methods
Our approach consists of four stages and is applied to a large primary care dataset of 24,392 patients suspected of suffering from ARS. First, we identify pre-treatment variables that either confound the relationship between treatment and outcome or are risk factors of the outcome. Second, based on these pre-treatment variables, we create Synthetic Random Forest (SRF) models to compute the potential outcomes and subsequently the causal individual treatment effect (ITE) estimates. Third, we perform subgroup discovery using the ITE estimates as outcomes to identify positive and negative responders. Fourth, we evaluate the value of the identified subgroups for predicting the outcome in two ways: with the likelihood ratio test, and by checking whether the subgroups are retained by backward stepwise variable selection using the Akaike Information Criterion (AIC). We validate the whole modelling strategy by means of 10-fold cross-validation.
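Stages two and three can be illustrated with a minimal sketch. It is not the authors' implementation: per-stratum recovery rates stand in for the SRF outcome models, an exhaustive scan over single-variable conditions stands in for the subgroup discovery algorithm (which the abstract does not specify), and the covariates `fever` and `elderly` as well as all counts are hypothetical. The ITE estimate for a patient is taken as the difference between the predicted outcome under treatment and the predicted outcome without treatment.

```python
from collections import defaultdict
from statistics import mean

# Synthetic toy counts, made up for illustration (not from the study):
# (fever, elderly, treated) -> (patients, recovered)
CELLS = {
    (1, 0, 0): (100, 40), (1, 0, 1): (100, 70),
    (1, 1, 0): (100, 30), (1, 1, 1): (100, 60),
    (0, 0, 0): (100, 60), (0, 0, 1): (100, 60),
    (0, 1, 0): (100, 50), (0, 1, 1): (100, 40),
}

def expand(cells):
    """Turn the count table into one record per patient."""
    rows = []
    for (fever, elderly, treated), (n, recovered) in cells.items():
        for i in range(n):
            rows.append({"fever": fever, "elderly": elderly,
                         "treated": treated, "recovered": int(i < recovered)})
    return rows

def fit_outcome_model(rows, arm):
    """Recovery rate per covariate stratum within one treatment arm.
    Assumption: this simple model replaces the SRF used in the paper."""
    totals = defaultdict(lambda: [0, 0])  # stratum -> [recovered, patients]
    for r in rows:
        if r["treated"] == arm:
            cell = totals[(r["fever"], r["elderly"])]
            cell[0] += r["recovered"]
            cell[1] += 1
    return {x: rec / n for x, (rec, n) in totals.items()}

def estimate_ite(rows):
    """ITE estimate per patient: predicted outcome if treated minus if untreated."""
    mu0 = fit_outcome_model(rows, 0)
    mu1 = fit_outcome_model(rows, 1)
    return [mu1[(r["fever"], r["elderly"])] - mu0[(r["fever"], r["elderly"])]
            for r in rows]

def subgroup_means(rows, ite, variables=("fever", "elderly")):
    """Mean ITE estimate for every single-condition subgroup (variable == value)."""
    out = {}
    for var in variables:
        for value in (0, 1):
            out[(var, value)] = mean(t for r, t in zip(rows, ite) if r[var] == value)
    return out

rows = expand(CELLS)
ite = estimate_ite(rows)
overall = mean(ite)                          # mean response across all patients
subgroups = subgroup_means(rows, ite)
best = max(subgroups, key=subgroups.get)     # candidate positive responders
worst = min(subgroups, key=subgroups.get)    # candidate negative responders
print(overall, best, subgroups[best], worst, subgroups[worst])
```

In this toy table the `fever == 1` subgroup has the highest mean ITE estimate and `fever == 0` the lowest, which mirrors the idea of positive and negative responders. The sketch ignores confounding adjustment beyond stratification and would need a proper outcome model and search strategy on real data.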
Results
Based on 20 pre-treatment variables, four subgroups (three for positive responders and one for negative responders) were identified. The likelihood ratio tests showed that the subgroups contributed significantly to predicting the outcome. Variable selection using the AIC kept two of the four subgroups, one for positive responders and one for negative responders. As for the validation of the whole modelling strategy, all reported measures (the number of pre-treatment variables associated with the outcome, number of subgroups, number of subgroups surviving variable selection and coverage) showed little variation.
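The two evaluation criteria can be sketched for the simplest possible case, assuming a binary recovery outcome and a single subgroup indicator with no other covariates. This is an illustration under those assumptions, not the models fitted in the study, and the counts in the usage lines are made up. With only an intercept and one indicator, the maximum-likelihood fit of a logistic model reduces to the observed recovery rates, so the log-likelihoods can be written in closed form.

```python
import math

def bernoulli_loglik(successes, n):
    """Maximised log-likelihood of n binary outcomes sharing one rate."""
    if successes in (0, n):
        return 0.0  # the fitted rate is 0 or 1, so every term is log(1)
    p = successes / n
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

def compare_models(in_group, out_group):
    """Compare an intercept-only model with one that adds a subgroup indicator.

    in_group, out_group: (recovered, patients) inside and outside the subgroup.
    """
    s1, n1 = in_group
    s0, n0 = out_group
    ll_null = bernoulli_loglik(s1 + s0, n1 + n0)                   # 1 parameter
    ll_full = bernoulli_loglik(s1, n1) + bernoulli_loglik(s0, n0)  # 2 parameters
    lr_stat = max(2 * (ll_full - ll_null), 0.0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr_stat / 2))
    return {
        "lr_stat": lr_stat,
        "p_value": p_value,
        "aic_null": 2 * 1 - 2 * ll_null,  # AIC = 2k - 2 log L
        "aic_full": 2 * 2 - 2 * ll_full,
    }

# Hypothetical counts: 140 of 200 recover inside the subgroup, 240 of 400 outside.
result = compare_models((140, 200), (240, 400))
keeps_subgroup = result["aic_full"] < result["aic_null"]
print(result, keeps_subgroup)
```

In this one-indicator setting the AIC prefers the larger model exactly when the likelihood ratio statistic exceeds 2, so the two criteria are related but not identical: a subgroup can pass a significance test and still be dropped during stepwise selection among several candidates, or the reverse.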
Conclusions
With the proposed approach, we identified subgroups of positive and negative responders to treatment that markedly deviate from the mean response. The subgroups showed added predictive value for the outcome. The modelling strategy was shown to be robust on this dataset. Our approach was thus able to discover understandable subgroups from observational data that have predictive value and that clinical users may consider to gain insight into who responds positively or negatively to a proposed treatment.
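Title (translated from Chinese):
Explaining heterogeneity of individual treatment causal effects by subgroup discovery: an observational case study of antibiotic treatment of acute rhino-sinusitis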