DICE: Deep Significance Clustering for Outcome-Aware Stratification
arXiv - CS - Neural and Evolutionary Computing Pub Date : 2021-01-07 , DOI: arxiv-2101.02344
Yufang Huang, Kelly M. Axsom, John Lee, Lakshminarayanan Subramanian, Yiye Zhang

We present deep significance clustering (DICE), a framework for jointly performing representation learning and clustering for "outcome-aware" stratification. DICE is intended to generate cluster membership that may be used to categorize a population by individual risk level for a targeted outcome. Following the representation learning and clustering steps, we embed the objective function in DICE with a constraint which requires a statistically significant association between the outcome and cluster membership of learned representations. DICE further includes a neural architecture search step to maximize both the likelihood of representation learning and outcome classification accuracy with cluster membership as the predictor. To demonstrate its utility in medicine for patient risk-stratification, the performance of DICE was evaluated using two datasets with different outcome ratios extracted from real-world electronic health records. Outcomes are defined as acute kidney injury (30.4%) among a cohort of COVID-19 patients, and discharge disposition (36.8%) among a cohort of heart failure patients, respectively. Extensive results demonstrate that DICE has superior performance as measured by the difference in outcome distribution across clusters, Silhouette score, Calinski-Harabasz index, and Davies-Bouldin index for clustering, and area under the ROC curve (AUC) for outcome classification compared to several baseline approaches.
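
The abstract asks for a statistically significant association between the outcome and cluster membership, but it does not say which test DICE uses. As a rough illustration only, the sketch below checks such an association with a Pearson chi-square test on the cluster-by-outcome contingency table. The function names `chi2_sf` and `cluster_outcome_association`, the choice of test, and the toy labels are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z) = erfc(sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def cluster_outcome_association(clusters, outcomes):
    """Pearson chi-square test of independence between cluster labels and outcomes.

    Returns (statistic, degrees_of_freedom, p_value).
    """
    n = len(clusters)
    cell = Counter(zip(clusters, outcomes))   # observed counts per (cluster, outcome)
    row = Counter(clusters)                   # cluster sizes
    col = Counter(outcomes)                   # outcome totals
    stat = 0.0
    for c in row:
        for o in col:
            expected = row[c] * col[o] / n
            stat += (cell[(c, o)] - expected) ** 2 / expected
    df = (len(row) - 1) * (len(col) - 1)
    return stat, df, chi2_sf(stat, df)


# Toy data (hypothetical): two clusters of 10 subjects with different outcome rates.
clusters = [0] * 10 + [1] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
stat, df, p_value = cluster_outcome_association(clusters, outcomes)
print(f"chi2={stat:.2f}, df={df}, significant at 0.05: {p_value < 0.05}")
```

A small p-value here only indicates that the outcome rate differs across clusters. How DICE turns such a requirement into a constraint on its training objective is not described in the abstract.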

Updated: 2021-01-08