Identification of chronic urticaria subtypes using machine learning algorithms
Allergy (IF 12.4), Pub Date: 2021-10-04, DOI: 10.1111/all.15119
Murat Türk 1, Ragıp Ertaş 2, Engin Zeydan 3, Yekta Türk 4, Mustafa Atasoy 2, Annika Gutsche 5, 6, Marcus Maurer 5, 6

Chronic urticaria (CU) comes in two forms, chronic spontaneous urticaria (CSU) and chronic inducible urticaria (CIndU).1 Across its types and subtypes, CU is a heterogeneous disease, with phenotypes that differ in clinical characteristics and endotypes that differ in underlying pathophysiological mechanisms.2, 3 Subgroups of CU patients may exhibit distinct phenotypic disease signatures that point to differences in what drives their condition and in how they respond to treatment. Cluster analysis is a popular unsupervised machine learning (ML) method for discovering previously undetected patterns in data.4 ML-based cluster analysis has been used in several diseases to identify and characterize patient subgroups.5, 6 To date, no study has attempted to identify CU subtypes with this method. Here, we performed a proof-of-concept study to test whether cluster analysis using ML algorithms can identify subgroups of CU patients based on clinical and routine laboratory characteristics.

We retrospectively analyzed the medical charts of a cohort of 431 CU patients. Institutional review board approval was obtained; because of the retrospective nature of the study, patient consent was not required. ML-based k-means clustering, combined with principal component analysis (PCA), silhouette analysis, and the elbow method applied to the dimensionally reduced data, identified 4 clusters of CU patients, with a homogeneous balance between the clusters and the selected evaluation metrics (methods are provided in the supplementary material) (Figure S3). Clustering with PCA produced more meaningful clusters than clustering without it, supporting both the benefit of dimensionality reduction and the number of clusters identified. Cluster characteristics and comparisons identified clinically distinct patient subgroups (Table 1).
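The general shape of such a pipeline (standardize, reduce with PCA, run k-means for several values of k, then compare inertia for the elbow method and the mean silhouette score) can be sketched as follows. This is a minimal illustration and not the authors' implementation, which is described only in their supplementary material. The data are synthetic (a hypothetical `make_data` helper producing four groups of 30 records with five variables), the PCA uses simple power iteration, and the farthest-point initialisation of k-means is an assumption chosen to keep the example deterministic.

```python
import math
import random

def standardize(rows):
    """Scale every column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def pca(rows, n_components, rng, iters=300):
    """Project standardized rows onto their top principal components
    (power iteration with deflation; adequate for a small sketch)."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    comps = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        # Remove the component just found so the next pass finds the following one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        comps.append(v)
    return [[sum(x * c for x, c in zip(r, comp)) for comp in comps] for r in rows]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
    inertia = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, inertia

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = {}, {}
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] = sums.get(labels[j], 0.0) + dist(p, q)
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own in counts:
            a = sums[own] / counts[own]                              # mean distance within own cluster
            b = min(sums[c] / counts[c] for c in sums if c != own)   # nearest other cluster
            total += (b - a) / max(a, b)
    return total / len(points)

def make_data(rng, per_group=30):
    """Hypothetical stand-in for patient features: four groups, five variables."""
    rows = []
    for u, v in [(0, 0), (10, 0), (0, 10), (10, 10)]:
        for _ in range(per_group):
            rows.append([u + rng.gauss(0, 1), u + rng.gauss(0, 1),
                         v + rng.gauss(0, 1), v + rng.gauss(0, 1), rng.gauss(0, 1)])
    return rows

rng = random.Random(0)
reduced = pca(standardize(make_data(rng)), n_components=2, rng=rng)
scores = {}
for k in range(2, 7):
    labels, inertia = kmeans(reduced, k)
    scores[k] = (inertia, silhouette(reduced, labels))
    print(f"k={k}  inertia={inertia:8.2f}  silhouette={scores[k][1]:.3f}")
best_k = max(scores, key=lambda k: scores[k][1])
print("best k by silhouette:", best_k)
```

In practice the inertia values would be plotted against k to look for the elbow, and the silhouette score used as a second check on the same choice. Real clinical data would also need handling of missing values and of mixed categorical and continuous variables, which this sketch does not attempt.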

TABLE 1. Cluster analyses with machine learning-based algorithms identify four distinct subgroups of patients with chronic urticaria
| Characteristic | All patients, n = 337 (100%) | Cluster 1, n = 25 (7.4%) | Cluster 2, n = 142 (42.1%) | Cluster 3, n = 128 (38%) | Cluster 4, n = 42 (12.5%) | p-value* |
|---|---|---|---|---|---|---|
| CSU; n (%) | 312 (93) | 0 (0)a | 142 (100)b | 128 (100)b | 42 (100)b | <0.001 |
| CIndU; n (%) | 172 (51) | 25 (100)a | 69 (49)b | 72 (56)b | 6 (14)c | <0.001 |
| Angioedema; n (%) | 198 (59) | 12 (48)a,b | 56 (39)b | 99 (77)c | 31 (74)a,c | <0.001 |
| Median age; years (IQR) | 39 (28–49) | 42 (28–51)a | 3 (29–48)a | 41 (29–50)a | 38 (26–46)a | 0.481 |
| Female gender; n (%) | 237 (70) | 16 (64)a,b | 71 (50)b | 117 (92)c | 33 (79)a,c | <0.001 |
| CU duration; months (IQR) | 24 (9–76) | 12 (5–96)a,b | 36 (12–96)a | 18 (9–50)b | 24 (6–120)a,b | 0.039 |
| Family history; n (%) | 72 (21) | 6 (24)a,b | 20 (14)b | 37 (29)a | 9 (21)a,b | 0.03 |
| Triggering factor(s); n (%) | 262 (78) | 18 (72)a | 115 (81)a | 95 (74)a | 34 (81)a | 0.474 |
| IgE; IU/ml (IQR) | 102 (38–226) | 75 (35–189)a,b | 132 (56–272)a | 84 (23–167)b | 93 (26–203)a,b | 0.005 |
| IgG-anti-TPO positivity; n (%) | 68 (20) | 4 (16)a,b,c | 6 (4.2)c | 51 (40)b | 7 (17)a | <0.001 |
| ANA positivity; n (%) | 82 (24) | 2 (8)a,b | 2 (1.4)b | 67 (52)c | 11 (26)a | <0.001 |
| Hypertension; n (%) | 37 (11) | 5 (20)a | 0 (0)b | 1 (1)b | 31 (74)c | <0.001 |
| Diabetes mellitus; n (%) | 45 (14) | 3 (12)a | 12 (9)a | 4 (3)a | 26 (62)b | <0.001 |
| Hypothyroidism; n (%) | 64 (19) | 6 (24)a,b | 19 (13)b | 23 (18)a | 16 (38)a,b | 0.004 |
| Psychiatric disease; n (%) | 115 (34) | 10 (40)a,b | 57 (40)b | 30 (23)a | 18 (43)a,b | 0.014 |
| Rheumatic disease; n (%) | 57 (17) | 5 (20)a | 28 (20)a | 17 (13)a | 7 (17)a | 0.538 |
| Atopic dermatitis; n (%) | 12 (4) | 0 (0)a,b | 11 (8)b | 1 (1)a | 0 (0)a,b | 0.006 |
| Asthma; n (%) | 57 (17) | 5 (20)a | 22 (16)a | 20 (16)a | 10 (24)a | 0.584 |

Note

  • *p-value from Kruskal-Wallis H test or Pearson chi-square analysis between the 4 clusters. Each superscript letter (a, b, and c) denotes pairwise comparisons between clusters and shows that the columns with the same letters in a line do not differ significantly from each other at the 0.05 level.
  • Abbreviations: ANA, antinuclear antibodies; CIndU, chronic inducible urticaria; CSU, chronic spontaneous urticaria; CU, chronic urticaria; IgE, serum total IgE level; IQR, interquartile range; TPO, thyroid peroxidase.
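To make the between-cluster comparison concrete, the sketch below recomputes a Pearson chi-square test for two categorical rows of Table 1 from the reported counts. It assumes that these rows were tested with an uncorrected 2 x 4 chi-square test, which the note suggests but does not state row by row. The helper names `chi_square_2xk` and `chi2_sf_df3` are illustrative, and the closed-form tail probability applies only to 3 degrees of freedom (four clusters, yes/no outcome).

```python
import math

def chi_square_2xk(positives, totals):
    """Pearson chi-square statistic for a 2 x k table of yes/no counts."""
    n = sum(totals)
    p = sum(positives) / n  # pooled proportion of "yes" under the null hypothesis
    stat = 0.0
    for pos, tot in zip(positives, totals):
        for observed, expected in ((pos, tot * p), (tot - pos, tot * (1 - p))):
            stat += (observed - expected) ** 2 / expected
    return stat

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Counts taken from Table 1 (clusters 1 to 4).
totals = [25, 142, 128, 42]
asthma = [5, 22, 20, 10]
hypertension = [5, 0, 1, 31]

p_asthma = chi2_sf_df3(chi_square_2xk(asthma, totals))
p_hypertension = chi2_sf_df3(chi_square_2xk(hypertension, totals))
print(f"asthma:       p = {p_asthma:.3f}")
print(f"hypertension: p = {p_hypertension:.2e}")
```

For asthma the result should land close to the 0.584 reported in the table, and for hypertension it falls far below 0.001. Several expected counts in the hypertension row are small, so an exact test might be preferable there; the sketch does not cover the continuous variables (Kruskal-Wallis) or the pairwise comparisons behind the superscript letters.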

Cluster 1 (the “CIndU only” cluster) was the smallest cluster. It contained all of the CIndU patients without comorbid CSU, and no other patients. Of all clusters, cluster 1 patients had the highest median age [42 (28–51) years], the shortest duration of disease [12 (5–96) months], and the lowest IgE levels [74.6 (35.1–188.5) IU/ml].

Cluster 2 (the “high IgE” cluster) was the largest cluster. All patients had CSU, and about half of them (49%) had comorbid CIndU. Cluster 2 patients had the highest median IgE levels [132 (56.4–271.5) IU/ml], the highest rate of comorbid atopic dermatitis (7.7%), and the lowest rates of ANA and IgG-anti-TPO positivity (1.4% and 4.2%, respectively).

Cluster 3 (the “autoimmune” cluster) had the highest percentage of women (92%) of all clusters. All patients had CSU, and more than half also had CIndU (56.3%). About three of four patients (77.3%) had angioedema, the highest percentage of any cluster. Cluster 3 patients also had the second-lowest IgE levels of all clusters (84.2 IU/ml), the lowest of any CSU cluster, and the highest rates of IgG-anti-TPO and ANA positivity across all clusters (39.8% and 52.3%, respectively).

Cluster 4 (the “high comorbidity” cluster) consisted only of CSU patients, and comorbid CIndU was rare (14.3%). The defining characteristics of patients in this cluster were their high rates of hypertension (74%) and diabetes mellitus (62%), each more than three times as high as in any other cluster, and of hypothyroidism (38%), the highest rate of any cluster.

The results of our study provide proof of concept that unsupervised ML algorithms can identify meaningful and distinct groups of patients with CU, here clustering CU into four distinct subtypes. Three of these four clusters closely resemble how patients with CU are classified in clinical practice, that is, as having CIndU or CSU as their primary form of CU and, in the latter, as having autoimmune or autoallergic CSU (Figure 1). This suggests that ML-based algorithms can be used to establish patient signatures, which may then be used to better characterize relevant and distinct pathomechanisms of CU subgroups. This, in turn, could allow us to better manage CU by optimizing the use of available treatments and guiding the development of new and better ones.

FIGURE 1
The four chronic urticaria clusters. The clusters are plotted according to how chronic urticaria (CU) is classified in clinical practice, that is, as chronic inducible urticaria (CIndU) or chronic spontaneous urticaria (CSU) and, for the latter, as autoimmune or autoallergic CSU. Cluster 1 is the “CIndU cluster,” cluster 2 is the “high IgE cluster,” cluster 3 is the “autoimmune cluster,” and cluster 4 is the “high comorbidity” cluster. Surface areas represent the number of patients assigned to each cluster (ANA, antinuclear antibodies; IgE, serum total IgE level; and TPO, thyroid peroxidase).


Updated: 2021-10-04