Identifying Biomarkers Using Support Vector Machine to Understand the Racial Disparity in Triple-Negative Breast Cancer
Journal of Computational Biology (IF 1.7), Pub Date: 2023-01-30, DOI: 10.1089/cmb.2022.0422
Bikram Sahoo, Zandra Pinnix, Seth Sims, Alex Zelikovsky

Triple-negative breast cancer (TNBC) is an aggressive type of breast cancer with heterogeneous tumor biology and a poor clinical outcome. TNBC tumors lack estrogen, progesterone, and human epidermal growth factor receptors, which leaves fewer treatment options in the clinic. The incidence of TNBC is higher, and clinical outcomes are worse, in African American (AA) women than in European American (EA) women. The main factors behind this racial disparity are socioeconomic and lifestyle factors, together with tumor biology. The current study used open-source gene expression data from TNBC samples annotated with racial information. We applied a Support Vector Machine (SVM) classifier with recursive feature elimination to the gene expression data to identify significant biomarkers deregulated in AA and EA women. We also included Spearman's rho and Ward's linkage method in our feature selection workflow. Our proposed method yields 24 features/genes that classify the AA and EA samples with 98% accuracy. We also performed Kaplan–Meier analysis and the log-rank test on the 24 features/genes. Of the 24 genes, we discuss only two, KLK10 and LRRC37A2, for which deregulated expression correlates with cancer progression and a poor survival rate. We believe that refining our method with a larger number of RNA-seq gene expression samples will provide more accurate insight into racial disparity in TNBC.
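
The abstract does not say how Spearman's rho, Ward's linkage, and SVM-based feature elimination are combined, so the following is only a minimal sketch of one common arrangement, not the authors' pipeline. It assumes scikit-learn and SciPy, runs on synthetic data rather than TNBC expression data, and uses hypothetical names (`make_toy_data`, `select_features`) and illustrative settings (`cluster_threshold=0.5`, `n_features=3`). Correlated columns are grouped by Ward's linkage on the distance 1 - |rho|, one representative per group is kept, and a linear SVM with recursive feature elimination ranks the representatives.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def make_toy_data(n_samples=100, n_noise=9, shift=5.0, seed=0):
    """Synthetic stand-in for an expression matrix (samples x genes)."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)      # two hypothetical sample groups
    X = rng.normal(size=(n_samples, 3 + n_noise))
    X[:, 0] += shift * y                       # column 0 differs between groups
    X[:, 1] += shift * y                       # column 1 differs between groups
    # Column 2 is a near-copy of column 0, i.e. a redundant feature.
    X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=n_samples)
    return X, y


def select_features(X, y, n_features=3, cluster_threshold=0.5):
    """Return column indices chosen by correlation clustering plus SVM-RFE."""
    # 1. Spearman's rho between every pair of columns.
    corr = spearmanr(X).correlation
    corr = (corr + corr.T) / 2.0               # enforce exact symmetry
    np.fill_diagonal(corr, 1.0)

    # 2. Ward's linkage on the distance 1 - |rho|, then cut the tree.
    distance = 1.0 - np.abs(corr)
    linkage = hierarchy.ward(squareform(distance))
    cluster_ids = hierarchy.fcluster(
        linkage, cluster_threshold, criterion="distance"
    )

    # Keep the first column of each cluster as its representative.
    _, first_idx = np.unique(cluster_ids, return_index=True)
    representatives = np.sort(first_idx)

    # 3. Recursive feature elimination with a linear SVM, one column per step.
    n_keep = min(n_features, representatives.size)
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_keep, step=1)
    rfe.fit(X[:, representatives], y)
    return representatives[rfe.support_]


X, y = make_toy_data()
selected = select_features(X, y)
accuracy = cross_val_score(SVC(kernel="linear"), X[:, selected], y, cv=5).mean()
print("selected columns:", selected, "CV accuracy:", round(accuracy, 3))
```

Selecting features on the full data set and then cross-validating on the same samples, as this sketch does for brevity, tends to overstate accuracy. A more careful evaluation would repeat the selection inside each training fold.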

Updated: 2023-01-31