Analysis of gene expression profiles of lung cancer subtypes with machine learning algorithms.
Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease (IF 6.2), Pub Date: 2020-04-28, DOI: 10.1016/j.bbadis.2020.165822
Fei Yuan, Lin Lu, Quan Zou

Lung cancer is one of the most common cancer types worldwide and causes more than one million deaths annually. Lung adenocarcinoma (AC) and lung squamous cell cancer (SCC) are two major lung cancer subtypes, and they differ in several respects. Identifying their differentially expressed genes and distinct gene expression patterns can deepen our understanding of the two subtypes at the transcriptomic level. In this work, we used several machine learning algorithms to investigate the gene expression profiles of lung AC and lung SCC samples retrieved from Gene Expression Omnibus. First, the profiles were analyzed with a powerful feature selection method, Monte Carlo feature selection, which yielded a list ranking all features by importance together with a set of informative features. Then, the ranked list was used in the incremental feature selection method to extract the optimal features, that is, those with which a support vector machine (SVM) achieves its best performance in classifying lung AC and lung SCC samples. Several top-ranked genes (CSTA, TP63, SERPINB13, CLCA2, BICD2, PERP, FAT2, BNC1, ATP11B, FAM83B, KRT5, PARD6G, PKP1) were analyzed extensively to support their status as differentially expressed genes between lung AC and lung SCC. In addition, a rule learning procedure was applied to the informative features to construct classification rules. These rules provide a clear classification procedure and reveal some gene expression patterns that differ between lung AC and lung SCC.
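
The incremental feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract uses an SVM, whereas this sketch swaps in a simple nearest-centroid classifier so that it runs with the Python standard library alone. The function names, the leave-one-out evaluation, the synthetic data, and the ranking are all assumptions made for illustration.

```python
import random
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean vector is closest (squared Euclidean).

    Stand-in classifier for illustration; the abstract uses an SVM instead.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy using only the columns listed in feature_idx."""
    sub = [[row[j] for j in feature_idx] for row in X]
    correct = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]
    return correct / len(sub)


def incremental_feature_selection(X, y, ranked_features):
    """Score the top-k ranked features for k = 1..N and return (best_k, scores).

    Ties are resolved in favour of the smallest k.
    """
    scores = [loo_accuracy(X, y, ranked_features[:k])
              for k in range(1, len(ranked_features) + 1)]
    best_k = max(range(len(scores)), key=lambda k: scores[k]) + 1
    return best_k, scores


# Synthetic stand-in for expression profiles (not real data):
# features 0 and 1 separate the two classes, features 2-4 are noise.
rng = random.Random(0)
labels = ["AC"] * 10 + ["SCC"] * 10
X = []
for label in labels:
    shift = 3.0 if label == "SCC" else 0.0
    X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
             + [rng.gauss(0.0, 1.0) for _ in range(3)])

ranked = [0, 1, 2, 3, 4]  # hypothetical ranking, e.g. from a feature-ranking step
best_k, scores = incremental_feature_selection(X, labels, ranked)
print("best number of features:", best_k)
print("accuracy per prefix length:", scores)
```

In practice the ranking would come from the feature selection method and the classifier would be the one under study; only the loop over growing prefixes of the ranked list is the point of the sketch.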

Updated: 2020-04-28