A Novel Method for Constructing Classification Models by Combining Different Biomarker Patterns
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2020-09-07, DOI: 10.1109/tcbb.2020.3022076
Xin Huang 1, Zhenqian Liao 1, Bing Liu 1, Fengmei Tao 1, Benzhe Su 2, Xiaohui Lin 2

Different biomarker patterns, such as molecular biomarkers and ratio biomarkers, have their own merits in clinical applications. In this study, a novel machine learning method for biomedical data analysis is proposed that constructs classification models by combining different biomarker patterns (CDBP). CDBP uses relative expression reversals to measure the discriminative ability of different biomarker patterns and selects the pattern with the higher score for classifier construction. The decision boundary of CDBP can be characterized in a simple and biologically meaningful way. To test its performance, CDBP was compared with eight state-of-the-art methods on eight gene expression datasets. CDBP achieved the highest classification performance while using fewer features or ratio features. Subsequently, CDBP was employed to extract crucial diagnostic information from a rat hepatocarcinogenesis metabolomics dataset. The potential biomarkers selected by CDBP classified hepatocellular carcinoma (HCC) and non-HCC stages better than previous work in the animal model. Statistical analyses of these potential biomarkers in an independent human dataset confirmed their ability to discriminate among different liver diseases. These experimental results highlight the potential of CDBP for biomarker identification in high-dimensional biomedical datasets and demonstrate that it can be a useful tool for disease classification.
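
The abstract does not give the formula CDBP uses to score relative expression reversals. As a rough illustration only, the sketch below computes the familiar top-scoring-pair style score for one pair of features: the absolute difference between the two classes in how often feature i is expressed below feature j. The function name `reversal_score`, the binary 0/1 labels, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def reversal_score(xi, xj, y):
    """Top-scoring-pair style reversal score for one feature pair.

    Assumption: this is a generic illustration, not the CDBP scoring rule.
    xi, xj: expression values of two features across the same samples.
    y: binary class labels (0 or 1), one per sample.
    """
    xi, xj, y = np.asarray(xi), np.asarray(xj), np.asarray(y)
    less = xi < xj                    # True where feature i is below feature j
    p0 = less[y == 0].mean()          # fraction of class-0 samples with xi < xj
    p1 = less[y == 1].mean()          # fraction of class-1 samples with xi < xj
    return abs(p0 - p1)               # 1.0 means a complete reversal between classes


# Toy data (made up): the ordering of the two features flips between classes.
xi = np.array([1.0, 2.0, 1.5, 5.0, 6.0, 5.5])
xj = np.array([3.0, 4.0, 3.5, 2.0, 1.0, 2.5])
y = np.array([0, 0, 0, 1, 1, 1])

score = reversal_score(xi, xj, y)
print(score)
```

A score near 1 indicates that the ordering of the two features almost always reverses between the classes, while a score near 0 indicates that the pair carries little class information under this measure.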

Updated: 2020-09-07