A feature-fusion framework of clinical, genomics, and histopathological data for METABRIC breast cancer subtype classification
Applied Soft Computing (IF 4.873), Pub Date: 2020-03-24, DOI: 10.1016/j.asoc.2020.106238
Ala’a El-Nabawy; Nashwa El-Bendary; Nahla A. Belal

Breast cancer is the most common cancer affecting women worldwide, and it has been phenotypically classified into five subtypes. Each subtype has unique characteristics that reflect the heterogeneity of breast tumours. In 2012, the American Association for Cancer Research provided population-based molecular integrative clusters (IntClust) for the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) dataset, resulting in ten subtypes. Previous work on the METABRIC dataset used only gene expression data to identify the effective genes for each subtype, without integrating the other data sources. The objective of this paper is to present a breast cancer subtype classification model that applies feature fusion to the METABRIC datasets, namely clinical, gene expression, Copy Number Aberrations (CNA), Copy Number Variations (CNV), and histopathological images. State-of-the-art machine learning classifiers, including Linear-SVM, Radial-SVM, Random Forests (RF), Ensemble SVM (E-SVM), and Boosting, were applied to different data profiles. The highest accuracy achieved for IntClust subtyping was 88.36%, using Linear-SVM on the data profile with features fused from the clinical, gene expression, CNA, and CNV datasets, with Jaccard and Dice scores of 0.802 and 0.8835, respectively. For Pam50 subtyping, an accuracy of 97.1% was achieved, with a Jaccard score ranging from 0.9439 to 0.9472 and a Dice score of 0.971, using Linear-SVM and E-SVM classifiers on several data profiles that include features from histopathological images. In conclusion, the study validates that feature fusion across the various METABRIC datasets improves breast cancer subtype classification performance. Moreover, histopathological images give promising results for Pam50 subtypes and are expected to improve accuracy for IntClust subtyping when applied to a larger population.
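As a rough illustration of the ideas named in the abstract, the sketch below shows one common form of feature fusion (concatenating per-patient feature vectors from several sources) and how per-class Jaccard and Dice scores can be computed from predictions. It is not the authors' pipeline: the abstract does not state how features were fused or how the scores were averaged, so the concatenation strategy, the function names `fuse_features` and `jaccard_and_dice`, and all patient IDs and values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def fuse_features(profiles: Sequence[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Concatenate per-sample feature vectors from several sources.

    Assumption: fusion is plain concatenation, and only samples that
    appear in every source are kept.
    """
    if not profiles:
        return {}
    shared = set(profiles[0])
    for profile in profiles[1:]:
        shared &= set(profile)
    return {
        sample_id: [value for profile in profiles for value in profile[sample_id]]
        for sample_id in sorted(shared)
    }


def jaccard_and_dice(
    y_true: Sequence[str], y_pred: Sequence[str], label: str
) -> Tuple[float, float]:
    """Return (Jaccard, Dice) for one class, treating it as the positive label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp + fp + fn == 0:
        # The class never occurs in truth or prediction; report 0.0 by convention.
        return 0.0, 0.0
    jaccard = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return jaccard, dice


# Hypothetical toy data: three sources keyed by made-up patient IDs.
clinical = {"P1": [61.0, 1.0], "P2": [48.0, 0.0], "P3": [55.0, 1.0]}
expression = {"P1": [0.3, -1.2], "P2": [1.1, 0.4]}
cna = {"P1": [2.0], "P2": [-1.0], "P3": [0.0]}

fused = fuse_features([clinical, expression, cna])

# Hypothetical true and predicted subtype labels for four samples.
y_true = ["LumA", "LumA", "Basal", "Her2"]
y_pred = ["LumA", "Basal", "Basal", "Her2"]
luma_jaccard, luma_dice = jaccard_and_dice(y_true, y_pred, "LumA")
```

In this sketch the fused vectors would then be passed to a classifier such as a linear SVM. Patient `P3` is dropped because it has no gene expression entry, which is one simple way to handle samples missing from a source.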
Updated: 2020-03-26

 
