Detection of Breast Cancer Based on Fuzzy Frequent Itemsets Mining
IRBM (IF 4.8), Pub Date: 2020-05-19, DOI: 10.1016/j.irbm.2020.05.002
F. Ramesh Dhanaseelan, M. Jeya Sutha

Background: Breast cancer, a type of malignant tumor, affects women more than men. About one third of women with breast cancer die of the disease. Hence, a tool for the proper identification and early treatment of breast cancer is imperative. Unlike conventional data mining algorithms, fuzzy-logic-based approaches can mine association rules from quantitative transactions.

Methods: This study introduces a novel fuzzy methodology, IFFP (Improved Fuzzy Frequent Pattern Mining), based on fuzzy association rule mining for biological knowledge extraction, to analyze the dataset and find the core factors that cause breast cancer. The method consists of two phases. In the first phase, fuzzy frequent itemsets are mined using the proposed IFFP algorithm. In the second phase, fuzzy association rules are formed, indicating whether a case is classified as benign or malignant. The algorithm is applied to the WBCD (Wisconsin Breast Cancer Database) to detect the presence of breast cancer.
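
The two phases can be illustrated with a minimal Python sketch. It is not the IFFP algorithm, whose details the abstract does not give: it uses brute-force enumeration of itemsets, the minimum as the fuzzy AND, and three triangular regions (Low, Middle, High) whose breakpoints are assumptions chosen for illustration. The names `REGIONS`, `triangular`, `fuzzify`, `fuzzy_support`, `mine_frequent`, and `rules_for_class` are hypothetical, and the four toy records are invented, not taken from WBCD.

```python
from itertools import combinations

# Assumed fuzzy regions on a 1-10 attribute scale: (left foot, peak, right foot).
REGIONS = {"Low": (1, 1, 5.5), "Middle": (1, 5.5, 10), "High": (5.5, 10, 10)}

def triangular(x, a, b, c):
    """Triangular membership with feet a and c and peak b."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0

def fuzzify(record):
    """Map {attribute: value} to {(attribute, region): membership > 0}."""
    fuzzy = {}
    for attr, value in record.items():
        for region, (a, b, c) in REGIONS.items():
            mu = triangular(value, a, b, c)
            if mu > 0:
                fuzzy[(attr, region)] = mu
    return fuzzy

def fuzzy_support(itemset, transactions):
    """Average over transactions of the minimum membership in the itemset."""
    total = sum(min(t.get(item, 0.0) for item in itemset) for t in transactions)
    return total / len(transactions)

def mine_frequent(transactions, min_support, max_size=2):
    """Phase 1 (brute force): keep itemsets whose fuzzy support is high enough."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            # Skip itemsets that use two regions of the same attribute.
            if len({attr for attr, _ in itemset}) < size:
                continue
            support = fuzzy_support(itemset, transactions)
            if support >= min_support:
                frequent[itemset] = support
    return frequent

def rules_for_class(frequent, transactions, min_confidence):
    """Phase 2: form rules 'antecedent -> class' from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        labels = [item for item in itemset if item[0] == "Class"]
        antecedent = tuple(item for item in itemset if item[0] != "Class")
        if len(labels) != 1 or not antecedent:
            continue
        antecedent_support = fuzzy_support(antecedent, transactions)
        if antecedent_support == 0:
            continue
        confidence = support / antecedent_support
        if confidence >= min_confidence:
            rules.append((antecedent, labels[0][1], round(confidence, 3)))
    return rules

# Invented toy records (NOT real WBCD rows), only to exercise the functions.
toy = [
    ({"BareNuclei": 10, "Mitoses": 1}, "malignant"),
    ({"BareNuclei": 9, "Mitoses": 1}, "malignant"),
    ({"BareNuclei": 1, "Mitoses": 1}, "benign"),
    ({"BareNuclei": 2, "Mitoses": 1}, "benign"),
]
transactions = []
for features, label in toy:
    t = fuzzify(features)
    t[("Class", label)] = 1.0  # the class label is a crisp item
    transactions.append(t)

frequent = mine_frequent(transactions, min_support=0.3)
rules = rules_for_class(frequent, transactions, min_confidence=0.8)
for antecedent, label, confidence in rules:
    print(antecedent, "->", label, confidence)
```

On this toy input, an item that appears with the same membership in both classes yields rules with confidence 0.5 toward either class and is filtered out by the confidence threshold, whereas an item concentrated in one class yields a high-confidence rule. A practical miner would replace the exhaustive enumeration with a pruned search.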

Results: The Mitoses attribute takes values in the low range for both malignant and benign cases, and hence does not contribute to the detection of breast cancer. In contrast, values of Bare Nuclei in the high range indicate a greater chance that breast cancer is present.

Conclusion: Experimental evaluations on real datasets show that our proposed method outperforms recently proposed state-of-the-art algorithms in terms of runtime and memory usage.



Updated: 2020-05-19