Antlion re-sampling based deep neural network model for classification of imbalanced multimodal stroke dataset
Multimedia Tools and Applications ( IF 3.6 ) Pub Date : 2020-10-09 , DOI: 10.1007/s11042-020-09988-y
Thippa Reddy G , Sweta Bhattacharya , Praveen Kumar Reddy Maddikunta , Saqib Hakak , Wazir Zada Khan , Ali Kashif Bashir , Alireza Jolfaei , Usman Tariq

Stroke is listed as one of the leading causes of death and serious disability, affecting millions of lives across the world, with a high possibility of becoming an epidemic in the next few decades. Timely detection and prompt decision making play a major role in reducing the chances of brain death, paralysis and other resultant outcomes. Machine learning algorithms have been a popular choice for the diagnosis, analysis and prediction of this disease, but there exist issues related to data quality because the data are collected from cross-institutional resources. The present study focuses on improving the quality of stroke data by implementing a rigorous pre-processing technique. It uses a multimodal stroke dataset available in the public Kaggle repository. The missing values in this dataset are replaced with attribute means, and the LabelEncoder technique is applied to achieve homogeneity. However, the dataset was observed to be imbalanced, which suggests that the results may not represent the actual accuracy and would be biased. To overcome this imbalance, a resampling technique was used. In oversampling, some data points in the minority class are replicated to increase its cardinality and rebalance the dataset. The transformed and oversampled data are further normalized using the StandardScaler technique. The antlion optimization (ALO) algorithm is applied to the deep neural network (DNN) model to select optimal hyperparameters with minimal time consumption. The proposed model consumed only 38.13% of the training time, which was also a positive aspect. The experimental results demonstrated the superiority of the proposed model.
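The pre-processing steps named in the abstract (mean imputation, label encoding, minority-class oversampling, standardization) can be illustrated with a minimal sketch. This is not the authors' code: the abstract names LabelEncoder and StandardScaler, which are presumably the scikit-learn classes, whereas the version below reimplements each step with the Python standard library so that the arithmetic is visible. The helper names, the toy records and the column names are all hypothetical, and the ALO hyperparameter search is not covered.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def label_encode(column):
    """Map each category to an integer, in sorted order of the labels."""
    mapping = {label: i for i, label in enumerate(sorted(set(column)))}
    return [mapping[v] for v in column]


def random_oversample(rows, labels, seed=0):
    """Replicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(list(rng.choice(pool)))
            out_labels.append(cls)
    return out_rows, out_labels


def standardize(rows):
    """Scale each feature column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


# Hypothetical toy records (not taken from the Kaggle dataset).
ages = [67, 45, 80, 33, 52, 71]
bmis = [28.1, None, 31.4, 22.0, None, 26.5]
smoking = ["never", "smokes", "formerly", "never", "smokes", "never"]
stroke = [0, 0, 1, 0, 0, 0]  # imbalanced: one positive case out of six

bmis = impute_mean(bmis)                         # missing values -> attribute mean
smoking = label_encode(smoking)                  # categories -> integers
rows = [list(r) for r in zip(ages, bmis, smoking)]
rows, stroke = random_oversample(rows, stroke)   # replicate minority rows
rows = standardize(rows)                         # zero mean, unit variance
```

The sketch follows the order given in the abstract, oversampling first and normalizing afterwards. In a real experiment these steps would normally be fitted on the training split only, so that replicated minority rows and scaling statistics do not leak into the test data; whether the authors did so is not stated in the abstract.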




Updated: 2020-10-11