Combined Generative Adversarial Network and Fuzzy C‑Means Clustering for Multi‑Class Voice Disorder Detection with an Imbalanced Dataset
Applied Sciences (IF 2.838), Pub Date: 2020-07-01, DOI: 10.3390/app10134571
Kwok Tai Chui, Miltiadis D. Lytras, Pandian Vasant

The world has witnessed the successful deployment of artificial intelligence in smart healthcare applications. Various studies have suggested that the prevalence of voice disorders in the general population is greater than 10%. Automatic diagnosis of voice disorders via machine learning algorithms is desirable because it reduces the cost and time needed for examination by doctors and speech‑language pathologists. This paper proposes CGAN‑IFCM, an algorithm that combines a conditional generative adversarial network (CGAN) with improved fuzzy c‑means clustering (IFCM), for the multi‑class detection of three common types of voice disorders. The existing benchmark datasets for voice disorders, the Saarbruecken Voice Database (SVD) and the Voice ICar fEDerico II Database (VOICED), have imbalanced classes. The generative adversarial network provides synthetic data to reduce bias in the detection model. Improved fuzzy c‑means clustering takes the relationship between adjacent data points into account in the fuzzy membership function. To justify the need for both CGAN and IFCM, the algorithm is compared with and without CGAN, and IFCM is compared with traditional fuzzy c‑means clustering. Lastly, the proposed CGAN‑IFCM outperforms existing models in true negative rate and true positive rate by 9.9–12.9% and 9.1–44.8%, respectively.
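For orientation, the traditional fuzzy c‑means baseline mentioned in the abstract can be sketched as follows. This is the textbook algorithm only, not the paper's IFCM: the abstract does not specify how adjacent data points enter the membership function, so that part is left out. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (hypothetical helper, not the paper's IFCM).

    X: (n_samples, n_features) array; m > 1 is the fuzzifier.
    Returns (centers, U), where U[i, k] is the membership of sample i
    in cluster k and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    # Random initial memberships, normalised per sample.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Usage on two synthetic, well-separated blobs (made-up data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard labels from the soft memberships
```

Because each sample receives a graded membership in every cluster instead of a single hard label, a variant such as IFCM has a natural place to inject neighbourhood information into the membership update.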

Updated: 2020-07-01