Novel principal component analysis-based feature selection mechanism for classroom sound classification, Computational Intelligence

Computational Intelligence (IF 2.8), Pub Date: 2021-05-29, DOI: 10.1111/coin.12468
Eleni Tsalera₁,₂, Andreas Papadakis₂, Maria Samarakou₁

Machine learning algorithms for sound classification can draw on multiple temporal, spectral, and perceptual features extracted from the sound signal. The number of features affects both classification accuracy and the computational resources required, so it has to be chosen carefully. In this work, we propose a feature selection methodology based on principal component analysis. The case study is the classification of classroom sounds during face-to-face module delivery, for which six sound types are defined. The proposed method is applied to a set of 143 sound features to produce a feature ranking, and the ranking is compared with the one provided by Relief-F. The selected features are then used by five classification algorithms: Linear Discriminant Analysis (LDA), Quadratic Support Vector Machine (QSVM), k-Nearest Neighbors, Boosted Trees, and Random Forest. Each algorithm is run with an increasing number of features, from 1 to 143, under both feature rankings, yielding 1430 models. Classification performance increases rapidly with the number of features; LDA, QSVM, and Boosted Trees outperform the other methods and exceed 90% accuracy with 25 features.
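The abstract does not say how the principal components are turned into a feature ranking, so the following is only a minimal sketch under an assumed heuristic: each feature is scored by its absolute loadings on the leading components, weighted by the share of variance each component explains. The function name `pca_feature_ranking`, the `variance_to_keep` threshold, and the synthetic data are all hypothetical and are not taken from the paper. The final lines reproduce the model count stated in the abstract (five classifiers, 143 feature counts, two rankings).

```python
import numpy as np


def pca_feature_ranking(X, variance_to_keep=0.95):
    """Rank features by variance-weighted absolute PCA loadings.

    Assumed heuristic for illustration; not the paper's confirmed method.
    Returns (ranking, scores): feature indices from most to least important,
    and the score of every feature in original column order.
    """
    X = np.asarray(X, dtype=float)

    # Standardise so features on large scales do not dominate the components.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features at zero after centring
    Z = (X - X.mean(axis=0)) / std

    # Eigen-decomposition of the covariance matrix of the standardised data.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # largest variance first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the leading components that explain the requested variance share.
    explained = eigvals / eigvals.sum()
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(eigvals))

    # Score = sum over kept components of |loading| * explained variance.
    scores = np.abs(eigvecs[:, :k]) @ explained[:k]
    return np.argsort(scores)[::-1], scores


# Synthetic stand-in for a feature matrix (rows: sound clips, columns: features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
X_demo[:, 1] = X_demo[:, 0] + 0.1 * rng.normal(size=200)  # correlated pair

ranking, scores = pca_feature_ranking(X_demo)

# Nested feature subsets of increasing size, as used when training with
# 1, 2, ..., n features taken from the top of a ranking.
top_k_subsets = [ranking[:k] for k in range(1, X_demo.shape[1] + 1)]

# Model count from the abstract: 5 classifiers x 143 feature counts x 2 rankings.
n_classifiers, n_feature_counts, n_rankings = 5, 143, 2
n_models = n_classifiers * n_feature_counts * n_rankings
```

Each entry of `top_k_subsets` could be used to slice the feature matrix before fitting a classifier; the same loop run over a second ranking (for example one produced by Relief-F) would give the paired comparison the abstract describes.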

Updated: 2021-05-29
