A review of infant cry analysis and classification
EURASIP Journal on Audio, Speech, and Music Processing (IF 2.4), Pub Date: 2021-02-05, DOI: 10.1186/s13636-021-00197-5
Chunyan Ji, Thosini Bamunu Mudiyanselage, Yutong Gao, Yi Pan

This paper reviews recent research on infant cry signal analysis and classification. A broad range of literature is reviewed, mainly from the aspects of data acquisition, cross-domain signal processing techniques, and machine learning classification methods. We introduce pre-processing approaches and describe a variety of features such as MFCC, spectrogram, and fundamental frequency. Both acoustic and prosodic features extracted from different domains can discriminate frame-based signals from one another and can be used to train machine learning classifiers. Alongside traditional machine learning classifiers such as KNN, SVM, and GMM, newer neural network architectures such as CNN and RNN are applied in infant cry research. We present some significant experimental results on pathological cry identification, cry reason classification, and cry sound detection with some typical databases. This survey systematically studies previous research in all relevant areas of infant cry and provides insight into current cutting-edge work in infant cry signal analysis and classification. We also propose future research directions in data processing, feature extraction, and neural network classification to better understand, interpret, and process infant cry signals.
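
As a rough illustration of the frame-based pipeline the abstract describes (split a signal into frames, extract a feature such as fundamental frequency, then train a classifier such as KNN), the following minimal sketch may help. It is not the authors' method. Everything in it is an assumption made for illustration: the signals are synthetic sine tones rather than real cry recordings, F0 is estimated with a simple autocorrelation peak search, the 200 to 1000 Hz search range and 8 kHz sample rate are arbitrary, and the labels `class_a` and `class_b` are hypothetical placeholders.

```python
import math
from collections import Counter

SAMPLE_RATE = 8000  # Hz, assumed for this sketch


def make_tone(freq_hz, duration_s=0.1):
    """Synthetic stand-in for a voiced cry segment: a pure sine tone."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def split_frames(signal, frame_len=200, hop=100):
    """Cut the signal into overlapping fixed-length frames (25 ms, 50% overlap)."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def estimate_f0(frame, f_min=200.0, f_max=1000.0):
    """Estimate F0 as SAMPLE_RATE / lag, where lag maximizes the autocorrelation."""
    lag_min = int(SAMPLE_RATE / f_max)
    lag_max = int(SAMPLE_RATE / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def mean_f0(signal):
    """Frame-based feature: average the per-frame F0 estimates."""
    estimates = [estimate_f0(frame) for frame in split_frames(signal)]
    return sum(estimates) / len(estimates)


def knn_predict(train, query, k=3):
    """Majority vote among the k training features closest to the query."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-class training set built from synthetic tones.
train = [(mean_f0(make_tone(f)), "class_a") for f in (350, 400, 450)]
train += [(mean_f0(make_tone(f)), "class_b") for f in (600, 650, 700)]

print(knn_predict(train, mean_f0(make_tone(420))))
print(knn_predict(train, mean_f0(make_tone(680))))
```

With these settings the 420 Hz query falls among the lower-pitched training tones and the 680 Hz query among the higher-pitched ones. A real system would replace the single F0 feature with richer per-frame features (for example MFCCs) and the synthetic tones with labeled recordings.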

Updated: 2021-02-05
