A tristimulus-formant model for automatic recognition of call types of laying hens
Computers and Electronics in Agriculture ( IF 7.7 ) Pub Date : 2021-06-16 , DOI: 10.1016/j.compag.2021.106221
Xiaodong Du , Guanghui Teng , Chaoyuan Wang , Lenn Carpentier , Tomas Norton

An essential objective of Precision Livestock Farming (PLF) is to use sensors that monitor bio-responses containing important information on the health, well-being and productivity of farmed animals. In the literature, vocalisations of animals have been shown to contain information that can enable farmers to improve their animal husbandry practices. In this study, we focus on the vocalisation bio-responses of birds and specifically develop a sound recognition technique for continuous and automatic assessment of laying hen vocalisations. This study introduces a novel feature called the “tristimulus-formant” for the recognition of call types of laying hens (i.e., vocalisation types). Tristimulus is considered a timbre attribute analogous to the colour attributes of vision. Tristimulus measures the mixture of harmonics in a given sound, which are grouped into three bands according to the relative weights of the harmonics in the signal. Experiments were designed in which calls from 11 Hy-Line brown hens were recorded in a cage-free setting (4303 vocalisations were labelled from 168 h of sound recordings). Then, sound processing techniques were used to extract the features of each call type and to classify the vocalisations using the LabVIEW® software. For feature extraction, we focused on extracting the Mel frequency cepstral coefficients (MFCCs) and tristimulus-formant (TF) features. Then, two different classifiers, the backpropagation neural network (BPNN) and Gaussian mixture model (GMM), were applied to recognise the different call types. Finally, comparative trials were designed to test the different recognition models. The results show that the MFCCs-12+BPNN model (12 variables) had the highest average accuracy of 94.9 ± 1.6% but also the longest model training time (3201 ± 119 ms).
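The abstract does not spell out the tristimulus formula. A common definition in the timbre literature (after Pollard and Jansson) splits the harmonic amplitudes into three bands: the fundamental, harmonics 2–4, and harmonics 5 and above, each normalised by the total harmonic energy. The sketch below assumes that definition; the function name and input format are illustrative, not taken from the paper.

```python
def tristimulus(harmonics):
    """Tristimulus values (T1, T2, T3) from harmonic amplitudes.

    harmonics: amplitudes [a1, a2, a3, ...] ordered by harmonic number,
    a1 being the fundamental. Common definition (assumed here):
        T1 = a1 / S,  T2 = (a2 + a3 + a4) / S,  T3 = (a5 + ...) / S
    where S is the sum of all harmonic amplitudes, so T1 + T2 + T3 = 1.
    """
    total = sum(harmonics)
    if total == 0:
        return (0.0, 0.0, 0.0)
    t1 = harmonics[0] / total          # weight of the fundamental
    t2 = sum(harmonics[1:4]) / total   # weight of harmonics 2-4
    t3 = sum(harmonics[4:]) / total    # weight of harmonics 5 and above
    return (t1, t2, t3)
```

For example, five equal-amplitude harmonics give (0.2, 0.6, 0.2): the middle band carries three of the five harmonics.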
At the same time, the MFCCs-3+TF+BPNN model had a lower feature dimensionality (6 variables) and required less training time (2633 ± 54 ms) than the MFCCs-12+BPNN model, and it classified well without a substantial loss of accuracy (91.4 ± 1.4%). Additionally, the BPNN classifier outperformed the GMM classifier in recognising laying hens’ calls. The novel model can classify chicken sounds effectively at a low computational cost, giving it considerable potential for large-scale data analysis and online monitoring systems.
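The BPNN classifier named above is a standard feed-forward network trained with backpropagation. As a minimal sketch of the idea (not the paper's actual architecture, layer sizes, learning rate, or data, all of which are assumptions here), a one-hidden-layer sigmoid network can be trained on a toy two-class problem standing in for the 6-dimensional call-feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BPNN:
    """Minimal one-hidden-layer backpropagation network (illustrative)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)
        self.y = sigmoid(self.h @ self.W2 + self.b2)
        return self.y

    def train_step(self, X, t):
        y = self.forward(X)
        # backpropagate squared-error loss through the sigmoid layers
        delta2 = (y - t) * y * (1 - y)
        delta1 = (delta2 @ self.W2.T) * self.h * (1 - self.h)
        self.W2 -= self.lr * self.h.T @ delta2
        self.b2 -= self.lr * delta2.sum(axis=0)
        self.W1 -= self.lr * X.T @ delta1
        self.b1 -= self.lr * delta1.sum(axis=0)

# toy linearly-separable problem (logical AND) in place of real call features
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t = np.array([[0.0], [0.0], [0.0], [1.0]])
net = BPNN(n_in=2, n_hidden=8, n_out=1)
for _ in range(5000):
    net.train_step(X, t)
pred = (net.forward(X) > 0.5).astype(float)
```

In the paper's setting, the 6-variable input would be the three MFCCs concatenated with the three tristimulus-formant values, and the output layer would have one unit per call type.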




Updated: 2021-06-16