Improved feature extraction for CRNN-based multiple sound source localization
arXiv - CS - Sound
Pub Date: 2021-05-05, DOI: arxiv-2105.01897
Pierre-Amaury Grumiaux, Srdan Kitic, Laurent Girin, Alexandre Guérin
In this work, we propose to extend a state-of-the-art multi-source
localization system based on a convolutional recurrent neural network and
Ambisonics signals. We significantly improve the performance of the baseline
network by changing the layout between convolutional and pooling layers. We
propose several configurations with more convolutional layers and smaller
pooling sizes in-between, so that less information is lost across the layers,
leading to a better feature extraction. In parallel, we test the system's
ability to localize up to 3 sources, in which case the improved feature
extraction provides the most significant boost in accuracy. We evaluate and
compare these improved configurations on synthetic and real-world data. The
results show a substantial improvement in multiple sound source localization
performance over the baseline network.
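
The trade-off between a few large pooling steps and many small ones can be made concrete with a little arithmetic. The sketch below is only an illustration: the number of frequency bins and both pooling layouts are hypothetical values chosen for the example and are not taken from the paper. It assumes size-preserving ("same"-padded) convolutions followed by non-overlapping pooling along the frequency axis.

```python
def freq_size_per_block(n_freq_bins, pool_sizes):
    """Return the frequency-axis size after each conv + pool block.

    Assumption: convolutions are 'same'-padded (size unchanged) and each
    pooling layer is non-overlapping, so the size is floor-divided by the
    pooling factor.
    """
    sizes = []
    size = n_freq_bins
    for pool in pool_sizes:
        size = size // pool
        sizes.append(size)
    return sizes


# Hypothetical values for illustration only (not the paper's configuration).
N_FREQ_BINS = 512
few_large_pools = [8, 8, 4]                   # 3 blocks, aggressive pooling
many_small_pools = [2, 2, 2, 2, 2, 2, 2, 2]   # 8 blocks, gentle pooling

coarse = freq_size_per_block(N_FREQ_BINS, few_large_pools)
fine = freq_size_per_block(N_FREQ_BINS, many_small_pools)

print("few large pools :", coarse)
print("many small pools:", fine)
```

Both layouts end at the same frequency resolution, but the second one gets there through more convolutional stages, each of which discards less at a time. That is the intuition the abstract gives for the improved feature extraction.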
Updated: 2021-05-06