Crackle Detection In Lung Sounds Using Transfer Learning And Multi-Input Convolutional Neural Networks
arXiv - CS - Sound, Pub Date: 2021-04-30, DOI: arxiv-2104.14921
Truc Nguyen, Franz Pernkopf

Large annotated lung sound databases are publicly available and can be used to train algorithms for diagnosis systems. However, it can be challenging to develop a well-performing algorithm for small non-public datasets, which contain only a few subjects and differ in recording devices and setup. In this paper, we use transfer learning to tackle the mismatch in recording setup. This allows us to transfer knowledge from one dataset to another for crackle detection in lung sounds. In particular, a single-input convolutional neural network (CNN) model is pre-trained on a source domain using ICBHI 2017, the largest publicly available database of lung sounds. We use log-mel spectrogram features of the respiratory cycles of lung sounds. The pre-trained network is used to build a multi-input CNN model, which shares the same network architecture for respiratory cycles and their corresponding respiratory phases. The multi-input model is then fine-tuned on the target domain, our self-collected lung sound database, to classify crackles and normal lung sounds. Our experimental results show a significant performance improvement of 9.84% (absolute) in F-score on the target domain when the transfer-learning-based multi-input CNN model is used for crackle detection in the adventitious lung sound classification task.
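The abstract reports its gain as an absolute change in F-score, which is easy to confuse with a relative change. The sketch below illustrates the difference under stated assumptions: it assumes the common F1 form of the F-score (the abstract says only "F-score"), and the confusion counts are hypothetical values chosen for illustration, not results from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion counts; beta=1 gives the usual F1."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical crackle-detection counts (NOT taken from the paper).
baseline = f_score(tp=60, fp=25, fn=40)    # e.g. a model without transfer learning
fine_tuned = f_score(tp=72, fp=20, fn=28)  # e.g. a fine-tuned multi-input model

# Absolute gain: difference of the two scores, in percentage points.
absolute_gain = (fine_tuned - baseline) * 100
# Relative gain: the same difference as a fraction of the baseline score.
relative_gain = (fine_tuned - baseline) / baseline * 100

print(f"baseline F-score:   {baseline:.4f}")
print(f"fine-tuned F-score: {fine_tuned:.4f}")
print(f"absolute gain: {absolute_gain:.2f} points")
print(f"relative gain: {relative_gain:.2f} %")
```

Because an F-score is at most 1, the relative gain is always at least as large as the absolute gain for the same pair of positive scores, so stating which one is meant, as the abstract does, matters when comparing results.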

Updated: 2021-05-03