EAR-UNet: A deep learning-based approach for segmentation of tympanic membranes from otoscopic images
Artificial Intelligence in Medicine (IF 7.5), Pub Date: 2021-04-08, DOI: 10.1016/j.artmed.2021.102065
Van-Truong Pham, Thi-Thao Tran, Pa-Chun Wang, Po-Yu Chen, Men-Tzung Lo

This paper presents a method for automatic segmentation of tympanic membranes (TMs) from video-otoscopic images based on a deep fully convolutional neural network. Built upon the UNet architecture, the proposed EAR scheme rests on three main components: EfficientNet for the encoder, Attention gates for the skip connection path, and Residual blocks for the decoder. The paper also introduces a new loss function term for training segmentation networks. Specifically, we propose to integrate EfficientNet-B4 into the encoder part of the UNet. In addition, the decoder part of the proposed network is built from residual blocks of the ResNet architecture. In this way, the proposed approach can take advantage of the EfficientNet and ResNet architectures, such as preserving an efficient receptive field size for the model and avoiding overfitting. In the skip connection path, we employ attention gates, which can handle the variety in shapes and sizes of the objects of interest, a common issue in TM regions. Moreover, for network training, we propose a new loss function term based on the shape distance between predicted and ground truth masks, and exploit stochastic weight averaging to avoid being trapped in local minima. We evaluate the proposed approach on a TM dataset that includes 1012 otoscopic images from patients diagnosed with and without otitis media. Experimental results show that the proposed approach achieves high segmentation performance, with an average Dice similarity coefficient of 0.929 without any pre- or post-processing steps, and outperforms other state-of-the-art methods.
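
The Dice similarity coefficient reported above measures the overlap between a predicted mask and a ground truth mask: twice the number of pixels the two masks share, divided by the total number of foreground pixels in both. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> float:
    """Dice similarity coefficient between two binary masks of equal shape.

    Masks are given as nested sequences of 0/1 values (hypothetical format).
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels that are foreground in both masks
    pred_area = 0     # foreground pixels in the predicted mask
    truth_area = 0    # foreground pixels in the ground truth mask
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            intersection += int(p_on and t_on)
            pred_area += int(p_on)
            truth_area += int(t_on)

    total = pred_area + truth_area
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy 3x3 masks: the prediction misses one of four ground truth pixels.
example_pred = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
example_truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
example_score = dice_coefficient(example_pred, example_truth)  # 2*3 / (3+4) = 6/7
```

In this toy case the masks share 3 pixels out of 3 predicted and 4 true foreground pixels, so the score is 6/7, roughly 0.857. A dataset-level figure such as the 0.929 quoted above would be the average of per-image scores.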


Updated: 2021-04-18