Benchmarking Deep Learning Models for Tooth Structure Segmentation
Journal of Dental Research (IF 7.6), Pub Date: 2022-06-09, DOI: 10.1177/00220345221100169
L Schneider 1,2, L Arsiwala-Scheppach 1,2, J Krois 1,2, H Meyer-Lueckel 3, K K Bressem 4,5, S M Niehues 4, F Schwendicke 1,2

A wide range of deep learning (DL) architectures of varying depth is available, and developers usually choose one or a few of them for a specific task in a nonsystematic way. Benchmarking (i.e., the systematic comparison of state-of-the-art architectures on a specific task) may provide guidance during model development and allow developers to make better decisions. However, comprehensive benchmarking has not yet been performed in dentistry. We aimed to benchmark a range of architecture designs for 1 specific, exemplary case: tooth structure segmentation on dental bitewing radiographs. We built 72 models for tooth structure (enamel, dentin, pulp, fillings, crowns) segmentation by combining 6 different DL network architectures (U-Net, U-Net++, Feature Pyramid Networks, LinkNet, Pyramid Scene Parsing Network, Mask Attention Network) with 12 encoders from 3 different encoder families (ResNet, VGG, DenseNet) of varying depth (e.g., VGG13, VGG16, VGG19). To each model design, 3 initialization strategies (ImageNet, CheXpert, random initialization) were applied, resulting in 216 trained models overall, each trained for up to 200 epochs with the Adam optimizer (learning rate = 0.0001) and a batch size of 32. Our data set consisted of 1,625 human-annotated dental bitewing radiographs. We used a 5-fold cross-validation scheme and quantified model performance primarily by the F1-score. Initialization with ImageNet or CheXpert weights significantly outperformed random initialization (P < 0.05). Deeper and more complex models did not necessarily perform better than less complex alternatives. VGG-based models were more robust across model configurations, while more complex models (e.g., from the ResNet family) achieved peak performance. In conclusion, initializing models with pretrained weights may be recommended when training models for dental radiographic analysis.
Less complex model architectures may be competitive alternatives if computational resources and training time are limiting factors. Models developed and found superior on nondental data sets may not retain that advantage on dental domain-specific tasks.
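
The experimental grid described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the abstract names only VGG13, VGG16, and VGG19 as example encoders, so the 12 encoder names below are placeholders; the function names are hypothetical; the random fold assignment and the pixel-wise F1 definition (including the value returned for two empty masks) are assumptions about details the abstract does not specify.

```python
import random
from itertools import product

# The 6 architectures and 3 initialization strategies named in the abstract.
ARCHITECTURES = [
    "U-Net", "U-Net++", "Feature Pyramid Network", "LinkNet",
    "Pyramid Scene Parsing Network", "Mask Attention Network",
]
INITIALIZATIONS = ["ImageNet", "CheXpert", "random"]

# ASSUMPTION: placeholder names. The abstract states there are 12 encoders
# from the ResNet, VGG, and DenseNet families but does not list them all.
ENCODERS = [f"encoder_{i:02d}" for i in range(1, 13)]

# Training settings stated in the abstract.
TRAINING = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 32,
    "max_epochs": 200,
    "cv_folds": 5,
    "n_radiographs": 1625,
}


def model_designs():
    """Every architecture paired with every encoder."""
    return list(product(ARCHITECTURES, ENCODERS))


def training_runs():
    """Every model design paired with every initialization strategy."""
    return list(product(ARCHITECTURES, ENCODERS, INITIALIZATIONS))


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    ASSUMPTION: a seeded random shuffle; the abstract does not say how
    radiographs were assigned to folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def f1_score(pred, target):
    """Pixel-wise F1 for one binary mask given as flat sequences of 0/1.

    F1 = 2*TP / (2*TP + FP + FN). Returns 1.0 when both masks are empty
    (an assumed convention).
    """
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom
```

Enumerating the grid this way reproduces the counts in the abstract: 6 architectures times 12 encoders gives 72 model designs, and applying 3 initialization strategies to each gives 216 training runs. With 5 folds over 1,625 radiographs, each validation fold holds 325 images.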




Updated: 2022-06-09