Generative artificial intelligence: synthetic datasets in dentistry, BDJ Open

Generative artificial intelligence: synthetic datasets in dentistry
BDJ Open, Pub Date: 2024-03-01, DOI: 10.1038/s41405-024-00198-4
Fahad Umer, Niha Adnan

Introduction

Artificial Intelligence (AI) algorithms, particularly Deep Learning (DL) models, are known to be data intensive. This has increased the demand for digital data in all domains of healthcare, including dentistry. The main hindrance to the progress of AI is limited access to diverse datasets on which DL models can be trained to reach optimal performance, comparable to that of subject experts. However, administration of these traditionally acquired datasets is challenging due to privacy regulations and the extensive manual annotation required from subject experts. Biases such as ethical, socioeconomic and class imbalances are also incorporated during the curation of these datasets, limiting their overall generalizability. These challenges prevent their accrual at a larger scale for training DL models.

Methods

Generative AI techniques can be useful in the production of Synthetic Datasets (SDs) that can overcome issues affecting traditionally acquired datasets. Variational autoencoders, generative adversarial networks and diffusion models have been used to generate SDs. The following text is a review of these generative AI techniques and their operations. It discusses the opportunities offered by SDs and their challenges, together with potential solutions, to improve the understanding of healthcare professionals working in AI research.

Conclusion

Synthetic data customized to the needs of researchers can be produced to train robust AI models. These models, having been trained on such diverse datasets, will be applicable for dissemination across countries. However, the limitations associated with SDs need to be better understood, and attempts made to overcome those concerns, prior to their widespread use.
