Core dataset extraction from unlabeled medical big data for lesion localization, Big Data Research


Core dataset extraction from unlabeled medical big data for lesion localization
Big Data Research (IF 3.3) Pub Date: 2021-01-06, DOI: 10.1016/j.bdr.2021.100185
Kehua Guo, Yifei Wang, Jian Kang, Jian Zhang, Rui Cao

With the advancement of technology in the big data era, the volume of medical data has grown considerably, which has accelerated the development of intelligent medical diagnosis. Lesion localization plays an indispensable role in medicine, yet it has not been widely applied in intelligent diagnosis. Its effectiveness depends on training a convolutional neural network, which requires a large amount of medical image data labeled with lesion locations. Although a considerable amount of medical image data is available, its quality varies and most of it carries no lesion-location labels. Labeling is time-consuming and laborious, and it also requires professional knowledge. To address this problem and facilitate the development of lesion localization, we propose a novel core dataset extraction architecture: a general architecture for extracting the core dataset from unlabeled medical big data for lesion localization. In this architecture, the comprehensive core degree of each image is computed using three evaluation indicators and an indicator fusion algorithm. In addition, we propose an iterative optimization selection module to enhance the performance of subsequent batch extraction. The experimental results show that the proposed method needs to extract only 30% of the training data to achieve the training effect of the entire training set, thereby considerably reducing the human resources required.
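The abstract does not name the three evaluation indicators, the fusion algorithm, or the internals of the iterative optimization selection module. The sketch below is therefore only an illustration of the general idea, not the authors' method: it assumes each image already has three numeric indicator values, uses min-max normalization with a weighted average as a stand-in fusion step, and uses a greedy batch loop that rescores the remaining pool each round as a stand-in for iterative selection. The names `min_max`, `fuse_indicators`, and `extract_core` are hypothetical.

```python
from typing import List, Optional, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse_indicators(rows: Sequence[Sequence[float]],
                    weights: Optional[Sequence[float]] = None) -> List[float]:
    """Combine per-image indicator values into one score per image.

    ASSUMPTION: a weighted average of min-max normalized indicators.
    The paper's actual indicator fusion algorithm is not given in the abstract.
    """
    n_ind = len(rows[0])
    if weights is None:
        weights = [1.0] * n_ind
    total = sum(weights)
    columns = [min_max([row[j] for row in rows]) for j in range(n_ind)]
    return [sum(weights[j] * columns[j][i] for j in range(n_ind)) / total
            for i in range(len(rows))]


def extract_core(rows: Sequence[Sequence[float]],
                 fraction: float = 0.3,
                 batch_size: int = 2,
                 weights: Optional[Sequence[float]] = None) -> List[int]:
    """Pick indices of the highest-scoring images, batch by batch.

    ASSUMPTION: each round rescores only the images still in the pool and
    takes the top `batch_size`; this greedy loop merely stands in for the
    paper's iterative optimization selection module.
    """
    budget = int(round(fraction * len(rows)))
    remaining = list(range(len(rows)))
    selected: List[int] = []
    while len(selected) < budget and remaining:
        scores = fuse_indicators([rows[i] for i in remaining], weights)
        take = min(batch_size, budget - len(selected))
        ranked = sorted(range(len(remaining)),
                        key=lambda k: scores[k], reverse=True)[:take]
        batch = [remaining[k] for k in ranked]
        selected.extend(batch)
        remaining = [i for i in remaining if i not in batch]
    return selected


# Toy data: 10 images whose three made-up indicator values all equal the index.
toy_rows = [(i, i, i) for i in range(10)]
core = extract_core(toy_rows, fraction=0.3, batch_size=2)
print(core)  # [9, 8, 7]
```

In a real pipeline the selected indices would be the images sent to experts for lesion-location labeling; the 30% default mirrors the fraction reported in the abstract, while the batch size and equal weights are arbitrary choices for the toy example.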


Updated: 2021-01-20
