FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning
arXiv - CS - Computation and Language. Pub Date: 2020-03-20, DOI: arxiv-2003.09288
Suyu Ge, Fangzhao Wu, Chuhan Wu, Tao Qi, Yongfeng Huang, and Xing Xie

Medical named entity recognition (NER) has wide applications in intelligent healthcare. Sufficient labeled data is critical for training an accurate medical NER model. However, the labeled data in a single medical platform is usually limited. Although labeled datasets may exist in many different medical platforms, they cannot be directly shared since medical data is highly privacy-sensitive. In this paper, we propose a privacy-preserving medical NER method based on federated learning, which can leverage the labeled data in different platforms to boost the training of medical NER models and remove the need to exchange raw data among platforms. Since the labeled data in different platforms usually differs in entity types and annotation criteria, instead of constraining different platforms to share the same model, we decompose the medical NER model in each platform into a shared module and a private module. The private module captures the characteristics of the local data in each platform and is updated using local labeled data. The shared module is learned across different medical platforms to capture the shared NER knowledge. Its local gradients from different platforms are aggregated to update the global shared module, which is then delivered to each platform to update its local shared module. Experiments on three publicly available datasets validate the effectiveness of our method.
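
The shared/private update loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the NER network is replaced by a one-dimensional linear model (`pred = shared_w * x + private_b`), plain gradient averaging is assumed as the aggregation rule, and every name (`Platform`, `federated_round`, `train`) is hypothetical. It only shows the data flow: raw data stays on each platform, only shared-module gradients reach the server, and the private module is updated locally.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Platform:
    """One medical platform holding local data that is never sent out."""
    data: List[Tuple[float, float]]  # local (x, y) pairs; stand-in for labeled text
    private_b: float = 0.0           # private module: updated only locally
    shared_w: float = 0.0            # local copy of the shared module

    def local_step(self, lr: float) -> float:
        """Update the private module locally; return the shared-module gradient."""
        n = len(self.data)
        errors = [(self.shared_w * x + self.private_b - y, x) for x, y in self.data]
        grad_w = sum(2.0 * e * x for e, x in errors) / n  # MSE gradient w.r.t. shared_w
        grad_b = sum(2.0 * e for e, _ in errors) / n      # MSE gradient w.r.t. private_b
        self.private_b -= lr * grad_b                     # private update never leaves the platform
        return grad_w                                     # only this value is sent to the server


def federated_round(platforms: List[Platform], global_w: float, lr: float) -> float:
    """Run one round: broadcast, collect local gradients, aggregate, update."""
    grads = []
    for p in platforms:
        p.shared_w = global_w            # server delivers the global shared module
        grads.append(p.local_step(lr))   # platform computes its gradient locally
    avg_grad = sum(grads) / len(grads)   # assumed aggregation rule: simple average
    return global_w - lr * avg_grad      # server updates the global shared module


def train(platforms: List[Platform], rounds: int = 500, lr: float = 0.05) -> float:
    """Repeat federated rounds, then push the final shared module to all platforms."""
    global_w = 0.0
    for _ in range(rounds):
        global_w = federated_round(platforms, global_w, lr)
    for p in platforms:
        p.shared_w = global_w
    return global_w
```

As a usage example, two platforms whose data follow `y = 2x + 1` and `y = 2x - 3` share the slope but differ in offset, loosely mirroring platforms that share NER knowledge but differ in annotation criteria. Calling `train([a, b])` on such data should drive the shared parameter toward 2 while each `private_b` settles on its own platform's offset.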

Updated: 2020-03-26