A Method for Identifying Local Drug Names in Xinjiang Based on BERT-BiLSTM-CRF
Automatic Control and Computer Sciences (IF 0.6), Pub Date: 2020-07-15, DOI: 10.3103/s0146411620030098
Yuhang Song, Shengwei Tian, Long Yu

Abstract

This paper proposes a BERT-BiLSTM-CRF method for recognizing Xinjiang local drug names, built on the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers). BERT is pre-trained with a bidirectional Transformer structure using the MaskLM (masked language model) objective, in which some Chinese characters of the input sequence are randomly selected and replaced with special symbols. Word vectors are generated dynamically according to the position information of the Chinese characters in Xinjiang local drug names, and the resulting word vector sequence is fed into a bidirectional LSTM layer, which is trained to capture the dependencies within the sequence. Finally, the CRF module models the joint probability of the entire tag sequence and outputs the globally optimal labeling. On the Xinjiang local drug corpus, the model achieves a named entity recognition precision of 95.77%, a recall of 89.47%, and an F value of 92.52%. The experimental results show that BERT-BiLSTM-CRF can effectively improve the evaluation metrics for Xinjiang local drug name recognition in practical applications.
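
To illustrate what "globally optimal" means for the CRF step, here is a minimal sketch of Viterbi decoding over per-character tag scores. It is not the authors' implementation: the tag set (`O`, `B-DRUG`, `I-DRUG`), the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def viterbi_decode(
    emissions: List[List[float]], transitions: List[List[float]]
) -> Tuple[List[int], float]:
    """Return the highest-scoring tag path and its score.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM layer)
    transitions[i][j] : score of moving from tag i to tag j (learned by the CRF)
    """
    num_tags = len(transitions)
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []  # best previous tag for each position/tag

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the path score into tag j.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final tag to recover the whole path.
    best_last = max(range(num_tags), key=lambda i: score[i])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


# Hypothetical tag set and scores (assumptions, not taken from the paper).
tags = ["O", "B-DRUG", "I-DRUG"]
emissions = [
    [2.0, 0.5, 0.1],  # character 1: most likely outside an entity
    [0.2, 1.0, 1.2],  # character 2: I-DRUG scores slightly above B-DRUG
    [0.1, 0.3, 2.0],  # character 3: most likely inside an entity
]
transitions = [
    [0.0, 0.0, -1e4],  # O -> I-DRUG is heavily penalized (invalid in BIO tagging)
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

path, best_score = viterbi_decode(emissions, transitions)
labels = [tags[i] for i in path]
print(labels)  # ['O', 'B-DRUG', 'I-DRUG']
```

Choosing the best tag independently at each position would give `O, I-DRUG, I-DRUG`, which is not a valid BIO sequence. Scoring the whole sequence jointly lets the transition penalty steer the second character to `B-DRUG` instead.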
Updated: 2020-07-15