Chemlistem: chemical named entity recognition using recurrent neural networks.
Journal of Cheminformatics (IF 7.1), Pub Date: 2018-12-06, DOI: 10.1186/s13321-018-0313-8
Peter Corbett 1 , John Boyle 1

Chemical named entity recognition (NER) has traditionally been dominated by approaches based on conditional random fields (CRFs), but given the success of the artificial neural network techniques known as “deep learning”, we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short-term memory (LSTM) networks, a type of recurrent neural net. The second system eschews the rich feature set, and even tokenisation, in favour of character labelling using neural character embeddings and multiple LSTM layers. The third system is an ensemble that combines the results of the first two systems. Our original BioCreative V.5 competition entry was placed in the top group with the highest F scores, and subsequent work using transfer learning has achieved a final F score of 90.33% on the test data (precision 91.47%, recall 89.21%).
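
As a quick consistency check on the figures reported in the abstract, the short Python sketch below recomputes the F score from the stated precision and recall. It assumes that "F score" means the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not say so explicitly. The function name `f_score` is illustrative and is not taken from the Chemlistem code.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F score (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F score" is the balanced F1 measure.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_precision = 91.47
reported_recall = 89.21

f1 = f_score(reported_precision, reported_recall)
print(f"{f1:.2f}")  # prints 90.33
```

Under that assumption, the recomputed value agrees with the reported final F score of 90.33%.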

Updated: 2018-12-06