Joint Big Data Extraction Method for Coal Mine Safety with Characters and Words Fusion
Journal of Signal Processing Systems (IF 1.8), Pub Date: 2022-06-21, DOI: 10.1007/s11265-022-01778-z
Faguo Zhou, Chao Wang, Dongxue Sun, Yanan Song

Entity relation extraction for coal mine safety accidents is of great significance to the supervision and prevention of such accidents. For the joint extraction of entities and relations in this domain, this paper proposes a joint extraction model based on multi-head attention and character-word fusion. The Chinese pre-trained model RoBERTa-wwm-ext and Word2vec are used to generate character vectors and word vectors, respectively. A multi-head self-attention mechanism then assigns more weight to related word vectors and captures, from different perspectives, the dependencies between distant entities and the semantic relations between entities. The character vectors and word vectors are then concatenated and fed into a BiLSTM, and finally a Conditional Random Field (CRF) outputs the label sequence. Within entity relation extraction, the extraction of overlapping entity relations for coal mine safety accidents is more difficult. The harmonic mean value is 93.19%, a significant improvement in extracting entity relations for coal mine safety accidents. The overall harmonic mean value is 94.54%, which is better than that of the comparison models.
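The fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses only the Python standard library, the attention has no learned projections (each head simply attends over one slice of the word vector, with queries, keys, and values all equal to that slice), and the BiLSTM and CRF stages are omitted. The function names, the toy vectors, and the rule that each character is paired with the word containing it (`char_to_word`) are all assumptions made for illustration; the abstract does not say how characters and words are aligned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Simplification (assumption): no learned projection matrices.
    """
    dim = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors))
                    for i in range(dim)])
    return out

def multi_head_self_attention(vectors, num_heads):
    """Split each vector into num_heads slices, attend per slice, re-join."""
    dim = len(vectors[0])
    assert dim % num_heads == 0, "vector size must be divisible by num_heads"
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        sliced = [v[h * head_dim:(h + 1) * head_dim] for v in vectors]
        heads.append(self_attention(sliced))
    # Concatenate the head outputs position by position.
    return [sum((heads[h][i] for h in range(num_heads)), [])
            for i in range(len(vectors))]

def fuse(char_vectors, word_vectors, char_to_word, num_heads=2):
    """Concatenate each character vector with its attended word vector.

    char_to_word[i] is the index of the word containing character i
    (a hypothetical alignment rule, not taken from the paper).
    """
    attended = multi_head_self_attention(word_vectors, num_heads)
    return [c + attended[w] for c, w in zip(char_vectors, char_to_word)]

# Toy input: 4 characters forming 2 words; all values are made up.
char_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6],
                [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
word_vectors = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, -0.5]]
char_to_word = [0, 0, 1, 1]

fused = fuse(char_vectors, word_vectors, char_to_word)
print(len(fused), len(fused[0]))  # 4 7
```

In a full model of the kind the abstract describes, the character vectors would come from RoBERTa-wwm-ext, the word vectors from Word2vec, and the fused sequence would be passed to a BiLSTM followed by a CRF layer that decodes the label sequence.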




Updated: 2022-06-22