A Neural Joint Model with BERT for Burmese Syllable Segmentation, Word Segmentation, and POS Tagging
ACM Transactions on Asian and Low-Resource Language Information Processing (IF 2), Pub Date: 2021-05-26, DOI: 10.1145/3436818
Cunli Mao, Zhibo Man, Zhengtao Yu, Shengxiang Gao, Zhenhan Wang, Hongbin Wang

The smallest semantic unit of the Burmese language is the syllable. In this study, we propose the first neural joint learning model for Burmese syllable segmentation, word segmentation, and part-of-speech (POS) tagging with BERT. The proposed model alleviates the error-propagation problem of syllable segmentation. More specifically, it extends the neural joint model for Vietnamese word segmentation, POS tagging, and dependency parsing [28] with pre-trained Burmese character, syllable, and word embeddings and BiLSTM-CRF-based neural layers. To evaluate the proposed model, we conduct experiments on Burmese benchmark datasets and fine-tune multilingual BERT. The results show that the proposed joint model achieves excellent performance.
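The abstract does not spell out the tagging scheme, but a common way to cast joint word segmentation and POS tagging as a single syllable-level sequence-labeling task (which a BiLSTM-CRF over BERT embeddings could then predict in one pass) is to give each syllable a combined tag such as "B-n" (word-initial syllable of a noun) or "I-n" (word-internal syllable). The sketch below illustrates that joint encoding and its inverse; the function names, the tag format, and the Latin placeholder syllables are all illustrative assumptions, not the authors' code.

```python
def encode_joint_tags(words):
    """Encode a segmented, POS-tagged sentence as per-syllable joint tags.

    words: list of (syllables, pos) pairs, one per word.
    Returns (flat syllable list, parallel joint-tag list).
    """
    syllables, tags = [], []
    for sylls, pos in words:
        for i, s in enumerate(sylls):
            syllables.append(s)
            # "B-" marks a word-initial syllable, "I-" a word-internal one;
            # the POS label rides along so one tagger does both tasks.
            tags.append(("B-" if i == 0 else "I-") + pos)
    return syllables, tags

def decode_joint_tags(syllables, tags):
    """Invert encode_joint_tags: recover the (syllables, pos) word list."""
    words = []
    for s, t in zip(syllables, tags):
        boundary, pos = t.split("-", 1)
        if boundary == "B" or not words:
            words.append(([s], pos))   # start a new word
        else:
            words[-1][0].append(s)     # continue the current word
    return words

# Hypothetical two-word sentence: a two-syllable noun and a one-syllable verb.
sentence = [(["ka", "la"], "n"), (["ma"], "v")]
sylls, joint = encode_joint_tags(sentence)
print(joint)                              # ['B-n', 'I-n', 'B-v']
print(decode_joint_tags(sylls, joint))    # round-trips to the input
```

Because the CRF layer predicts one joint tag per syllable, segmentation errors and POS errors are resolved jointly rather than propagated from one pipeline stage to the next, which is the error-propagation problem the paper targets.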
