Abstract
Research on Tibetan dependency analysis faces two main challenges: the lack of a dataset and the reliance on expert knowledge. To address these challenges, we first introduce a new Tibetan dependency analysis dataset and then propose a neural framework that removes the reliance on expert knowledge by automatically extracting feature vectors of words and predicting their head words and the types of their dependency arcs. Specifically, we convert the words in a sentence into distributional vectors and employ a sequence-to-vector network to extract word features. Furthermore, we introduce a head classifier and a type classifier to predict the head word and the type of dependency arc, respectively. Experiments demonstrate that our model achieves promising performance on the Tibetan dependency analysis task.
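The head classifier described above can be framed as head selection: for each word, score every candidate head (including a virtual ROOT) and pick the argmax. The following is a minimal sketch of that decoding step, assuming word feature vectors have already been produced by the sequence-to-vector network; the dot-product scorer and the ROOT vector here are illustrative stand-ins, not the trained model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict_heads(word_vectors):
    """For each word i (1-indexed), score every candidate head j,
    where j = 0 is a virtual ROOT, and return the argmax head index.
    A word may not be its own head, so the self-score is -inf."""
    root = [1.0] * len(word_vectors[0])  # hypothetical ROOT embedding
    candidates = [root] + word_vectors
    heads = []
    for i, w in enumerate(word_vectors, start=1):
        scores = [dot(w, h) if j != i else float("-inf")
                  for j, h in enumerate(candidates)]
        probs = softmax(scores)
        heads.append(max(range(len(probs)), key=probs.__getitem__))
    return heads
```

In the full model, a second classifier would then predict the dependency type for each (word, head) pair from the same feature vectors.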
Index Terms
- Neural Dependency Parser for Tibetan Sentences