Neural Dependency Parser for Tibetan Sentences

Published: 15 March 2021

Abstract

Research on Tibetan dependency analysis is limited mainly by two challenges: the lack of a dataset and the reliance on expert knowledge. To address these challenges, we first introduce a new Tibetan dependency analysis dataset, and we then propose a neural framework that removes the reliance on expert knowledge by automatically extracting feature vectors of words and predicting their head words and the types of their dependency arcs. Specifically, we convert the words of a sentence into distributional vectors and employ a sequence-to-vector network to extract word features. Furthermore, we introduce a head classifier and a type classifier to predict each word's head and the type of its dependency arc, respectively. Experiments demonstrate that our model achieves promising performance on the Tibetan dependency analysis task.
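
Since the article itself includes no code, the following is a minimal sketch of the kind of head-selection architecture the abstract describes: a sequence encoder over word embeddings, a head classifier that scores every position as a candidate head for each word, and a type classifier over the resulting arcs. The BiLSTM encoder, the MLP scorer, and all names and hyperparameters (TibetanDependencyParser, emb_dim, num_types, and so on) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TibetanDependencyParser(nn.Module):
    """Hypothetical head-selection parser: for each word, score every
    position as its potential head and classify the arc's dependency type."""

    def __init__(self, vocab_size, emb_dim=100, hidden_dim=200, num_types=20):
        super().__init__()
        # Distributional word vectors (could be initialized from pretrained embeddings).
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # BiLSTM as the sequence feature extractor (an assumption; the paper's
        # "sequence-to-vector network" may differ).
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        pair_dim = 4 * hidden_dim  # concatenated (dependent, candidate-head) features
        # Head classifier: scores each candidate head position for each word.
        self.head_scorer = nn.Sequential(
            nn.Linear(pair_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1))
        # Type classifier: predicts the dependency relation of each arc.
        self.type_clf = nn.Linear(pair_dim, num_types)

    def forward(self, word_ids):
        # word_ids: (batch, seq_len); position 0 is conventionally a dummy ROOT token.
        feats, _ = self.encoder(self.embed(word_ids))     # (b, n, 2*hidden)
        b, n, d = feats.shape
        dep = feats.unsqueeze(2).expand(b, n, n, d)       # each word as dependent
        head = feats.unsqueeze(1).expand(b, n, n, d)      # each position as candidate head
        pair = torch.cat([dep, head], dim=-1)             # (b, n, n, 4*hidden)
        head_scores = self.head_scorer(pair).squeeze(-1)  # (b, n, n)
        type_scores = self.type_clf(pair)                 # (b, n, n, num_types)
        return head_scores, type_scores

parser = TibetanDependencyParser(vocab_size=10_000)
head_scores, type_scores = parser(torch.randint(0, 10_000, (2, 12)))
pred_heads = head_scores.argmax(dim=-1)  # each word's predicted head position, shape (2, 12)
```

Training such a model would typically apply a cross-entropy loss over each row of head_scores (selecting the gold head position for each word) and over the type scores of the gold arcs.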



Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 2
March 2021
313 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3454116

        Copyright © 2021 Association for Computing Machinery.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 15 March 2021
        • Accepted: 1 October 2020
        • Revised: 1 July 2020
        • Received: 1 January 2020


        Qualifiers

        • research-article
        • Refereed
