
Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text

Published: 09 June 2021

Abstract

Open Relation Extraction (ORE) plays a significant role in Information Extraction. Traditional relation extraction requires relation types to be predefined in an annotated corpus and is restricted to specific domains; ORE removes both limitations, extracting entities and the relations between them in the open domain. However, as sentence complexity increases, the precision and recall of entity relation extraction drop significantly. To address this problem, we present Clause_CORE, an unsupervised method based on Chinese grammar and dependency parsing features. Clause_CORE processes complex sentences by decomposing them and dynamically complementing sentence components, which reduces sentence complexity while preserving sentence integrity. We then perform dependency parsing on the resulting complete sentences and carry out open entity relation extraction with a model built from Chinese grammar rules. Experimental results show that Clause_CORE outperforms other advanced Chinese ORE systems on Wikipedia and Sina news datasets, which supports the correctness and effectiveness of the method. Results on mixed datasets of news and encyclopedia data demonstrate its generalization and portability.
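The pipeline described in the abstract (split a complex sentence into clauses, complement components, parse, then apply grammar rules) can be illustrated with a toy sketch. This is not the Clause_CORE implementation: the `Token` structure, the function names, and the dependency labels `SBV` (subject), `VOB` (object), and `HED` (root) are assumptions chosen for illustration, the parses are written by hand rather than produced by a parser, and "complementing" is reduced here to carrying the previous clause's subject into a clause that lacks one.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    idx: int   # 1-based position inside its clause
    word: str
    head: int  # idx of the head token, 0 for the clause root
    rel: str   # dependency label (assumed label set: SBV, VOB, HED)


Triple = Tuple[str, str, str]


def extract_clause(
    tokens: List[Token], fallback_subject: Optional[str] = None
) -> Tuple[List[Triple], Optional[str]]:
    """Apply one rule: a head with an SBV child and a VOB child yields a triple.

    If the clause has no explicit subject, the fallback subject is used
    (a simplified stand-in for component complementing).
    """
    triples: List[Triple] = []
    subject_used = fallback_subject
    for verb in tokens:
        subjects = [t.word for t in tokens if t.head == verb.idx and t.rel == "SBV"]
        objects = [t.word for t in tokens if t.head == verb.idx and t.rel == "VOB"]
        subject = subjects[0] if subjects else fallback_subject
        if subject is None or not objects:
            continue  # nothing to extract around this token
        subject_used = subject
        triples.extend((subject, verb.word, obj) for obj in objects)
    return triples, subject_used


def extract_sentence(clauses: List[List[Token]]) -> List[Triple]:
    """Process clauses in order, carrying the latest subject forward."""
    triples: List[Triple] = []
    subject: Optional[str] = None
    for clause in clauses:
        found, subject = extract_clause(clause, subject)
        triples.extend(found)
    return triples


if __name__ == "__main__":
    # Hand-written parses for two clauses; the second omits its subject.
    clauses = [
        [Token(1, "张三", 2, "SBV"), Token(2, "创办", 0, "HED"), Token(3, "公司", 2, "VOB")],
        [Token(1, "担任", 0, "HED"), Token(2, "董事长", 1, "VOB")],
    ]
    for triple in extract_sentence(clauses):
        print(triple)
```

In this sketch the second clause has no subject of its own, so the subject of the first clause is reused, which keeps each simplified clause complete enough to yield a triple. A real system would obtain the parses from a dependency parser and use a much richer rule set.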



    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4
      July 2021
      419 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3465463

      Copyright © 2020 Association for Computing Machinery.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 June 2021
      • Accepted: 1 February 2021
      • Revised: 1 January 2021
      • Received: 1 September 2019

      Qualifiers

      • research-article
      • Refereed
