Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text

Authors:
Shanshan Qi
College of Information and Electrical Engineering, China Agricultural University, Beijing, China

Limin Zheng
College of Information and Electrical Engineering, China Agricultural University, Beijing, China

Feiyu Shang
College of Information and Electrical Engineering, China Agricultural University, Beijing, China
ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4, Article No.: 67, pp 1–34
https://doi.org/10.1145/3450273

Published: 09 June 2021

Abstract

Open Relation Extraction (ORE) plays a significant role in Information Extraction. Unlike traditional relation extraction, which requires relation types to be pre-defined in an annotated corpus and is restricted to specific domains, ORE aims to extract entities and the relations between them in the open domain. However, as sentence complexity increases, the precision and recall of entity relation extraction drop significantly. To address this problem, we present Clause_CORE, an unsupervised method based on Chinese grammar and dependency parsing features. Clause_CORE processes complex sentences by decomposing them and dynamically complementing missing sentence components, which reduces sentence complexity while preserving sentence integrity. We then perform dependency parsing on the resulting complete sentences and carry out open entity relation extraction with a model built from Chinese grammar rules. Experimental results show that Clause_CORE outperforms other advanced Chinese ORE systems on Wikipedia and Sina news datasets, demonstrating the correctness and effectiveness of the method. Results on mixed datasets of news and encyclopedia data demonstrate that the method generalizes and is portable.
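
The pipeline described in the abstract (complete the missing components of a clause, then read entity-relation triples off a dependency parse with grammar rules) can be illustrated with a toy sketch. This is not the Clause_CORE implementation: the `Token` class, the `extract_triples` function, the single subject-inheritance rule, and the hand-written parse in `SENTENCE` are all assumptions made for illustration, and the labels `HED`, `SBV`, `VOB`, `COO`, and `WP` merely follow the naming style of LTP-like Chinese dependency parsers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    head: int  # index of the governing token, 0 for the root
    rel: str   # dependency label (LTP-style names, assumed here)


def child(tokens: List[Token], head_idx: int, rel: str) -> Optional[Token]:
    """Return the first dependent of head_idx carrying the label rel."""
    for t in tokens:
        if t.head == head_idx and t.rel == rel:
            return t
    return None


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Extract (subject, predicate, object) triples from one parsed sentence.

    A coordinated predicate (COO) without its own subject inherits the
    subject of the predicate it is attached to. This is a simplified,
    hypothetical stand-in for "complementing sentence components".
    """
    by_idx: Dict[int, Token] = {t.idx: t for t in tokens}
    triples = []
    for verb in tokens:
        if verb.rel not in ("HED", "COO"):
            continue
        subj = child(tokens, verb.idx, "SBV")
        gov = verb
        # Climb the coordination chain until a subject is found.
        while subj is None and gov.rel == "COO" and gov.head in by_idx:
            gov = by_idx[gov.head]
            subj = child(tokens, gov.idx, "SBV")
        obj = child(tokens, verb.idx, "VOB")
        if subj is not None and obj is not None:
            triples.append((subj.word, verb.word, obj.word))
    return triples


# Hand-written parse of "小明 喜欢 音乐 ， 学习 钢琴"
# ("Xiao Ming likes music, studies piano"); not produced by a real parser.
SENTENCE = [
    Token(1, "小明", 2, "SBV"),
    Token(2, "喜欢", 0, "HED"),
    Token(3, "音乐", 2, "VOB"),
    Token(4, "，", 2, "WP"),
    Token(5, "学习", 2, "COO"),
    Token(6, "钢琴", 5, "VOB"),
]

for triple in extract_triples(SENTENCE):
    print(triple)
```

In this sketch the second clause has no subject of its own, so the triple for 学习 is only recoverable once the subject 小明 is copied over from the first clause, which is the kind of loss that component complementing is meant to prevent when a complex sentence is split.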

Index Terms

Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 20, Issue 4
July 2021
419 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3465463
Editor:
Imed Zitouni
Google, USA
Copyright © 2020 Association for Computing Machinery.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 June 2021
- Accepted: 1 February 2021
- Revised: 1 January 2021
- Received: 1 September 2019
Author Tags
Open entity relation extraction
dependency parsing
complex sentences processed
Chinese grammar rules
unsupervised
Qualifiers
- research-article
- Refereed