Abstract
Open Relation Extraction (ORE) plays a significant role in Information Extraction. It removes two limitations of traditional relation extraction, namely that relation types must be pre-defined in an annotated corpus and that extraction is restricted to specific domains, so that entities and the relations between them can be extracted in the open domain. However, as sentence complexity increases, the precision and recall of entity relation extraction drop significantly. To address this problem, we present Clause_CORE, an unsupervised method based on Chinese grammar and dependency parsing features. Clause_CORE processes complex sentences by decomposing them and dynamically complementing missing sentence components, which reduces sentence complexity while preserving sentence integrity. We then perform dependency parsing on the completed sentences and carry out open entity relation extraction with a model built from Chinese grammar rules. Experimental results show that Clause_CORE outperforms other advanced Chinese ORE systems on Wikipedia and Sina news datasets, demonstrating the correctness and effectiveness of the method. Results on mixed datasets of news and encyclopedia data demonstrate its generalization and portability.
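To make the decompose-then-complement idea concrete, the following is a minimal sketch, not the Clause_CORE implementation. It assumes pre-segmented, POS-tagged input with a hypothetical coarse tag set (`n`, `v`, `p`, `wp`), splits a complex sentence into clauses at punctuation, copies the previous clause's subject into a clause that lacks one, and then applies a toy noun-verb-noun pattern in place of the dependency-parsing rules the abstract describes. All function names are illustrative.

```python
from typing import List, Optional, Tuple

# (word, POS tag). The tag set "n", "v", "p", "wp" is a hypothetical
# simplification, not the tag set of any particular parser.
Token = Tuple[str, str]

CLAUSE_DELIMITERS = {"，", "；", "。"}


def split_clauses(tokens: List[Token]) -> List[List[Token]]:
    """Decompose a sentence into clauses at clause-level punctuation."""
    clauses: List[List[Token]] = []
    current: List[Token] = []
    for word, pos in tokens:
        if word in CLAUSE_DELIMITERS:
            if current:
                clauses.append(current)
            current = []
        else:
            current.append((word, pos))
    if current:
        clauses.append(current)
    return clauses


def find_subject(clause: List[Token]) -> Optional[Token]:
    """Return the first noun that precedes the first verb, if any."""
    for word, pos in clause:
        if pos == "v":
            return None
        if pos == "n":
            return (word, pos)
    return None


def complement_subjects(clauses: List[List[Token]]) -> List[List[Token]]:
    """Prepend the most recent subject to clauses that have none."""
    completed: List[List[Token]] = []
    last_subject: Optional[Token] = None
    for clause in clauses:
        subject = find_subject(clause)
        if subject is not None:
            last_subject = subject
        elif last_subject is not None:
            clause = [last_subject] + clause
        completed.append(clause)
    return completed


def extract_triple(clause: List[Token]) -> Optional[Tuple[str, str, str]]:
    """Toy stand-in for dependency-based rules: subject, verb(+prep), object."""
    subject = find_subject(clause)
    verb_idx = next((i for i, (_, p) in enumerate(clause) if p == "v"), None)
    if subject is None or verb_idx is None:
        return None
    relation = clause[verb_idx][0]
    rest = clause[verb_idx + 1:]
    # Absorb a preposition directly after the verb into the relation phrase.
    if rest and rest[0][1] == "p":
        relation += rest[0][0]
        rest = rest[1:]
    obj = next((w for w, p in rest if p == "n"), None)
    if obj is None:
        return None
    return (subject[0], relation, obj)


# "Li Ming was born in Shanghai, graduated from Peking University."
sentence: List[Token] = [
    ("李明", "n"), ("出生", "v"), ("于", "p"), ("上海", "n"), ("，", "wp"),
    ("毕业", "v"), ("于", "p"), ("北京大学", "n"), ("。", "wp"),
]

clauses = complement_subjects(split_clauses(sentence))
triples = [t for t in map(extract_triple, clauses) if t is not None]
print(triples)
```

In this example the second clause has no subject of its own, so the sketch borrows 李明 from the first clause before extraction. A real system would replace the POS pattern with rules over a dependency parse of each completed clause.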
Index Terms
- Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text