Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data
Information Sciences, Pub Date: 2021-05-26, DOI: 10.1016/j.ins.2021.05.045
Dandan Song, Jing Xu, Jinhui Pang, Heyan Huang

Fundamental information extraction tasks, such as relation extraction and event detection, suffer from a data imbalance problem. To alleviate this problem, existing methods rely mostly on well-designed loss functions to reduce the negative influence of imbalanced data. However, this approach requires additional hyper-parameters and limits scalability. Furthermore, these methods can only benefit specific tasks and do not provide a unified framework across relation extraction and event detection. In this paper, a Classifier-Adaptation Knowledge Distillation (CAKD) framework is proposed to address these issues, thus improving relation extraction and event detection performance. The first step is to exploit sentence-level identification information across relation extraction and event detection, which can reduce identification errors caused by the data imbalance problem without relying on additional hyper-parameters. Moreover, this sentence-level identification information is used by a teacher network to guide the baseline model’s training by sharing its classifier. Like an instructor, the classifier improves the baseline model’s ability to extract this sentence-level identification information from raw texts, thus benefiting overall performance. Experiments were conducted on both relation extraction and event detection using the Text Analysis Conference Relation Extraction Dataset (TACRED) and Automatic Content Extraction (ACE) 2005 English datasets, respectively. The results demonstrate the effectiveness of the proposed framework.
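
The abstract gives no equations, so the following is only a generic illustration of the idea that a teacher and a baseline (student) model can share one classifier during distillation. It is a minimal sketch under stated assumptions, not the CAKD implementation: the function names, the feature vectors, and the unweighted sum of the two loss terms are all hypothetical. It assumes the teacher's features already encode the sentence-level identification information, while the student's features come from raw text alone.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Apply a linear classifier: one weight row and one bias per class."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(classifier, teacher_features, student_features, label):
    """Student loss when both networks pass through the SAME classifier.

    Assumption (not from the paper): the total is the student's
    cross-entropy plus an unweighted KL term pulling the student's
    prediction toward the teacher's.
    """
    weights, bias = classifier
    teacher_probs = softmax(linear(weights, bias, teacher_features))
    student_probs = softmax(linear(weights, bias, student_features))
    return (
        cross_entropy(student_probs, label)
        + kl_divergence(teacher_probs, student_probs)
    )


if __name__ == "__main__":
    # Toy two-class classifier shared by teacher and student (hypothetical values).
    shared_classifier = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    loss = distillation_loss(shared_classifier, [2.0, 0.5], [0.5, 2.0], label=0)
    print(f"student loss with teacher guidance: {loss:.4f}")
```

Because the classifier is shared, the KL term vanishes only when the student's features lead to the same prediction as the teacher's, which is one simple way to read "guiding the baseline model's training by sharing its classifier".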




Updated: 2021-06-11