Picking BERT's Brain: Probing for Linguistic Dependencies in Contextualized Embeddings Using Representational Similarity Analysis
arXiv - CS - Computation and Language. Pub Date: 2020-11-24, DOI: arxiv-2011.12073
Michael A. Lepori, R. Thomas McCoy

As the name implies, contextualized representations of language are typically motivated by their ability to encode context. Which aspects of context are captured by such representations? We introduce an approach to address this question using Representational Similarity Analysis (RSA). As case studies, we investigate the degree to which a verb embedding encodes the verb's subject, a pronoun embedding encodes the pronoun's antecedent, and a full-sentence representation encodes the sentence's head word (as determined by a dependency parse). In all cases, we show that BERT's contextualized embeddings reflect the linguistic dependency being studied, and that BERT encodes these dependencies to a greater degree than it encodes less linguistically-salient controls. These results demonstrate the ability of our approach to adjudicate between hypotheses about which aspects of context are encoded in representations of language.
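The core tool named above, Representational Similarity Analysis, can be sketched in a few lines: build a representational dissimilarity matrix (RDM) of pairwise distances for each representation space, then correlate the RDMs to measure how similarly the two spaces arrange the same stimuli. The sketch below is not the authors' code; the random vectors are stand-ins for BERT embeddings, a correlated hypothesis model, and an unrelated control model.

```python
# Minimal RSA sketch: compare how two representation spaces arrange the
# same set of stimuli. Vectors here are random stand-ins, not real
# BERT embeddings or the paper's hypothesis models.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(reps):
    """Representational dissimilarity matrix: pairwise cosine distances
    between rows, returned as the condensed upper triangle."""
    return pdist(reps, metric="cosine")

def rsa_score(reps_a, reps_b):
    """Spearman correlation between two RDMs; higher means the two
    spaces impose a more similar geometry on the stimuli."""
    rho, _ = spearmanr(rdm(reps_a), rdm(reps_b))
    return rho

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 768))      # stand-in for 20 verb embeddings
hypothesis = embeddings[:, :200]             # model sharing structure with them
control    = rng.normal(size=(20, 200))      # unrelated control model

print(f"hypothesis RSA: {rsa_score(embeddings, hypothesis):.3f}")
print(f"control RSA:    {rsa_score(embeddings, control):.3f}")
```

In the paper's setting, the "hypothesis model" would encode the linguistic dependency under study (e.g. the verb's subject) and the "control model" a less linguistically salient alternative; a higher RSA score for the hypothesis model is evidence that the embeddings encode that dependency.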

Updated: 2020-11-25