Multimodal Machine Learning for Natural Language Processing: Disambiguating Prepositional Phrase Attachments with Images
Neural Processing Letters (IF 2.6), Pub Date: 2020-07-29, DOI: 10.1007/s11063-020-10314-8
Sebastien Delecraz, Leonor Becerra-Bonache, Benoit Favre, Alexis Nasr, Frederic Bechet

Although documents are increasingly multimodal, their automatic processing is often monomodal: natural language processing tasks are typically performed on the textual modality alone. This work extends syntactic parsing to use the image modality in addition to text. Specifically, we address prepositional phrase attachment, a hard problem for syntactic parsers that is semantic in nature. Given an image and a caption, the proposed approach resolves the syntactic attachment of prepositions in the parse tree using both visual and lexical features. Visual features are derived from the nature and position of objects detected in the image and aligned to textual phrases in the caption. A reranker uses this information to reorder the syntactic trees produced by a shift-reduce parser. Trained on the Flickr-PP corpus, which contains multimodal gold-standard attachments, the approach improves on a text-only syntactic parser, in particular for the subset of prepositions that encode location, where attachment accuracy increases by up to 17 points.
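As a rough illustration of the reranking idea described in the abstract, the Python sketch below reorders candidate attachment sites for a prepositional phrase by adding a visual bonus to each candidate's parser score. It is not the authors' implementation: the `Box` and `Candidate` types, the `coverage` feature, the `visual_weight` parameter, and every score and coordinate are hypothetical, chosen only to show how a spatial cue could overturn a text-only preference.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a detected object, in pixels (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass(frozen=True)
class Candidate:
    """One attachment hypothesis for the preposition (hypothetical type)."""
    head: str            # word the prepositional phrase would attach to
    parser_score: float  # score assigned by the text-only parser


def coverage(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (assumed visual feature)."""
    iw = min(inner.x2, outer.x2) - max(inner.x1, outer.x1)
    ih = min(inner.y2, outer.y2) - max(inner.y1, outer.y1)
    area = (inner.x2 - inner.x1) * (inner.y2 - inner.y1)
    if iw <= 0 or ih <= 0 or area <= 0:
        return 0.0
    return (iw * ih) / area


def rerank(candidates: List[Candidate], pp_object: str,
           boxes: Dict[str, Box], visual_weight: float = 1.0) -> List[Candidate]:
    """Sort candidates by parser score plus a weighted visual bonus, best first."""
    obj_box = boxes.get(pp_object)

    def score(c: Candidate) -> float:
        head_box = boxes.get(c.head)
        # Words with no aligned image region (e.g. verbs) get no visual bonus.
        if head_box is None or obj_box is None:
            return c.parser_score
        return c.parser_score + visual_weight * coverage(obj_box, head_box)

    return sorted(candidates, key=score, reverse=True)


# Made-up caption: "a girl pets a dog with a collar".
# The text-only scores slightly prefer attaching "with a collar" to the verb.
candidates = [Candidate("pets", 0.6), Candidate("dog", 0.5)]
boxes = {"dog": Box(100, 100, 300, 300), "collar": Box(150, 120, 220, 160)}

text_only = rerank(candidates, "collar", boxes, visual_weight=0.0)
multimodal = rerank(candidates, "collar", boxes, visual_weight=1.0)
```

In this toy setup the collar's box lies entirely inside the dog's box, so the visual bonus moves the noun attachment ahead of the verb attachment. A real reranker would learn its weights from annotated data and would use richer features than a single overlap ratio.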




Updated: 2020-07-29