Image retrieval with mixed initiative and multimodal feedback
Computer Vision and Image Understanding (IF 4.5), Pub Date: 2021-03-26, DOI: 10.1016/j.cviu.2021.103204
Nils Murrugarra-Llerena, Adriana Kovashka

How would you search for a unique, flamboyant shoe that a friend wore and you want to buy? What if you did not take a picture? Existing approaches propose interactive image search, but they either entrust the user with taking the initiative to provide informative feedback, or give all control to the system, which determines informative questions to ask. Instead, we propose a mixed-initiative framework where both the user and the system can be active participants, depending on whose input will be more beneficial for obtaining high-quality search results. We develop a reinforcement learning approach which dynamically decides which of four interaction opportunities to give to the user: drawing a sketch, marking images as relevant or not, providing free-form attribute feedback, or answering attribute-based questions. By allowing these four options, our system optimizes both the informativeness of feedback and the user's ability to explore the data, enabling faster image retrieval. We outperform five baselines on three datasets across a wide range of experimental settings.
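
The abstract describes a reinforcement-learning policy that, at each turn, selects one of four interaction modes. As a rough illustration only, below is a minimal epsilon-greedy sketch of such a selector in Python; the reward signal (rank improvement of the target image), the class names, and the simulated search loop are hypothetical assumptions for exposition, not the authors' implementation.

import random
from enum import Enum

class Interaction(Enum):
    SKETCH = 0              # user draws a sketch of the target
    RELEVANCE = 1           # user marks retrieved images as relevant or not
    FREE_ATTRIBUTE = 2      # user volunteers free-form attribute feedback
    ATTRIBUTE_QUESTION = 3  # system asks an attribute-based question

class MixedInitiativeAgent:
    """Epsilon-greedy selector over the four interaction opportunities.

    Q-values are updated from a reward measuring how much the latest
    feedback improved retrieval (here: relative rank improvement).
    """

    def __init__(self, epsilon=0.1, lr=0.5):
        self.epsilon = epsilon
        self.lr = lr
        self.q = {a: 0.0 for a in Interaction}

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best mode.
        if random.random() < self.epsilon:
            return random.choice(list(Interaction))
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Incremental value update toward the observed reward.
        self.q[action] += self.lr * (reward - self.q[action])

# Hypothetical search session: each turn, the agent picks an interaction,
# the (simulated) feedback re-ranks results, and the rank gain is the reward.
agent = MixedInitiativeAgent()
prev_rank = 100
for turn in range(5):
    action = agent.choose()
    new_rank = max(1, prev_rank - random.randint(0, 30))  # stand-in for re-ranking
    reward = (prev_rank - new_rank) / prev_rank
    agent.update(action, reward)
    prev_rank = new_rank

In practice the choice would also condition on the session state (e.g., how many rounds have passed and how informative past feedback was), which is what makes a learned policy preferable to a fixed interaction schedule.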




Updated: 2021-04-13