Pictionary-Style Word Guessing on Hand-Drawn Object Sketches: Dataset, Analysis and Deep Network Models, IEEE Transactions on Pattern Analysis and Machine Intelligence


Pictionary-Style Word Guessing on Hand-Drawn Object Sketches: Dataset, Analysis and Deep Network Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 10-25-2018, DOI: 10.1109/tpami.2018.2877996
Ravi Kiran Sarvadevabhatla, Shiv Surya, Trisha Mittal, R. Venkatesh Babu

The ability of intelligent agents to play games in a human-like fashion is popularly considered a benchmark of progress in Artificial Intelligence. In our work, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game. We first introduce Sketch-QA, a guessing task. Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data. Sketch-QA involves asking a fixed question (“What object is being drawn?”) and gathering open-ended guess-words from human guessers. We analyze the resulting dataset and present many interesting findings therein. To mimic Pictionary-style guessing, we propose a deep neural model which generates guess-words in response to temporally evolving human-drawn object sketches. Our model even makes human-like mistakes while guessing, thus amplifying the human mimicry factor. We evaluate our model on the large-scale guess-word dataset generated via the Sketch-QA task and compare it with various baselines. We also conduct a Visual Turing Test to obtain human impressions of the guess-words generated by humans and our model. Experimental results demonstrate the promise of our approach for Pictionary and similarly themed games.


Updated: 2024-08-22
