PathVQA: 30000+ Questions for Medical Visual Question Answering,arXiv - CS - Computation and Language


PathVQA: 30000+ Questions for Medical Visual Question Answering
arXiv - CS - Computation and Language. Pub Date: 2020-03-07, DOI: arxiv-2003.10286
Xuehai He, Yichen Zhang, Luntian Mou, Eric Xing, Pengtao Xie

Is it possible to develop an "AI Pathologist" that can pass the board certification examination of the American Board of Pathology? To achieve this goal, the first step is to create a visual question answering (VQA) dataset in which the AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Our work makes the first attempt to build such a dataset. Unlike general-domain VQA datasets, where images are widely accessible and many crowdsourcing workers are available and capable of generating question-answer pairs, a medical VQA dataset is much more challenging to develop. First, due to privacy concerns, pathology images are usually not publicly available. Second, only well-trained pathologists can understand pathology images, but they rarely have time to help create datasets for AI research. To address these challenges, we resort to pathology textbooks and online digital libraries. We develop a semi-automated pipeline to extract pathology images and captions from textbooks and generate question-answer pairs from captions using natural language processing. We collect 32,799 open-ended questions from 4,998 pathology images, where each question is manually checked to ensure correctness. To the best of our knowledge, this is the first dataset for pathology VQA. Our dataset will be released publicly to promote research in medical VQA.
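The abstract does not describe how question-answer pairs are derived from captions, so the following is only a minimal sketch of the general idea, assuming a single hand-written template. The pattern `SHOWING`, the function `caption_to_qa`, and the example caption are all hypothetical and are not taken from the paper; the authors' pipeline may work quite differently.

```python
import re
from typing import List, Tuple

# Hypothetical template: "<subject> showing <finding>".
# This is an illustrative assumption, not the rule set used for PathVQA.
SHOWING = re.compile(
    r"^(?P<subject>.+?)\s+showing\s+(?P<finding>.+?)\.?$",
    re.IGNORECASE,
)


def caption_to_qa(caption: str) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs derived from one image caption."""
    match = SHOWING.match(caption.strip())
    if match is None:
        # The caption does not fit the toy template; produce nothing.
        return []
    subject = match.group("subject").lower()
    finding = match.group("finding")
    return [
        (f"What does the {subject} show?", finding),
        (f"Does the {subject} show {finding}?", "yes"),
    ]


# Made-up caption used only to exercise the function.
pairs = caption_to_qa("Liver showing fatty change.")
for question, answer in pairs:
    print(question, "->", answer)
```

A template like this covers only captions with one fixed phrasing, which is one reason a manual check of every generated question, as the abstract reports, would still be needed.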


Updated: 2020-03-24
