Improving the Accessibility of Scientific Documents: Current State, User Needs, and a System Solution to Enhance Scientific PDF Accessibility for Blind and Low Vision Users
arXiv - CS - Digital Libraries Pub Date : 2021-04-30 , DOI: arxiv-2105.00076
Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie Yu-Yen Cheng, Chelsea Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda Wagner, Daniel S. Weld

The majority of scientific papers are distributed in PDF, which poses challenges for accessibility, especially for blind and low vision (BLV) readers. We characterize the scope of this problem by assessing the accessibility of 11,397 PDFs published 2010-2019 sampled across various fields of study, finding that only 2.4% of these PDFs satisfy all of our defined accessibility criteria. We introduce the SciA11y system to offset some of the issues around inaccessibility. SciA11y incorporates several machine learning models to extract the content of scientific PDFs and render this content as accessible HTML, with added novel navigational features to support screen reader users. An intrinsic evaluation of extraction quality indicates that the majority of HTML renders (87%) produced by our system have either no readability issues or only some. We perform a qualitative user study to understand the needs of BLV researchers when reading papers, and to assess whether the SciA11y system could address these needs. We summarize our user study findings into a set of five design recommendations for accessible scientific reader systems. User response to SciA11y was positive, with all users saying they would be likely to use the system in the future, and some stating that the system, if available, would become their primary workflow. We successfully produce HTML renders for over 12M papers, of which an open access subset of 1.5M are available for browsing at https://scia11y.org/.

Updated: 2021-05-04