Can you hear me $\textit{now}$? Sensitive comparisons of human and machine perception


arXiv - CS - Sound. Pub Date: 2020-03-27, DOI: arxiv-2003.12362
Michael A Lepori and Chaz Firestone

The rise of sophisticated machine-recognition systems has brought with it a rise in comparisons between human and machine perception. But such comparisons face an asymmetry: Whereas machine perception of some stimulus can often be probed through direct and explicit measures, much of human perceptual knowledge is latent, incomplete, or embedded in unconscious mental processes that may not be available for explicit report. Here, we show how this asymmetry can cause such comparisons to underestimate the overlap in human and machine perception. As a case study, we consider human perception of $\textit{adversarial speech}$ -- synthetic audio commands that are recognized as valid messages by automated speech-recognition systems but that human listeners reportedly hear as meaningless noise. In five experiments, we adapt task designs from the human psychophysics literature to show that even when subjects cannot freely transcribe adversarial speech (the previous benchmark for human understanding), they nevertheless $\textit{can}$ discriminate adversarial speech from closely matched non-speech (Experiments 1-2), finish common phrases begun in adversarial speech (Experiments 3-4), and solve simple math problems posed in adversarial speech (Experiment 5) -- even for stimuli previously described as "unintelligible to human listeners". We recommend the adoption of $\textit{sensitive tests}$ of human and machine perception, and discuss the broader consequences of this approach for comparing natural and artificial intelligence.


Updated: 2020-03-30
