SUPERB: Speech processing Universal PERformance Benchmark
arXiv - CS - Sound. Pub Date: 2021-05-03. DOI: arxiv-2105.01051. Authors: Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee
Self-supervised learning (SSL) has proven vital for advancing research in
natural language processing (NLP) and computer vision (CV). The paradigm
pretrains a shared model on large volumes of unlabeled data and achieves
state-of-the-art (SOTA) performance on various tasks with minimal adaptation. However, the
speech processing community lacks a similar setup to systematically explore the
paradigm. To bridge this gap, we introduce Speech processing Universal
PERformance Benchmark (SUPERB). SUPERB is a leaderboard to benchmark the
performance of a shared model across a wide range of speech processing tasks
with minimal architecture changes and labeled data. Among multiple usages of
the shared model, we especially focus on extracting the representation learned
from SSL due to its preferable re-usability. We present a simple framework to
solve SUPERB tasks by learning task-specialized lightweight prediction heads on
top of the frozen shared model. Our results demonstrate that the framework is
promising as SSL representations show competitive generalizability and
accessibility across SUPERB tasks. We release SUPERB as a challenge with a
leaderboard and a benchmark toolkit to fuel the research in representation
learning and general speech processing.
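The probing setup the abstract describes (a frozen shared model with a task-specialized lightweight prediction head, where only the head is trained) can be sketched as follows. This is a toy stand-in under assumed shapes and names, not the benchmark's actual encoder or toolkit code: the real setup uses pretrained SSL encoders and task heads defined by the SUPERB toolkit.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 16   # dimension of the frozen SSL representation (assumed)
CLASSES = 4   # small downstream label set, e.g. keyword spotting (assumed)

# Stand-in for a frozen pretrained encoder: a fixed projection whose
# weights are never updated during downstream training.
W_frozen = rng.standard_normal((8, HIDDEN))

def frozen_encoder(x):
    # x: (batch, 8) raw features -> (batch, HIDDEN) representations
    return np.tanh(x @ W_frozen)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy labeled data for one downstream task; labels are generated to be
# linearly predictable from the frozen representations.
x = rng.standard_normal((64, 8))
h = frozen_encoder(x)                      # computed once; encoder stays frozen
W_true = rng.standard_normal((HIDDEN, CLASSES))
y = (h @ W_true).argmax(axis=1)

# Lightweight task head: the ONLY trainable parameters in this setup.
W_head = np.zeros((HIDDEN, CLASSES))

# Train the head alone with gradient descent on cross-entropy loss.
for _ in range(200):
    p = softmax(h @ W_head)
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0      # dL/dlogits for cross-entropy
    W_head -= 0.5 * (h.T @ grad) / len(y)

acc = float((softmax(h @ W_head).argmax(axis=1) == y).mean())
```

Because the encoder is shared and frozen, its representations can be computed once and reused across every downstream task; only the small per-task head is fit, which is what makes the benchmark's "minimal architecture changes and labeled data" comparison possible.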
Updated: 2021-05-04