Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion
arXiv - CS - Computation and Language. Pub Date: 2020-06-25. DOI: arxiv-2006.14194. Authors: Alex Sokolov, Tracy Rohlin, Ariya Rastrow
Grapheme-to-phoneme (G2P) models are a key component in Automatic Speech
Recognition (ASR) systems, such as the ASR system in Alexa, as they are used to
generate pronunciations for out-of-vocabulary words that do not exist in the
pronunciation lexicons (mappings like "e c h o" to "E k oU"). Most G2P systems
are monolingual and based on traditional joint-sequence n-gram models
[1,2]. As an alternative, we present a single end-to-end trained neural G2P
model that shares the same encoder and decoder across multiple languages. This
allows the model to utilize a combination of universal symbol inventories of
Latin-like alphabets and cross-linguistically shared feature representations.
Such a model is especially useful in scenarios involving low-resource languages and
code switching/foreign words, where the pronunciations in one language need to
be adapted to other locales or accents. We further experiment with a word
language distribution vector as an additional training target in order to
improve system performance by helping the model decouple pronunciations across
a variety of languages in the parameter space. We show a 7.2% average improvement
in phoneme error rate on low-resource languages and no degradation on
high-resource ones compared to monolingual baselines.
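
The abstract reports its result as phoneme error rate (PER) but does not spell out how it is scored. As a rough illustration only, the sketch below uses the common definition: the total edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The function names and the sample predictions are hypothetical; only the "E k oU" pronunciation comes from the abstract, and the paper's actual scoring procedure may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs):
    """Corpus-level PER: total edits divided by total reference phonemes.

    `pairs` is an iterable of (reference, hypothesis) strings whose
    phonemes are separated by spaces, e.g. "E k oU".
    """
    total_edits = 0
    total_ref = 0
    for ref, hyp in pairs:
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_ref += len(ref_tokens)
    if total_ref == 0:
        raise ValueError("no reference phonemes to score against")
    return total_edits / total_ref


if __name__ == "__main__":
    # Hypothetical model outputs, invented purely for illustration.
    sample = [
        ("E k oU", "E k oU"),  # exact match
        ("E k oU", "E g oU"),  # one substitution
    ]
    print(phoneme_error_rate(sample))
```

Scoring over space-separated tokens rather than characters matters here, because multi-character phoneme symbols such as "oU" should count as a single unit.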
Updated: 2020-06-30