Robust Prediction of Punctuation and Truecasing for Medical ASR
arXiv - CS - Sound. Pub Date: 2020-07-04, DOI: arxiv-2007.02025
Monica Sunkara, Srikanth Ronanki, Kalpit Dixit, Sravan Bodapati, Katrin Kirchhoff

Automatic speech recognition (ASR) systems in the medical domain that focus on transcribing clinical dictations and doctor-patient conversations face many challenges due to the complexity of the domain. ASR output typically undergoes automatic punctuation to enable users to speak naturally, without having to vocalise awkward and explicit punctuation commands such as "period", "add comma" or "exclamation point", while truecasing enhances readability and improves the performance of downstream NLP tasks. This paper proposes a conditional joint modeling framework for the prediction of punctuation and truecasing using pretrained masked language models such as BERT, BioBERT and RoBERTa. We also present techniques for domain- and task-specific adaptation by fine-tuning masked language models with medical domain data. Finally, we improve the robustness of the model against common ASR errors by performing data augmentation. Experiments on dictation- and conversational-style corpora show that our proposed model achieves a ~5% absolute improvement on ground-truth text and a ~10% improvement on ASR outputs over baseline models on the F1 metric.
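To make the task concrete, here is a minimal sketch (not the authors' exact pipeline) of how ground-truth text can be converted into the per-token punctuation and case labels that a joint tagging model like the one described would predict, along with a crude stand-in for ASR-style data augmentation. All function names and the label inventory are illustrative assumptions.

```python
import random
import string

# Hypothetical label inventory for the punctuation head of a joint tagger.
PUNCT_LABELS = {".": "PERIOD", ",": "COMMA", "?": "QUESTION"}

def make_labels(text):
    """Turn cased, punctuated text into (lower_token, punct_label, case_label)
    triples — the training targets for a joint punctuation/truecasing model."""
    triples = []
    for raw in text.split():
        punct = raw[-1] if raw[-1] in PUNCT_LABELS else ""
        word = raw.rstrip(string.punctuation)
        if word.isupper() and len(word) > 1:
            case = "UPPER"      # e.g. acronyms such as "ASR"
        elif word[:1].isupper():
            case = "CAP"        # capitalized first letter
        else:
            case = "LOWER"
        triples.append((word.lower(), PUNCT_LABELS.get(punct, "O"), case))
    return triples

def asr_noise(tokens, drop_prob=0.1, rng=None):
    """Crude stand-in for ASR-error augmentation: randomly drop tokens so the
    model sees inputs that resemble imperfect ASR output during training."""
    rng = rng or random.Random(0)
    return [t for t in tokens if rng.random() > drop_prob]
```

For example, `make_labels("Patient denies chest pain, nausea.")` tags `pain` with `COMMA` and `nausea` with `PERIOD`, while `patient` gets the case label `CAP` — lowercased, unpunctuated tokens paired with the labels the model must recover.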

Updated: 2020-07-14