Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach
arXiv - CS - Sound Pub Date : 2021-07-16 , DOI: arxiv-2107.08028
Jan Berg, Konstantinos Drossos

Automated audio captioning (AAC) is the task of automatically creating textual descriptions (i.e. captions) for the contents of a general audio signal. Most AAC methods use existing datasets for optimization and/or evaluation. Given the limited information held by the AAC datasets, it is very likely that AAC methods learn only the information contained in the utilized datasets. In this paper we present a first approach for continuously adapting an AAC method to new information, using a continual learning method. In our scenario, a pre-optimized AAC method is applied to previously unseen general audio signals and can update its parameters to adapt to the new information, given a new reference caption. We evaluate our method using a freely available, pre-optimized AAC method and two freely available AAC datasets. We compare our proposed method against three baseline scenarios: two in which the model is trained on one of the datasets and evaluated on the other, and a third in which the model is trained on one dataset and fine-tuned on the other. Obtained results show that our method achieves a good balance between distilling new knowledge and not forgetting the previous one.
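The adaptation described above follows the Learning Without Forgetting (LwF) idea: when updating on a new reference caption, the model is penalized both for its error on the new target and for drifting away from the outputs of a frozen copy of the pre-optimized model. The following is a minimal, simplified per-prediction sketch of such a combined objective; the weighting factor `lam` and temperature `temp` are assumed hyperparameters, not values from the paper, and this is an illustration of the general LwF loss, not the authors' implementation.

```python
import math

def softmax(logits, temp=1.0):
    # Temperature-scaled softmax over a list of raw scores.
    exps = [math.exp(l / temp) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_idx):
    # Negative log-likelihood of the correct class (e.g. the next caption token).
    return -math.log(probs[target_idx])

def distillation_loss(teacher_logits, student_logits, temp=2.0):
    # KL divergence between the softened distributions of the frozen
    # pre-optimized model (teacher) and the adapting model (student).
    p = softmax(teacher_logits, temp)
    q = softmax(student_logits, temp)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def lwf_loss(student_logits, target_idx, teacher_logits, lam=1.0, temp=2.0):
    # New-knowledge term: fit the new reference caption.
    new_task = cross_entropy(softmax(student_logits), target_idx)
    # Retention term: stay close to the frozen pre-optimized model's outputs.
    retain = distillation_loss(teacher_logits, student_logits, temp)
    return new_task + lam * retain
```

When the student has not yet moved away from the teacher (identical logits), the retention term is zero and the loss reduces to the ordinary cross-entropy on the new caption; as the student drifts, the distillation term grows, which is what balances adapting to new data against forgetting.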

Updated: 2021-07-19