Asymmetric Contrastive Learning for Audio Fingerprinting, IEEE Signal Processing Letters



IEEE Signal Processing Letters (IF 3.9), Pub Date: 2022-08-24, DOI: 10.1109/lsp.2022.3201430
Xinyu Wu, Hongxia Wang


Audio fingerprinting methods compress audio content into compact signatures, which saves storage and reduces query time. The technology is widely used in fields such as audio retrieval, music information retrieval, and audio authentication. However, most existing methods do not balance recognition accuracy, query speed, and storage size well. This letter presents a novel self-supervised learning scheme, called asymmetric contrastive learning, that generates binary hash fingerprints of audio segments. We also design a new loss function, named bidirectional asymmetric pairwise loss, to minimize the loss of information. Experimental results show that the scheme achieves a high top-1 hit rate on both music and speech datasets. Furthermore, the proposed scheme outperforms previous work on real-valued fingerprinting in query speed and storage size.
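To illustrate why binary hash fingerprints can help with storage size and query speed, here is a minimal sketch in Python with NumPy. It is not the method from the letter: the sign-threshold binarization, the 128-dimensional random embeddings, the database size, and the helper names `binarize` and `hamming_top1` are all assumptions made for illustration. The sketch packs each real-valued vector into bits and answers a top-1 query using Hamming distance.

```python
import numpy as np

# Popcount lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold embeddings at zero and pack 8 bits into each byte.

    Assumption: a plain sign threshold stands in for a learned hash layer.
    """
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_top1(query: np.ndarray, database: np.ndarray) -> int:
    """Return the index of the database row closest to the query in Hamming distance."""
    xor = np.bitwise_xor(database, query)   # differing bits, shape (n, n_bytes)
    dists = POPCOUNT[xor].sum(axis=1)       # count differing bits per row
    return int(np.argmin(dists))


# Hypothetical data: 1000 random 128-dimensional "fingerprints".
rng = np.random.default_rng(0)
db_real = rng.standard_normal((1000, 128)).astype(np.float32)
db_bits = binarize(db_real)

# A lightly perturbed copy of item 42 plays the role of a distorted query.
query_real = db_real[42] + 0.1 * rng.standard_normal(128).astype(np.float32)
query_bits = binarize(query_real[None, :])[0]

best = hamming_top1(query_bits, db_bits)
ratio = db_real.nbytes // db_bits.nbytes    # float32 storage vs packed bits

print("top-1 match:", best)
print("storage ratio (float32 / binary):", ratio)
```

In this toy setup each 128-dimensional float32 vector takes 512 bytes, while its packed binary code takes 16 bytes, and the distance computation reduces to XOR plus a table lookup. How much accuracy survives binarization depends on how the codes are learned, which is what the letter's training scheme and loss address.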


Updated: 2022-08-24
