Detection of Audio-Video Synchronization Errors Via Event Detection
arXiv - CS - Multimedia Pub Date : 2021-04-20 , DOI: arxiv-2104.10116
Joshua P. Ebenezer, Yongjun Wu, Hai Wei, Sriram Sethuraman, Zongyi Liu

We present a new method and a large-scale database to detect audio-video synchronization (A/V sync) errors in tennis videos. A deep network is trained to detect the visual signature of the tennis ball being hit by the racquet in the video stream. Another deep network is trained to detect the auditory signature of the same event in the audio stream. During evaluation, the audio network searches the audio stream for the sound of the ball being hit. If the event is found in the audio, the neighboring interval in the video is searched for the corresponding visual signature. If the event is found in the audio stream but not in the video stream, an A/V sync error is flagged. We developed a large-scale database of 504,300 frames from 6 hours of videos of tennis events, simulated A/V sync errors, and found that our method achieves high accuracy on the task.
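
The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two deep networks are replaced by precomputed lists of detection timestamps, and the function name `flag_av_sync_errors`, the argument names, and the 0.2-second search window are all assumptions made for illustration only.

```python
from bisect import bisect_left
from typing import List, Sequence


def flag_av_sync_errors(audio_hits: Sequence[float],
                        video_hits: Sequence[float],
                        window_s: float = 0.2) -> List[float]:
    """Return the audio hit times that have no video hit nearby.

    audio_hits / video_hits: timestamps (seconds) at which a hypothetical
    audio or video detector reported the ball being hit.
    window_s: assumed half-width of the neighboring video interval.
    """
    video_sorted = sorted(video_hits)
    flagged = []
    for t in audio_hits:
        # First video detection at or after the start of the search window.
        i = bisect_left(video_sorted, t - window_s)
        # The event counts as found if that detection is inside the window.
        found = i < len(video_sorted) and video_sorted[i] <= t + window_s
        if not found:
            flagged.append(t)  # heard but not seen: possible A/V sync error
    return flagged


# Example: the hit heard at 5.0 s has no visual match within 0.2 s.
print(flag_av_sync_errors([1.0, 5.0, 9.0], [1.05, 5.6, 8.9]))
```

Sorting the video detections once and using a binary search keeps the check cheap when many audio events are tested against a long video.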

Updated: 2021-04-21