Self-Supervised Progressive Network for High Performance Video Object Segmentation.
IEEE Transactions on Neural Networks and Learning Systems ( IF 10.4 ) Pub Date : 2022-11-16 , DOI: 10.1109/tnnls.2022.3219936
Guorong Li 1 , Dexiang Hong 1 , Kai Xu 1 , Bineng Zhong 2 , Li Su 1 , Zhenjun Han 3 , Qingming Huang 1

Recently, self-supervised video object segmentation (VOS) has attracted much interest. However, most proxy tasks are designed to train only a single backbone, which relies on a point-to-point correspondence strategy to propagate masks through a video sequence. Because this pipeline is so simple, the performance of the single-backbone paradigm remains unsatisfactory. Instead of following the previous literature, we propose a self-supervised progressive network (SSPNet), which consists of a memory retrieval module (MRM) and a collaborative refinement module (CRM). The MRM performs point-to-point correspondence and produces a propagated coarse mask for a query frame through self-supervised pixel-level and frame-level similarity learning. The CRM, which is trained via cycle-consistency region tracking, aggregates the reference and query information and implicitly learns the collaborative relationship between them to refine the coarse mask. Furthermore, to learn semantic knowledge from unlabeled data, we design two novel mask-generation strategies that provide the CRM with training data carrying meaningful semantic information. Extensive experiments on DAVIS-17, YouTube-VOS, and SegTrack v2 demonstrate that our method surpasses state-of-the-art self-supervised methods and narrows the gap with fully supervised methods.
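
The point-to-point correspondence strategy mentioned in the abstract can be illustrated with a generic affinity-based label propagation step. This is a minimal sketch of the general technique, not the SSPNet implementation: the function name `propagate_mask`, the cosine-similarity affinity, and the `temperature` value are all assumptions made for illustration, and the features are taken as already extracted by some backbone.

```python
import numpy as np

def propagate_mask(ref_feat, ref_mask, qry_feat, temperature=0.07):
    """Propagate soft labels from a reference frame to a query frame.

    ref_feat: (N_ref, C) per-pixel features of the reference frame
    ref_mask: (N_ref, K) one-hot or soft labels for K objects
    qry_feat: (N_qry, C) per-pixel features of the query frame
    returns:  (N_qry, K) soft labels for the query pixels

    Hypothetical helper: the affinity form and temperature are assumptions.
    """
    # L2-normalise so the dot product becomes cosine similarity.
    ref = ref_feat / (np.linalg.norm(ref_feat, axis=1, keepdims=True) + 1e-8)
    qry = qry_feat / (np.linalg.norm(qry_feat, axis=1, keepdims=True) + 1e-8)

    # Pairwise similarity between every query pixel and every reference pixel.
    sim = qry @ ref.T / temperature               # (N_qry, N_ref)

    # Softmax over reference pixels (shifted for numerical stability).
    sim = sim - sim.max(axis=1, keepdims=True)
    affinity = np.exp(sim)
    affinity = affinity / affinity.sum(axis=1, keepdims=True)

    # Each query pixel receives an affinity-weighted average of reference labels.
    return affinity @ ref_mask


# Toy usage: two reference pixels, one per object, and three query pixels.
ref_feat = np.array([[1.0, 0.0], [0.0, 1.0]])
ref_mask = np.array([[1.0, 0.0], [0.0, 1.0]])
qry_feat = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])

coarse = propagate_mask(ref_feat, ref_mask, qry_feat)
labels = coarse.argmax(axis=1)  # hard label per query pixel
```

A mask propagated this way is only as good as the per-pixel matches, which is the limitation the abstract attributes to single-backbone pipelines and the reason it adds a separate refinement stage.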

Updated: 2022-11-16