SSFNET-VOS: Semantic segmentation and fusion network for video object segmentation, Pattern Recognition Letters


Pattern Recognition Letters (IF 3.9), Pub Date: 2020-09-24, DOI: 10.1016/j.patrec.2020.09.028
Vipal Kumar Sharma, Roohie Naaz Mir

Most recent successful approaches to video object segmentation are highly complex, rely heavily on fine-tuning on the first frame, and are slow, which limits their practical use. In this work, we introduce a novel approach to video object segmentation using unsupervised learning. The process is divided into two phases, in which a base frame and the current frame are considered for segmentation. In the first phase, we generate coarse region proposals, bounding boxes, and scores. In the second phase, feature extraction is carried out, with an attention network incorporated for feature encoding. Finally, these features are scaled and fused using a Softmax operation to generate the object segmentation. The performance of the proposed approach is compared with several state-of-the-art techniques on the challenging DAVIS 2016 and 2017 datasets. The experimental study shows that the proposed semantic segmentation and fusion network for video object segmentation (SSFNET-VOS) achieves more accurate segmentation, with lower error, than the compared techniques.
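
The final fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how features are scaled or combined, so the code below assumes two per-pixel logit maps (one derived from the base frame, one from the current frame), a hypothetical pair of scaling weights `base_scale` and `current_scale`, a weighted sum as the fusion rule, and a Softmax over the class axis followed by an argmax to obtain the mask. All function and parameter names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_segment(base_logits, current_logits,
                     base_scale=0.5, current_scale=0.5):
    """Fuse two per-pixel logit maps and return (probabilities, mask).

    Inputs are nested lists indexed [row][col][class]. The scales are
    hypothetical weights chosen for illustration, not values from the paper.
    """
    probs, mask = [], []
    for base_row, cur_row in zip(base_logits, current_logits):
        prob_row, mask_row = [], []
        for base_px, cur_px in zip(base_row, cur_row):
            # Assumed fusion rule: scale each source, then add per class.
            fused = [base_scale * b + current_scale * c
                     for b, c in zip(base_px, cur_px)]
            p = softmax(fused)
            prob_row.append(p)
            # The label is the class with the highest fused probability.
            mask_row.append(max(range(len(p)), key=p.__getitem__))
        probs.append(prob_row)
        mask.append(mask_row)
    return probs, mask


# Toy 1x2 image with two classes (0 = background, 1 = object).
base = [[[2.0, 0.0], [0.0, 1.0]]]
current = [[[1.0, 0.0], [0.0, 3.0]]]
probs, mask = fuse_and_segment(base, current)
print(mask)
```

In this toy input the first pixel has stronger background evidence in both maps and the second has stronger object evidence, so the fused mask labels them as background and object respectively. A real system would operate on tensors produced by the attention-based encoder rather than on hand-written lists.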


Updated: 2020-10-02
