A General 3D Space-Time-Frequency Non-Stationary THz Channel Model for 6G Ultra-Massive MIMO Wireless Communication Systems
IEEE Journal on Selected Areas in Communications (IF 13.8) Pub Date: 2021-04-15, DOI: 10.1109/jsac.2021.3071850
Jun Wang, Cheng-Xiang Wang, Jie Huang, Haiming Wang, Xiqi Gao

Weakly-supervised video object segmentation is an emerging video task in which the target must be tracked and segmented given only a simple bounding-box label, so the method has to capture and use the target information fully. Most existing approaches rely on the guidance of a single frame and ignore the interaction between different frames when gathering information, which makes it hard for them to obtain a reliable target representation. In this paper, we propose to capture temporal dependencies and gather information from multiple frames through bilateral temporal re-aggregation. We explore three schemes to build the aggregation: 1) a two-stage re-aggregation mechanism provides a target prior for the current frame, which yields more valid feature matching and information aggregation; 2) a query-memory bilateral aggregation module aggregates features from an unlimited number of past frames and enables mutual perception between different frames to validate the gathered information; 3) a novel cross-task representation distillation guides the learning of the aggregation modules, transferring knowledge from a semi-supervised model to our weakly-supervised model without increasing inference latency. Together, these schemes build an efficient and competent aggregation process, so the video context can be fully exploited during inference. Experimental results on four benchmarks show that our method outperforms previous methods while remaining efficient (e.g., overall scores of 70.4% and 72.5% on the YouTube-VOS and DAVIS 2017 validation sets, respectively).
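
The abstract does not specify how the query-memory aggregation is computed. As a rough illustration only, the sketch below assumes a generic attention-style memory read: the current frame's query feature is compared with keys stored from past frames, and the stored values are averaged with softmax weights. The function name `aggregate_from_memory`, the scaled dot-product similarity, and the toy vectors are all assumptions made for this example, not the authors' module, and the sketch covers only one direction (query reading from memory), not the bilateral interaction the abstract describes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_from_memory(query, memory_keys, memory_values):
    """Hypothetical memory read: softmax-weighted average of memory values.

    Weights come from the scaled dot-product similarity between the query
    (current frame) and each memory key (one per past frame). This is an
    assumed stand-in, not the module proposed in the paper.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in memory_keys])
    dim = len(memory_values[0])
    return [
        sum(w * value[i] for w, value in zip(weights, memory_values))
        for i in range(dim)
    ]


# Toy data: three past frames with 2-D keys and 1-D values.
query = [1.0, 0.0]
memory_keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
memory_values = [[10.0], [20.0], [30.0]]
readout = aggregate_from_memory(query, memory_keys, memory_values)
```

Because the memory is a plain list of key/value pairs, adding another past frame only appends one more entry; the readout keeps the same dimensionality regardless of how many frames are stored.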

Updated: 2021-04-15