Segmenting the Future
IEEE Robotics and Automation Letters (IF 5.2) Pub Date: 2020-07-01, DOI: 10.1109/lra.2020.2992184
Hsu-kuang Chiu , Ehsan Adeli , Juan Carlos Niebles

Predicting the future is an important aspect of decision-making in robotics and autonomous driving systems, which rely heavily on visual scene understanding. While prior work attempts to predict future video pixels, anticipate activities, or forecast future scene semantic segments from segmentations of the preceding frames, no existing method predicts future semantic segmentation solely from past RGB frames in a single end-to-end trainable model. In this letter, we propose a temporal encoder-decoder network architecture that encodes RGB frames from the past and decodes the future semantic segmentation. The network is coupled with a new knowledge distillation training framework specific to the forecasting task. Our method, seeing only preceding video frames, implicitly models the scene segments while simultaneously accounting for object dynamics to infer the future scene semantic segments. Our results on Cityscapes and Apolloscape outperform the baseline and current state-of-the-art methods. Code will be available soon.
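The knowledge distillation idea can be illustrated with a per-pixel soft-target loss: a teacher segmentation network (which, during training, sees the actual future frame) produces soft class distributions that supervise the forecasting student (which sees only past frames). The sketch below is an illustrative assumption, not the paper's exact formulation; the function names and the temperature `T` are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Per-pixel cross-entropy between the teacher's softened class
    distribution and the student's prediction, averaged over pixels.
    Both inputs have shape (H, W, C); T is a softening temperature.
    This is a generic distillation sketch, not the paper's exact loss."""
    p_teacher = softmax(teacher_logits / T)
    log_p_student = np.log(softmax(student_logits / T) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())
```

By Gibbs' inequality the loss is minimized when the student matches the teacher's distribution, so training pushes the forecasting network toward the segmentation the teacher would produce on the (unseen) future frame.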

Updated: 2020-07-01