Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

arXiv - CS - Robotics. Pub Date: 2020-05-18, DOI: arxiv-2005.08514
Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi

Understanding crowd motion dynamics is critical to real-world applications such as surveillance systems and autonomous driving. The task is challenging because it requires effectively modeling both socially aware spatial interaction within a crowd and complex temporal dependencies. We believe attention is the most important factor for trajectory prediction. In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction using only attention mechanisms. STAR models intra-graph crowd interaction with TGConv, a novel Transformer-based graph convolution mechanism. Inter-graph temporal dependencies are modeled by separate temporal Transformers. STAR captures complex spatio-temporal interactions by interleaving spatial and temporal Transformers. To calibrate the temporal prediction for the long-lasting effect of disappeared pedestrians, we introduce a read-writable external memory module that is continually updated by the temporal Transformer. We show that, with only attention mechanisms, STAR achieves state-of-the-art performance on 5 commonly used real-world pedestrian prediction datasets.
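
To make the idea of interleaved spatial and temporal attention concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the identity query/key/value projections, the boolean `adjacency` mask, the causal temporal mask, and the spatial-then-temporal ordering are all assumptions made for illustration, and the external memory module is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(feats, mask):
    """Scaled dot-product self-attention with identity projections (assumption).

    feats: list of n feature vectors (lists of floats).
    mask[i][j]: True if item i may attend to item j. Each row must allow
    at least one item (typically the item itself).
    """
    d = len(feats[0])
    out = []
    for i, q in enumerate(feats):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) if mask[i][j] else -math.inf
            for j, k in enumerate(feats)
        ]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, feats)) for c in range(d)])
    return out


def spatial_step(x, adjacency):
    """Attend across pedestrians within each frame.

    x[t][p] is the feature vector of pedestrian p at time t;
    adjacency[t][p][q] says whether p may attend to q in frame t.
    """
    return [masked_self_attention(frame, adjacency[t]) for t, frame in enumerate(x)]


def temporal_step(x):
    """Attend across time for each pedestrian independently (causal mask)."""
    num_frames, num_peds = len(x), len(x[0])
    causal = [[j <= i for j in range(num_frames)] for i in range(num_frames)]
    per_ped = [
        masked_self_attention([x[t][p] for t in range(num_frames)], causal)
        for p in range(num_peds)
    ]
    # Transpose back to the x[t][p] layout.
    return [[per_ped[p][t] for p in range(num_peds)] for t in range(num_frames)]


def interleaved_encoder(x, adjacency, num_layers=2):
    """Alternate spatial and temporal attention; the ordering is an assumption."""
    for _ in range(num_layers):
        x = temporal_step(spatial_step(x, adjacency))
    return x


# Toy input: 3 frames, 2 pedestrians, 2-D positions used directly as features.
x = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.1, 0.0], [1.0, 0.1]],
    [[0.2, 0.0], [1.0, 0.2]],
]
adjacency = [[[True, True], [True, True]] for _ in range(3)]
encoded = interleaved_encoder(x, adjacency)
```

The output keeps the `[frame][pedestrian][feature]` layout of the input. A trained model would add learned projections, multiple heads, feed-forward layers, and a prediction head on top of this skeleton.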

Updated: 2020-07-27
