Space-Time Tree Ensemble for Action Recognition and Localization
International Journal of Computer Vision (IF 19.5), Pub Date: 2017-02-02, DOI: 10.1007/s11263-016-0980-8
Shugao Ma, Jianming Zhang, Stan Sclaroff, Nazli Ikizler-Cinbis, Leonid Sigal

Human actions are, inherently, structured patterns of body movements. We explore ensembles of hierarchical spatio-temporal trees, discovered directly from training data, to model these structures for action recognition and spatial localization. Discovery of frequent and discriminative tree structures is challenging due to the exponential search space, particularly if one allows partial matching. We address this by first building a concise action word vocabulary via discriminative clustering of the hierarchical space-time segments, which is a two-level video representation that captures both static and non-static relevant space-time segments of the video. Using this vocabulary we then utilize tree mining with subsequent tree clustering and ranking to select a compact set of discriminative tree patterns. Our experiments show that these tree patterns, alone, or in combination with shorter patterns (action words and pairwise patterns) achieve promising performance on three challenging datasets: UCF Sports, HighFive and Hollywood3D. Moreover, we perform cross-dataset validation, using trees learned on HighFive to recognize the same actions in Hollywood3D, and using trees learned on UCF-Sports to recognize and localize the similar actions in JHMDB. The results demonstrate the potential for cross-dataset generalization of the trees our approach discovers.

Last updated: 2017-02-02