Double constrained bag of words for human action recognition
Signal Processing: Image Communication (IF 3.4), Pub Date: 2021-08-04, DOI: 10.1016/j.image.2021.116399
Chao Wu, Yaqian Li, Yaru Zhang, Bin Liu

Many improved methods based on the bag-of-words (BoW) strategy are widely used for human action recognition. However, these methods measure and exploit the spatial relationship between features in a relatively limited way, which restricts their recognition performance. To address this problem, the double constrained bag of words (DC-BoW) is proposed to exploit the spatial distribution information between features at three levels: descriptor-level, representation-level, and hidden-layer features. Because most coding methods rely only on Euclidean distance to constrain the relationship between descriptor-level features, constraints on the difference in length and on the cosine of the angle between a visual word and a local feature are designed and used to construct the loss function, yielding the length and angle constrained linear coding (LACLC) method. To improve the discriminability of the representation-level features, the spatial distribution of the encoded features around each cluster center is considered: hierarchical weighting and LACLC are jointly applied to this distribution to construct the aggregated word group feature (AWGF). In addition, the constraint form of correntropy is modified following the principle used to construct the constraints in LACLC. The hidden-layer features are combined with the new constraint forms to construct the double constrained extreme learning machine (DC-ELM), which improves the classification performance of the network while avoiding iterative training of the correntropy weights. To verify the feasibility of DC-BoW, experiments are conducted on the KTH, Olympic Sports, UCF11, Hollywood2, and UCF101 datasets. The experimental results show that the proposed DC-BoW makes further use of the spatial distribution information between features and achieves excellent recognition accuracy compared with other improved BoW-based methods.
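
The abstract does not state the LACLC objective, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a locality-constrained-linear-coding style closed-form solve in which the usual Euclidean locality penalty is swapped for a penalty built from the length difference and the cosine of the angle between each visual word and the local feature. The function name `constrained_code` and the parameters `lam`, `alpha`, `beta`, and `ridge` are hypothetical.

```python
import numpy as np


def constrained_code(x, codebook, lam=1e-2, alpha=1.0, beta=1.0, ridge=1e-8):
    """Encode one local feature against a codebook (illustrative only).

    x        : (d,) local feature.
    codebook : (k, d) visual words, one per row.
    Returns a (k,) code whose entries sum to 1.
    The penalty below is an assumption, not the published LACLC loss.
    """
    eps = 1e-12
    k = codebook.shape[0]

    x_len = np.linalg.norm(x)
    word_len = np.linalg.norm(codebook, axis=1)

    # Length constraint: how much each word's norm differs from the feature's.
    length_gap = np.abs(word_len - x_len)
    # Angle constraint: 1 - cos(angle) is 0 for aligned vectors, 2 for opposite.
    cosine = (codebook @ x) / (word_len * x_len + eps)
    penalty = alpha * length_gap + beta * (1.0 - cosine)

    # Closed-form solve of min ||x - B^T c||^2 + lam * ||penalty * c||^2
    # subject to sum(c) = 1, as in locality-constrained linear coding.
    shifted = codebook - x                      # (k, d)
    gram = shifted @ shifted.T                  # (k, k)
    system = gram + lam * np.diag(penalty ** 2) + ridge * np.eye(k)
    code = np.linalg.solve(system, np.ones(k))
    return code / code.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    words = rng.normal(size=(5, 8))
    feature = words[2].copy()                   # feature equal to word 2
    print(constrained_code(feature, words))
```

Words that are penalised heavily (very different length or direction) are pushed toward small coefficients, while the sum-to-one constraint keeps the code comparable across features. The small `ridge` term is there only to keep the linear system well conditioned when a word coincides with the feature.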




Updated: 2021-08-10