Fine-Grained Visual Computing Based on Deep Learning
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.2), Pub Date: 2021-04-27, DOI: 10.1145/3418215
Zhihan Lv 1 , Liang Qiao 1 , Amit Kumar Singh 2 , Qingjun Wang 3

As the volume of available information grows, the amount of image information people receive grows exponentially as well. To support fine-grained image categorization, recognition, and visual computing, this study combines the Visual Geometry Group Network 16 (VGG16) convolutional neural network with a visual attention mechanism to build a multi-level fine-grained image feature categorization model. The model is then simulated on the TensorFlow platform. In terms of accuracy and required training time, the multi-level feature categorization model gives the best fine-grained categorization results in the study, with an accuracy of 85.3% and a minimum training time of 108 s. In the similarity analysis, the chi-square distance between Log Gabor features shows a strong positive correlation with the degree of image distortion, and the validity of this measure is verified. Overall, the constructed model achieves higher accuracy in image recognition and categorization, shorter training time, and significantly better performance on similar features, which provides an experimental reference for future visual computing on fine-grained images.
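
The similarity measure mentioned above, a chi-square distance between feature histograms, can be illustrated with a minimal sketch. The abstract does not give the exact formula used in the paper, so the sketch assumes one common symmetric form, d(p, q) = 0.5 * sum((p_i - q_i)^2 / (p_i + q_i)), applied to histograms normalized to sum to 1. The function name `chi_square_distance` and the toy histograms are hypothetical; extracting Log Gabor features is not shown.

```python
from typing import Sequence


def chi_square_distance(p: Sequence[float], q: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Symmetric chi-square distance between two non-negative histograms.

    Assumed form (not taken from the paper):
        d(p, q) = 0.5 * sum((p_i - q_i)^2 / (p_i + q_i))
    Both histograms are normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive total mass")

    total = 0.0
    for a, b in zip(p, q):
        a, b = a / sum_p, b / sum_q
        denom = a + b
        # Skip bins that are empty in both histograms to avoid dividing by zero.
        if denom > eps:
            total += (a - b) ** 2 / denom
    return 0.5 * total


# Toy histograms standing in for feature responses (illustrative values only).
reference = [4.0, 3.0, 2.0, 1.0]
mild = [3.5, 3.0, 2.5, 1.0]
severe = [1.0, 2.0, 3.0, 4.0]

d_mild = chi_square_distance(reference, mild)
d_severe = chi_square_distance(reference, severe)
```

With this form the distance is 0 for identical normalized histograms and 1 for histograms with no overlapping bins, so a histogram that departs further from the reference yields a larger value, which is the kind of behaviour a distortion measure needs.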

Updated: 2021-04-27