Scene Recognition by Joint Learning of DNN from Bag of Visual Words and Convolutional DCT Features
Applied Artificial Intelligence (IF 2.9) | Pub Date: 2021-05-25 | DOI: 10.1080/08839514.2021.1881296
Abdul Rehman, Summra Saleem, Usman Ghani Khan, Saira Jabeen, M. Omair Shafiq
ABSTRACT

Scene recognition is used in many computer vision and related applications, including information retrieval, robotics, real-time monitoring, and event classification. Because scene recognition is a complex task, it has been greatly improved by deep learning architectures trained on large, comprehensive datasets. This paper presents a scene classification method in which local and global features are concatenated with the DCT-convolutional features of AlexNet, and the combined features are fed into AlexNet's fully connected layers for classification. The local and global features are made efficient by selecting an appropriate Bag of Visual Words (BOVW) vocabulary size and by applying feature selection techniques, both of which are evaluated in the experimentation section. We use AlexNet modified with additional dense fully connected layers and compare its results with a model previously trained on the Places365 dataset. Our model is also compared with other scene recognition methods and clearly outperforms them in terms of accuracy.
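
The following is a minimal sketch, not the authors' released code, of the fusion idea described in the abstract: BOVW histograms are concatenated with DCT-compressed AlexNet convolutional features and classified by additional fully connected layers. The vocabulary size, the number of DCT coefficients kept, the layer widths, and the class count are illustrative assumptions, not values reported in the paper.

# Sketch of BOVW + DCT-convolutional feature fusion (assumed dimensions).
import numpy as np
import torch
import torch.nn as nn
from scipy.fft import dctn
from torchvision import models

def dct_conv_features(image_batch, n_coeffs=512):
    """Run AlexNet's convolutional trunk and keep the first n_coeffs
    2-D DCT coefficients of the pooled feature maps (assumed compression)."""
    trunk = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).features.eval()
    with torch.no_grad():
        fmaps = trunk(image_batch).numpy()              # shape (B, 256, 6, 6)
    coeffs = dctn(fmaps, axes=(-2, -1), norm="ortho")   # per-channel 2-D DCT
    flat = coeffs.reshape(coeffs.shape[0], -1)
    return torch.from_numpy(flat[:, :n_coeffs]).float()

class FusionClassifier(nn.Module):
    """Fully connected head over the concatenated BOVW + DCT feature vector."""
    def __init__(self, bovw_dim=1000, dct_dim=512, num_classes=67):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(bovw_dim + dct_dim, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, num_classes),
        )

    def forward(self, bovw_hist, dct_feats):
        # Concatenate the two feature streams before classification.
        return self.fc(torch.cat([bovw_hist, dct_feats], dim=1))

# Usage with placeholder inputs (BOVW histograms would come from a codebook
# built over local/global descriptors, which is omitted here):
images = torch.randn(4, 3, 224, 224)
bovw = torch.rand(4, 1000)
logits = FusionClassifier()(bovw, dct_conv_features(images))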




Updated: 2021-06-19