A very high-resolution scene classification model using transfer deep CNNs based on saliency features
Signal, Image and Video Processing ( IF 2.0 ) Pub Date : 2020-10-17 , DOI: 10.1007/s11760-020-01801-5
Osama A. Shawky , Ahmed Hagag , El-Sayed A. El-Dahshan , Manal A. Ismail

Advances in remote sensing technology enable the production of very high-resolution (VHR) images, and classification of VHR imagery scenes has become a challenging problem. In this paper, we propose a model for VHR scene classification. First, convolutional neural networks (CNNs) with pre-trained weights are used as deep feature extractors to obtain global and local CNN features from the original VHR images. Second, a spectral residual-based saliency detection algorithm is used to extract a saliency map. Saliency features are then extracted from the saliency map with CNNs, yielding robust features for VHR imagery, especially for images containing salient objects. Third, we use feature fusion rather than the raw deep features to represent the final shape of the VHR image scenes: discriminant correlation analysis (DCA) fuses the global and local CNN features with the saliency features, and is a more suitable and cost-effective fusion method than traditional fusion techniques. Finally, we propose an enhanced multilayer perceptron to classify the image. Experiments are performed on four widely used datasets: UC-Merced, WHU-RS, Aerial Image, and NWPU-RESISC45. Results confirm that the proposed model outperforms state-of-the-art scene classification models.
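The spectral residual saliency step follows the well-known algorithm of Hou and Zhang (2007): take the log-amplitude spectrum of the image, subtract its local average to leave only the "residual" (unexpected) spectral content, and transform back to the spatial domain. A minimal numpy sketch, assuming a single-channel float image and a 3x3 averaging filter (the filter size and the `log1p` amplitude, used to avoid log-of-zero, are implementation choices, not taken from the paper):

```python
import numpy as np

def spectral_residual_saliency(gray, avg_k=3):
    """Spectral residual saliency (Hou & Zhang, 2007): the residual of the
    log-amplitude spectrum, mapped back to the spatial domain and squared."""
    f = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log1p(np.abs(f))        # log1p avoids log(0) at zero bins
    phase = np.angle(f)
    # Local average of the log-amplitude spectrum via an avg_k x avg_k box filter.
    pad = avg_k // 2
    padded = np.pad(log_amp, pad, mode="edge")
    avg = np.zeros_like(log_amp)
    for dy in range(avg_k):
        for dx in range(avg_k):
            avg += padded[dy:dy + log_amp.shape[0], dx:dx + log_amp.shape[1]]
    avg /= avg_k * avg_k
    residual = log_amp - avg             # the "spectral residual"
    # Recombine residual amplitude with the original phase and invert.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal / sal.max()               # normalise to [0, 1]
```

In the paper's pipeline, the resulting saliency map is not used directly as a feature; it is fed to CNNs to extract saliency features that complement the global and local CNN features of the raw image.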


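The final classification stage is an "enhanced" multilayer perceptron, whose exact enhancements are not described in this abstract. As a stand-in, the sketch below is a plain one-hidden-layer MLP (ReLU hidden layer, softmax output, full-batch gradient descent) in pure numpy, operating on fused feature vectors; the architecture and hyperparameters here are illustrative assumptions, not the authors' design:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(X, y, n_classes, hidden=32, lr=0.1, epochs=200):
    """Minimal one-hidden-layer MLP trained with cross-entropy loss.
    X: (n, d) fused feature matrix, y: (n,) integer class labels."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, n_classes)); b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                       # one-hot labels
    for _ in range(epochs):
        h = np.maximum(0, X @ W1 + b1)             # ReLU hidden layer
        z = h @ W2 + b2
        p = np.exp(z - z.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)          # softmax probabilities
        g = (p - Y) / n                            # dLoss/dz for cross-entropy
        gh = (g @ W2.T) * (h > 0)                  # backprop through ReLU
        W2 -= lr * (h.T @ g); b2 -= lr * g.sum(0)
        W1 -= lr * (X.T @ gh); b1 -= lr * gh.sum(0)
    return W1, b1, W2, b2

def predict(X, params):
    W1, b1, W2, b2 = params
    return (np.maximum(0, X @ W1 + b1) @ W2 + b2).argmax(axis=1)
```

In the proposed model the input to this classifier would be the DCA-fused vector combining global/local CNN features with the saliency features, rather than raw features.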
