Rotation-aware representation learning for remote sensing image retrieval
Information Sciences (IF 8.1), Pub Date: 2021-04-30, DOI: 10.1016/j.ins.2021.04.078
Zhi-Ze Wu, Chang Zou, Yan Wang, Ming Tan, Thomas Weise

The rising number and size of remote sensing (RS) image archives make content-based RS image retrieval (CBRSIR) more important. Convolutional neural networks (CNNs) offer good CBRSIR performance, but the features they extract are not rotation-invariant. This is problematic because objects in RS images appear at arbitrary rotation angles. We develop and investigate two new rotation-aware CNN-based CBRSIR methods: 1) In the Feature Map Transformation Based Rotation-Aware Network (FMT-RAN), the output of the last pooling layer is rotated by four different angles during training. The rotated outputs are passed through the same fully connected, coding, and classification layers, and the resulting losses are added. 2) The Spatial Transformer-based Rotation-Aware Network (ST-RAN) contains a spatial transformer network (STN) and a rotation-aware network (RAN). For training, the original and a randomly rotated version of an image are fed into the ST-RAN. The STN generates a transformed version of the original to match the rotated image. The RAN extracts the features of all three images. We apply two-stage training, which first optimizes the STN and then the RAN. Both of our methods perform well in terms of retrieval accuracy and time, but ST-RAN has the best overall performance and outperforms the state-of-the-art CBRSIR methods.
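The FMT-RAN training objective described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state which four angles are used, so multiples of 90 degrees are assumed here. The layer sizes, activation functions, and the names fmt_ran_loss, init_params, and softmax_cross_entropy are likewise hypothetical. The sketch only shows the structure: one feature map, four rotated copies, one shared set of weights, and a summed loss.

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    """Cross-entropy loss of one sample for an integer class label."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]


def init_params(rng, in_dim, fc_dim=32, code_dim=16, n_classes=5):
    """Random weights for the shared layers (all sizes are illustrative)."""
    return {
        "w_fc": rng.standard_normal((fc_dim, in_dim)) * 0.1,
        "b_fc": np.zeros(fc_dim),
        "w_code": rng.standard_normal((code_dim, fc_dim)) * 0.1,
        "b_code": np.zeros(code_dim),
        "w_cls": rng.standard_normal((n_classes, code_dim)) * 0.1,
        "b_cls": np.zeros(n_classes),
    }


def fmt_ran_loss(feature_map, label, params):
    """Sum the classification losses over four rotated copies of a feature map.

    feature_map has shape (channels, height, width) and stands in for the
    output of the last pooling layer of a CNN.
    """
    total = 0.0
    for k in range(4):  # ASSUMPTION: angles are 0, 90, 180, and 270 degrees
        rotated = np.rot90(feature_map, k=k, axes=(1, 2))
        x = rotated.reshape(-1)
        # The same weights are used for every rotated copy.
        hidden = np.maximum(0.0, params["w_fc"] @ x + params["b_fc"])
        code = np.tanh(params["w_code"] @ hidden + params["b_code"])
        logits = params["w_cls"] @ code + params["b_cls"]
        total += softmax_cross_entropy(logits, label)
    return total


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 4))
params = init_params(rng, in_dim=feature_map.size)
loss = fmt_ran_loss(feature_map, label=2, params=params)
```

Because every rotated copy passes through the same weights, minimizing the summed loss pushes the shared layers toward codes that classify correctly regardless of which rotation they receive. In a retrieval setting, the output of the coding layer would presumably serve as the image descriptor, though the abstract does not spell this out.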




Updated: 2021-06-05