Hyper-ES2T: Efficient Spatial–Spectral Transformer for the classification of hyperspectral remote sensing images
International Journal of Applied Earth Observation and Geoinformation (IF 7.5), Pub Date: 2022-09-10, DOI: 10.1016/j.jag.2022.103005
Wenxuan Wang, Leiming Liu, Tianxiang Zhang, Jiachen Shen, Jing Wang, Jiangyun Li

In recent years, convolutional neural networks have dominated downstream tasks on hyperspectral remote sensing images thanks to their strong local feature extraction capability. However, convolution operations cannot effectively capture long-range dependencies, and repeatedly stacking convolutional layers to build a hierarchical structure only alleviates this problem rather than solving it. Meanwhile, the Transformer architecture addresses exactly this issue and provides an opportunity to capture long-distance dependencies between tokens. Although Transformers have recently been introduced into the HSI classification field, most related works exploit only a single kind of spatial or spectral information and neglect the optimal fusion of these two different-level features. Therefore, to fully exploit the abundant spatial information and spectral correlations in HSIs in a highly effective and efficient way, we present an initial attempt to explore the Transformer architecture in a dual-branch manner and propose a novel bilateral classification network named Hyper-ES2T. In addition, an Aggregated Feature Enhancement Module is proposed for effective feature aggregation and further spatial–spectral feature enhancement. Furthermore, to tackle the high computational cost of the vanilla self-attention block in Transformers, we design an Efficient Multi-Head Self-Attention block that pursues a trade-off between model accuracy and efficiency. The proposed Hyper-ES2T reaches new state-of-the-art performance and outperforms previous methods by a significant margin on four benchmark HSI classification datasets, demonstrating its powerful generalization ability and superior feature representation capability. We anticipate that this work offers novel insight into designing Transformer-based network architectures with superior performance and high model efficiency, and may inspire further research in this direction within the HSI processing field. The source code will be available at https://github.com/Wenxuan-1119/Hyper-ES2T.
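The abstract names an Efficient Multi-Head Self-Attention block but does not describe its mechanism. As a rough, non-authoritative illustration of how such accuracy/efficiency trade-offs are commonly realized in vision Transformers, the sketch below downsamples keys and values with a strided convolution before computing attention (a spatial-reduction scheme in the spirit of PVT). The class name, parameters, and the reduction strategy itself are assumptions for illustration only, not the paper's actual design.

```python
import torch
import torch.nn as nn


class EfficientMHSA(nn.Module):
    """Illustrative multi-head self-attention with spatial reduction.

    NOTE: a generic sketch of one common way to cut the quadratic cost of
    vanilla self-attention (shrinking the key/value token grid before
    attention); Hyper-ES2T's actual EMHSA block is defined in the paper,
    not in this abstract.
    """

    def __init__(self, dim, num_heads=4, reduction=2):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5

        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Strided convolution shrinks the token grid before K/V projection,
        # reducing attention complexity from O(N^2) to roughly O(N^2 / r^2).
        self.sr = nn.Conv2d(dim, dim, kernel_size=reduction, stride=reduction)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, h, w):
        # x: (B, N, C) tokens flattened from an h x w spatial patch grid.
        b, n, c = x.shape
        q = self.q(x).reshape(b, n, self.num_heads, self.head_dim).transpose(1, 2)

        # Spatially reduce the tokens, then project keys and values.
        x_ = x.transpose(1, 2).reshape(b, c, h, w)
        x_ = self.sr(x_).reshape(b, c, -1).transpose(1, 2)
        x_ = self.norm(x_)
        kv = self.kv(x_).reshape(b, -1, 2, self.num_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)

        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)


# Example: 64 tokens from an 8x8 spatial window with 32 channels.
tokens = torch.randn(2, 64, 32)
block = EfficientMHSA(dim=32, num_heads=4, reduction=2)
out = block(tokens, h=8, w=8)  # -> (2, 64, 32)
```

In such schemes, queries keep the full spatial resolution while keys and values attend over a coarser grid, which is one plausible way to balance accuracy against computational cost in a dual-branch spatial–spectral network.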




Updated: 2022-09-10