Multimodal Semantic Consistency-Based Fusion Architecture Search for Land Cover Classification
IEEE Transactions on Geoscience and Remote Sensing (IF 8.2), Pub Date: 2022-07-21, DOI: 10.1109/tgrs.2022.3193273
Xiao Li, Lin Lei, Caiguang Zhang, Gangyao Kuang

Multimodal land cover classification (MLCC) using optical and synthetic aperture radar (SAR) modalities has achieved outstanding performance compared with using unimodal data alone, owing to their complementary information on land properties. Previous multimodal deep learning (MDL) methods have relied on handcrafted multibranch convolutional neural networks (CNNs) to extract the features of different modalities and merge them for land cover classification. However, handcrafted CNN models designed for natural images may not be the optimal choice for remote sensing (RS) image interpretation, owing to the large differences in imaging geometry and imaging mechanisms. Furthermore, few MDL methods have analyzed optimal combinations of hierarchical features from different modalities. In this article, we propose an efficient multimodal architecture search framework, namely, multimodal semantic consistency-based fusion architecture search ($\text{M}^{2}$SC-FAS), which operates in a continuous search space with gradient-based optimization. It not only discovers optimal optical- and SAR-specific architectures according to the respective characteristics of the optical and SAR images, but also searches for the optimal multimodal dense fusion architecture. Specifically, a semantic consistency constraint is introduced to guarantee dense fusion between hierarchical optical and SAR features with high semantic consistency, thereby capturing their complementary information on land properties. Finally, a curriculum learning strategy is adopted for $\text{M}^{2}$SC-FAS. Extensive experiments on three broad coregistered optical and SAR datasets demonstrate the superior performance of our method.
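
The abstract names two mechanisms: a continuous, differentiable search space optimized by gradient descent, and a semantic consistency constraint that governs which hierarchical optical and SAR features are densely fused. Below is a minimal, illustrative PyTorch sketch of those two ideas, assuming a DARTS-style softmax relaxation over a toy operation set and a cosine-similarity consistency score; the operation set, pooling choice, and loss weighting are assumptions for illustration and are not taken from the paper.

```python
# Minimal sketch (not the authors' code) of gradient-based architecture search
# plus a semantic consistency score between optical and SAR features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Continuous relaxation: the layer output is a softmax-weighted sum of
    candidate operations, so architecture weights `alpha` can be learned by
    gradient descent alongside the network weights."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),              # plain 3x3 conv
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),  # dilated 3x3 conv
            nn.Identity(),                                            # skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))         # architecture parameters

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

def semantic_consistency(opt_feat, sar_feat):
    """Illustrative consistency score between an optical and a SAR feature map:
    cosine similarity of their globally pooled descriptors (1 = highly consistent)."""
    opt_vec = F.adaptive_avg_pool2d(opt_feat, 1).flatten(1)
    sar_vec = F.adaptive_avg_pool2d(sar_feat, 1).flatten(1)
    return F.cosine_similarity(opt_vec, sar_vec, dim=1).mean()

if __name__ == "__main__":
    opt_branch = MixedOp(channels=16)   # optical-specific searchable layer
    sar_branch = MixedOp(channels=16)   # SAR-specific searchable layer
    opt_in = torch.randn(2, 16, 64, 64)
    sar_in = torch.randn(2, 16, 64, 64)
    opt_feat, sar_feat = opt_branch(opt_in), sar_branch(sar_in)
    # Penalize low consistency between the two feature maps before dense fusion.
    loss = 1.0 - semantic_consistency(opt_feat, sar_feat)
    loss.backward()                     # gradients reach both conv weights and alpha
    print(float(loss))
```

In a full search, the architecture parameters `alpha` would typically be optimized on a validation split in alternation with the network weights, and consistency scores of this kind would steer which optical/SAR feature pairs the dense fusion architecture connects.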

Updated: 2022-07-21