Transformer-based multimodal change detection with multitask consistency constraints
Information Fusion (IF 18.6), Pub Date: 2024-03-24, DOI: 10.1016/j.inffus.2024.102358
Biyuan Liu, Huaixin Chen, Kun Li, Michael Ying Yang

Change detection plays a fundamental role in Earth observation for analyzing changes over time. However, recent studies have largely neglected multimodal data, which offers significant practical and technical advantages over single-modal approaches. This research focuses on leveraging pre-event digital surface model (DSM) data and post-event digital aerial images, captured at different times, to detect change beyond 2D. We observe that current change detection methods struggle with multitask conflicts between the semantic and height change detection tasks. To address this challenge, we propose an efficient Transformer-based network that learns a shared representation between cross-dimensional inputs through cross-attention. It adopts a consistency constraint to establish the multimodal relationship. First, pseudo-changes are derived by thresholding the height change. Then, the distance between the semantic changes and the pseudo-changes within their overlapping regions is minimized. This explicitly endows height change detection (a regression task) and semantic change detection (a classification task) with representation consistency. A DSM-to-image multimodal dataset covering three cities in the Netherlands was constructed. It lays a new foundation for beyond-2D change detection from cross-dimensional inputs. Compared to five state-of-the-art change detection methods, our model demonstrates consistent multitask superiority in both semantic and height change detection. Furthermore, the consistency strategy can be seamlessly adapted to other methods, yielding promising improvements.
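
The consistency constraint described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The threshold value, the 0.5 cut-off on semantic probabilities, the reading of "overlapping regions" as pixels flagged as changed by both maps, and the squared-error distance are all assumptions made for illustration, as are the function names `pseudo_change_mask` and `consistency_loss`.

```python
import numpy as np


def pseudo_change_mask(height_change, threshold=1.0):
    """Derive a binary pseudo-change map by thresholding the height change.

    Assumption: a pixel counts as changed when the absolute height
    difference exceeds `threshold` (units and value are hypothetical).
    """
    return (np.abs(height_change) > threshold).astype(np.float64)


def consistency_loss(semantic_prob, height_change, threshold=1.0):
    """Mean squared distance between semantic change probabilities and
    pseudo-changes, restricted to the region where both maps flag a change.

    Assumptions: `semantic_prob` holds per-pixel change probabilities in
    [0, 1]; the overlap is the set of pixels where the probability is
    above 0.5 and the pseudo-change mask equals 1.
    """
    pseudo = pseudo_change_mask(height_change, threshold)
    overlap = (semantic_prob > 0.5) & (pseudo > 0)
    if not overlap.any():
        # No shared changed pixels, so the constraint contributes nothing.
        return 0.0
    diff = semantic_prob[overlap] - pseudo[overlap]
    return float(np.mean(diff ** 2))


# Toy 2x2 example with hypothetical values.
semantic_prob = np.array([[0.9, 0.2], [0.6, 0.8]])
height_change = np.array([[3.0, 0.1], [-2.0, 0.2]])
loss = consistency_loss(semantic_prob, height_change, threshold=1.0)
```

In a training setup, a term of this kind would typically be added to the classification and regression losses with a weighting factor, and the hard threshold would need a differentiable counterpart if gradients are meant to flow into the height branch. The abstract does not specify either detail.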

Updated: 2024-03-24