Appearance based deep domain adaptation for the classification of aerial images
ISPRS Journal of Photogrammetry and Remote Sensing (IF 12.7). Pub Date: 2021-08-19. DOI: 10.1016/j.isprsjprs.2021.08.004. D. Wittich, F. Rottensteiner
This paper addresses appearance-based domain adaptation for the pixel-wise classification of remotely sensed data using deep neural networks (DNNs), as a strategy to reduce the amount of labelled training data a DNN requires. We focus on the setting in which labelled data are available only in a source domain but not in a target domain, known in Computer Vision as unsupervised domain adaptation. Our method is based on adversarial training of an appearance adaptation network (AAN) that transforms images from the source domain such that they look like images from the target domain. Together with the original label maps from the source domain, the transformed images are used to adapt a DNN to the target domain. The AAN has to change the appearance of objects of a certain class such that they resemble objects of the same class in the target domain. Many approaches try to achieve this goal by incorporating cycle consistency in the adaptation process, but such approaches tend to hallucinate structures that occur frequently in one of the domains. In contrast, we propose a joint training strategy for the AAN and the classifier, which constrains the AAN to transform the images such that they are correctly classified. To further improve the adaptation performance, we propose a new regularization loss for the discriminator network used in adversarial training. We also address the problem of finding the optimal values of the trained network parameters, proposing a new unsupervised entropy-based parameter selection criterion that compensates for the fact that there is no validation set in the target domain that could be monitored. As a minor contribution, we present a new weighting strategy for the cross-entropy loss, addressing the problem of imbalanced class distributions. Our method is evaluated in 42 adaptation scenarios using datasets from 7 cities, all consisting of high-resolution digital orthophotos and height data. It achieves a positive transfer in all cases, and on average it improves the overall accuracy in the target domain.
In adaptation scenarios between the Vaihingen and Potsdam datasets from the ISPRS semantic labelling benchmark, our method outperforms those from recent publications with respect to the mean intersection over union.
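The abstract only names the entropy-based parameter selection criterion; the general idea is that, lacking target-domain labels, one can score each saved checkpoint by how confident its predictions are on unlabelled target images and keep the most confident one. The sketch below is an illustrative assumption of such a criterion, not the paper's implementation: it computes the mean per-pixel entropy of softmax outputs and selects the checkpoint with the lowest value (all function names and array shapes are hypothetical).

```python
import numpy as np

def mean_prediction_entropy(prob_maps):
    """Average per-pixel entropy of softmax outputs.

    prob_maps: array of shape (N, H, W, C) with class probabilities
    summing to 1 over the last axis. Lower mean entropy indicates
    more confident predictions on the (unlabelled) target domain.
    """
    eps = 1e-12  # avoid log(0)
    ent = -np.sum(prob_maps * np.log(prob_maps + eps), axis=-1)  # (N, H, W)
    return float(ent.mean())

def select_checkpoint(checkpoint_probs):
    """Pick the checkpoint whose target-domain predictions have the
    lowest mean entropy; no target-domain labels are required."""
    scores = {name: mean_prediction_entropy(p)
              for name, p in checkpoint_probs.items()}
    return min(scores, key=scores.get), scores
```

Under this criterion, a checkpoint producing near-one-hot predictions on target images would be preferred over one producing near-uniform predictions.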
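The weighting strategy for the cross-entropy loss is likewise only named in the abstract. As a hedged illustration of the kind of scheme it refers to, a common variant weights each class inversely to its pixel frequency in the source label maps, so that rare classes contribute more to the loss; the exact scheme in the paper may differ, and all names below are illustrative.

```python
import numpy as np

def inverse_frequency_weights(label_map, num_classes, smooth=1.0):
    """Per-class weights inversely proportional to class frequency,
    normalized to mean 1. `smooth` avoids division by zero for
    classes absent from the label map. (Illustrative variant only.)"""
    counts = np.bincount(label_map.ravel(), minlength=num_classes).astype(float)
    weights = 1.0 / (counts + smooth)
    return weights * num_classes / weights.sum()

def weighted_cross_entropy(prob_maps, labels, weights):
    """Pixel-wise cross-entropy with per-class weights.

    prob_maps: (H, W, C) softmax probabilities; labels: (H, W) int class ids.
    """
    eps = 1e-12  # avoid log(0)
    picked = np.take_along_axis(prob_maps, labels[..., None], axis=-1)[..., 0]
    return float(np.mean(weights[labels] * -np.log(picked + eps)))
```

With such weights, misclassified pixels of an under-represented class (e.g. cars in aerial imagery) are penalized more heavily than pixels of a dominant class.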