SatViT: Pretraining Transformers for Earth Observation
IEEE Geoscience and Remote Sensing Letters (IF 4.8), Pub Date: 2022-08-24, DOI: 10.1109/lgrs.2022.3201489
Anthony Fuller, Koreen Millard, James R. Green

Despite the enormous success of the "pretraining and fine-tuning" paradigm, widespread across machine learning, it has yet to pervade remote sensing (RS). To help rectify this, we pretrain a vision transformer (ViT) on 1.3 million satellite-derived RS images. We pretrain SatViT using a state-of-the-art (SOTA) self-supervised learning (SSL) algorithm called masked autoencoding (MAE), which learns general representations by reconstructing held-out image patches. Crucially, this approach does not require annotated data, allowing us to pretrain on unlabeled images acquired from Sentinel-1 and 2. After fine-tuning, SatViT outperforms SOTA ImageNet and RS-specific pretrained models on both of our downstream tasks. We further improve the overall accuracy (OA) by 3.2% and 0.21% on the two tasks by continuing to pretrain SatViT, still using MAE, on the unlabeled target datasets. Most importantly, we release our code, pretrained model weights, and tutorials aimed at helping researchers fine-tune our models ( https://github.com/antofuller/SatViT ).
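The core MAE idea described in the abstract, hiding a large fraction of image patches and training the model to reconstruct them, can be sketched in a few lines. This is an illustrative NumPy sketch, not the authors' implementation: the patch size, 75% mask ratio, and the 2-channel "SAR-like" input are assumptions for the example, and the decoder is replaced by a zero placeholder.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into non-overlapping flattened patches."""
    H, W, C = image.shape
    p = patch_size
    patches = image.reshape(H // p, p, W // p, p, C)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * C)
    return patches

def random_mask(num_patches, mask_ratio, rng):
    """Pick patch indices to hide (reconstruction targets) and to keep."""
    num_masked = int(num_patches * mask_ratio)
    perm = rng.permutation(num_patches)
    return perm[:num_masked], perm[num_masked:]

# Example: a 2-channel 64x64 image, 8x8 patches, 75% of patches masked.
rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64, 2))
patches = patchify(image, 8)                  # (64, 128)
masked_idx, visible_idx = random_mask(len(patches), 0.75, rng)

# The encoder sees only the visible patches; the decoder is trained to
# reconstruct the masked ones, typically with a pixel-space MSE loss.
target = patches[masked_idx]
prediction = np.zeros_like(target)            # stand-in for decoder output
mse = ((prediction - target) ** 2).mean()
```

Because the encoder only processes the visible 25% of patches, pretraining is cheap relative to supervised training on full images, which is what makes the approach attractive for large unlabeled Sentinel-1/2 archives.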

Updated: 2022-08-24