GFM: Building Geospatial Foundation Models via Continual Pretraining

Matias Mendieta; Boran Han; Xingjian Shi; Yi Zhu; Chen Chen

Geospatial technologies are becoming increasingly essential in our world for a wide range of applications, including agriculture, urban planning, and disaster response. To help improve the applicability and performance of deep learning models on these geospatial tasks, various works have begun investigating foundation models for this domain. Researchers have explored two prominent approaches for introducing such models in geospatial applications, but both have drawbacks in terms of limited performance benefit or prohibitive training cost. Therefore, in this work, we propose a novel paradigm for building highly effective geospatial foundation models with minimal resource cost and carbon impact. We first construct a compact yet diverse dataset from multiple sources to promote feature diversity, which we term GeoPile. Then, we investigate the potential of continual pretraining from large-scale ImageNet-22k models and propose a multi-objective continual pretraining paradigm, which leverages the strong representations of ImageNet while simultaneously providing the freedom to learn valuable in-domain features. Our approach outperforms previous state-of-the-art geospatial pretraining methods in an extensive evaluation on seven downstream datasets covering various tasks such as change detection, classification, multi-label classification, semantic segmentation, and super-resolution.
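The abstract does not spell out which objectives the multi-objective paradigm combines. As a rough illustration only, the sketch below assumes one plausible pairing: a masked-reconstruction term that lets a student model learn in-domain features, plus a feature-distillation term that keeps the student close to a frozen ImageNet-pretrained teacher. The function names, the choice of L1 and cosine distance, and the `weight` parameter are all hypothetical and are not taken from the text above.

```python
import math


def l1_masked_reconstruction(pred, target, mask):
    """Mean absolute error over masked positions only (mask[i] is truthy)."""
    errors = [abs(p - t) for p, t, m in zip(pred, target, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def multi_objective_loss(pred, target, mask, student_feat, teacher_feat, weight=1.0):
    """Assumed combination: in-domain reconstruction + distillation from a frozen teacher.

    `weight` (hypothetical) balances how strongly the student is pulled
    toward the teacher's representation.
    """
    reconstruction = l1_masked_reconstruction(pred, target, mask)
    distillation = cosine_distance(student_feat, teacher_feat)
    return reconstruction + weight * distillation


# Toy usage: plain lists stand in for pixel values and feature vectors.
loss = multi_objective_loss(
    pred=[0.0, 1.0, 2.0],
    target=[0.0, 2.0, 4.0],
    mask=[0, 1, 1],
    student_feat=[1.0, 0.0],
    teacher_feat=[1.0, 0.0],
)
```

In a real training setup the lists would be tensors produced by the student and teacher networks, and only the student would receive gradient updates under this assumed scheme.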

updated: Thu Mar 30 2023 21:26:37 GMT+0000 (UTC)

published: Thu Feb 09 2023 07:39:02 GMT+0000 (UTC)

arXiv
