Inferring 3D change detection from bitemporal optical images

Valerio Marsocci; Virginia Coletta; Roberta Ravanelli; Simone Scardapane; Mattia Crespi

Change detection (CD) is one of the most active research areas in Remote Sensing (RS). Most recently developed change detection methods are based on deep learning (DL) algorithms. These algorithms generally focus on generating two-dimensional (2D) change maps, so they identify only planimetric changes in land use/land cover (LULC) and neither consider nor return any information on the corresponding elevation changes. Our work goes one step further, proposing two novel networks that solve the 2D and 3D CD tasks simultaneously, together with the 3DCD dataset, a novel and freely available dataset designed specifically for this multitask problem. In particular, the aim of this work is to lay the foundations for DL algorithms that can automatically infer an elevation (3D) CD map, together with a standard 2D CD map, starting only from a pair of bitemporal optical images. The two proposed architectures are a transformer-based network, the MultiTask Bitemporal Images Transformer (MTBIT), and a deep convolutional network, the Siamese ResUNet (SUNet). MTBIT is built on a semantic tokenizer. SUNet instead combines skip connections and residual layers in a siamese encoder to learn rich features capable of solving the proposed task efficiently. Both models can therefore produce 3D CD maps from two optical images taken at different times, without relying directly on elevation data at inference. Encouraging results obtained on the novel 3DCD dataset are shown. The code and the 3DCD dataset are available at https://sites.google.com/uniroma1.it/3dchangedetection/home-page.
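The data flow the abstract describes (one shared encoder applied to both acquisition dates, followed by a 2D output and a 3D output) can be illustrated with a toy sketch. This is not the authors' MTBIT or SUNet code: the function names, the hand-set `weight`, `scale` and `threshold` parameters, and the use of a plain feature difference to fuse the two branches are all assumptions made for illustration. A real model learns its parameters, and elevation change is not a simple linear function of intensity difference.

```python
from typing import List, Tuple

Image = List[List[float]]  # single-band image as rows of pixel values


def shared_encoder(img: Image, weight: float = 0.5) -> Image:
    """Toy 'encoder' (hypothetical): scales every pixel by one weight.

    In a siamese design both branches reuse the same parameters; here that
    single parameter is `weight`, hand-set rather than learned.
    """
    return [[weight * px for px in row] for row in img]


def two_head_forward(
    img_t1: Image,
    img_t2: Image,
    scale: float = 2.0,
    threshold: float = 1.0,
) -> Tuple[List[List[int]], Image]:
    """Return (2D change mask, 3D change map) for a bitemporal image pair."""
    # The same encoder (same weight) processes both dates.
    feat_t1 = shared_encoder(img_t1)
    feat_t2 = shared_encoder(img_t2)

    # Fuse the two branches by differencing their features (an assumption).
    diff = [[b - a for a, b in zip(r1, r2)] for r1, r2 in zip(feat_t1, feat_t2)]

    # 2D head: binary changed / unchanged label per pixel.
    change_2d = [[1 if abs(d) > threshold else 0 for d in row] for row in diff]

    # 3D head: signed per-pixel value standing in for elevation change.
    change_3d = [[scale * d for d in row] for row in diff]
    return change_2d, change_3d


# Usage: one pixel differs between the two dates.
before = [[0.0, 0.0], [0.0, 0.0]]
after = [[0.0, 4.0], [0.0, 0.0]]
mask, delta_h = two_head_forward(before, after)
print(mask)     # [[0, 1], [0, 0]]
print(delta_h)  # [[0.0, 4.0], [0.0, 0.0]]
```

The point of the sketch is only the structure: no elevation data enters the forward pass, and both outputs are computed from the same pair of optical images.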

updated: Mon Jan 16 2023 11:43:36 GMT+0000 (UTC)

published: Tue May 31 2022 15:53:33 GMT+0000 (UTC)

arXiv
