Incomplete Multimodal Learning for Remote Sensing Data Fusion

Yuxing Chen; Maofan Zhao; Lorenzo Bruzzone

Connecting multimodal signals through self-attention is a key factor in the success of multimodal Transformer networks in remote sensing data fusion tasks. However, traditional approaches assume access to all modalities during both training and inference, which can lead to severe performance degradation when downstream applications supply modality-incomplete inputs. To address this limitation, we introduce a novel model for incomplete multimodal learning in the context of remote sensing data fusion. The approach can be used in both supervised and self-supervised pretraining paradigms, and it leverages additional learned fusion tokens in combination with Bi-LSTM attention and masked self-attention mechanisms to collect multimodal signals. It employs reconstruction and contrastive losses to facilitate fusion during pretraining while allowing random modality combinations as inputs in network training. The approach delivers state-of-the-art performance on two multimodal datasets for building instance / semantic segmentation and land-cover mapping when dealing with incomplete inputs during inference.
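The abstract does not give implementation details, so the following is only a minimal sketch of one plausible reading of "random modality combinations" combined with masked self-attention. It is not the authors' code. The modality names, token counts, and every function name are hypothetical, and the Bi-LSTM attention and the pretraining losses are left out. The idea illustrated is that each training step samples a non-empty subset of modalities, and the attention weights of tokens belonging to missing modalities are forced to zero while the learned fusion tokens always remain visible.

```python
import math
import random

# All names and sizes below are illustrative assumptions, not values from the paper.
MODALITIES = ("optical", "sar")   # hypothetical modality names
TOKENS_PER_MODALITY = 4           # hypothetical number of patch tokens per modality
NUM_FUSION_TOKENS = 2             # hypothetical number of learned fusion tokens


def sample_available(rng):
    """Pick a random non-empty subset of modalities for one training step."""
    subset = [m for m in MODALITIES if rng.random() < 0.5]
    return subset or [rng.choice(MODALITIES)]


def key_mask(available):
    """Return True for tokens that may be attended to.

    Layout: fusion tokens first (always visible), then each modality's tokens.
    """
    mask = [True] * NUM_FUSION_TOKENS
    for m in MODALITIES:
        mask += [m in available] * TOKENS_PER_MODALITY
    return mask


def masked_softmax(scores, mask):
    """Softmax over one row of attention scores; masked keys get zero weight."""
    peak = max(s for s, keep in zip(scores, mask) if keep)  # for numerical stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
available = sample_available(rng)
mask = key_mask(available)
scores = [rng.gauss(0.0, 1.0) for _ in mask]   # stand-in for scaled query-key products
weights = masked_softmax(scores, mask)
```

Because the fusion tokens are never masked, every query always has at least one valid key, which keeps the softmax well defined even when only a single modality is present. At inference time the same mask can be built from whichever modalities are actually available.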

updated: Sat Apr 22 2023 12:16:52 GMT+0000 (UTC)

published: Sat Apr 22 2023 12:16:52 GMT+0000 (UTC)

arXiv
