Do we still need ImageNet pre-training in remote sensing scene classification?

Vladimir Risojević; Vladan Stojnić

Due to the scarcity of labeled data, using supervised models pre-trained on ImageNet is the de facto standard in remote sensing scene classification. Recently, the availability of larger high-resolution remote sensing (HRRS) image datasets and progress in self-supervised learning have raised two questions: whether supervised ImageNet pre-training is still necessary for remote sensing scene classification, and whether supervised pre-training on HRRS image datasets or self-supervised pre-training on ImageNet would achieve better results on target remote sensing scene classification tasks. To answer these questions, we both train models from scratch and fine-tune supervised and self-supervised ImageNet models on several HRRS image datasets. We also evaluate the transferability of learned representations to HRRS scene classification tasks and show that self-supervised pre-training outperforms supervised pre-training, while HRRS pre-training performs similarly to self-supervised pre-training or slightly worse. Finally, we propose combining an ImageNet pre-trained model with a second round of pre-training on in-domain HRRS images, i.e., domain-adaptive pre-training. The experimental results show that domain-adaptive pre-training yields models that achieve state-of-the-art results on HRRS scene classification benchmarks. The source code and pre-trained models are available at https://github.com/risojevicv/RSSC-transfer.
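The training strategies compared in the abstract can be read as sequences of stages that end with training on the target dataset. The following is a minimal sketch of that reading, not the authors' code: the names `Stage`, `STRATEGIES`, and `describe` are hypothetical, and the supervision mode of the two pre-training rounds in the domain-adaptive strategy is marked "unspecified" because the abstract does not state it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Stage:
    """One training stage: the data it uses and how it is supervised."""
    data: str
    supervision: str  # "supervised", "self-supervised", or "unspecified"


# The final stage of every strategy: training or fine-tuning on the
# labeled target HRRS scene classification dataset.
TARGET = Stage("target HRRS dataset", "supervised")

# Hypothetical labels for the strategies named in the abstract.
STRATEGIES: Dict[str, List[Stage]] = {
    "from_scratch": [TARGET],
    "imagenet_supervised": [Stage("ImageNet", "supervised"), TARGET],
    "imagenet_self_supervised": [Stage("ImageNet", "self-supervised"), TARGET],
    "hrrs_supervised": [Stage("HRRS", "supervised"), TARGET],
    # Domain-adaptive pre-training: an ImageNet model followed by a second
    # pre-training round on in-domain HRRS images. The abstract does not say
    # which supervision mode each round uses, so both are left unspecified.
    "domain_adaptive": [
        Stage("ImageNet", "unspecified"),
        Stage("HRRS", "unspecified"),
        TARGET,
    ],
}


def describe(strategy: str) -> str:
    """Render a strategy as a readable chain of its stages."""
    return " -> ".join(
        f"{stage.data} ({stage.supervision})" for stage in STRATEGIES[strategy]
    )


if __name__ == "__main__":
    for name in STRATEGIES:
        print(f"{name}: {describe(name)}")
```

Only the domain-adaptive strategy has two pre-training stages before the target task; every other strategy has at most one.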

updated: Wed May 25 2022 16:21:35 GMT+0000 (UTC)

published: Fri Nov 05 2021 18:30:54 GMT+0000 (UTC)

arXiv
