BuildFormer: Automatic building extraction with vision transformer

Libo Wang; Yuechi Yang; Rui Li

Building extraction from fine-resolution remote sensing images plays a vital role in numerous geospatial applications, such as urban planning, population statistics, economic assessment and disaster management. With the advancement of deep learning technology, deep convolutional neural networks (DCNNs) have dominated the automatic building extraction task for many years. However, the local property of DCNNs limits the extraction of global information, weakening the network's ability to recognize building instances. Recently, the Transformer has become a hot topic in the computer vision domain, achieving state-of-the-art performance in fundamental vision tasks such as image classification, semantic segmentation and object detection. Inspired by this, we propose BuildFormer, a novel transformer-based network for extracting buildings from fine-resolution remote sensing images. Compared with ResNet, the proposed method achieves a 2% improvement in mIoU on the WHU building dataset.
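For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (mIoU) is commonly computed for a binary building mask. It assumes the score is averaged over two classes (background = 0, building = 1); the abstract does not state the exact evaluation protocol, and the function names `iou` and `mean_iou` as well as the toy masks are illustrative only.

```python
def iou(pred, target, cls):
    """Intersection over union for one class, given two flat label lists."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # A class absent from both masks has no defined IoU.
    return inter / union if union else float("nan")


def mean_iou(pred, target, classes=(0, 1)):
    """Average the per-class IoU, skipping classes with undefined IoU."""
    scores = [iou(pred, target, c) for c in classes]
    valid = [s for s in scores if s == s]  # NaN is the only value != itself
    return sum(valid) / len(valid) if valid else float("nan")


# Toy masks (hypothetical data): 1 = building pixel, 0 = background pixel.
target = [0, 0, 1, 1, 0, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0, 1, 0]

building_iou = iou(pred, target, 1)
miou = mean_iou(pred, target)
print(f"building IoU = {building_iou:.2f}, mIoU = {miou:.2f}")
```

In this toy case each class has three correctly overlapping pixels out of a union of five, so both per-class scores and their mean are 0.6.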

updated: Mon Nov 29 2021 11:23:52 GMT+0000 (UTC)

published: Mon Nov 29 2021 11:23:52 GMT+0000 (UTC)

arXiv
