A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery

Aatif Jiwani; Shubhrakanti Ganguly; Chao Ding; Nan Zhou; David M. Chan

Urban areas consume over two-thirds of the world's energy and account for more than 70 percent of global CO2 emissions. As stated in the IPCC's Global Warming of 1.5 °C report, achieving carbon neutrality by 2050 requires a clear understanding of urban geometry. High-quality building footprint generation from satellite images can accelerate this predictive process and empower municipal decision-making at scale. However, previous deep learning-based approaches face consequential issues such as poor scale invariance and defective footprints, partly due to ever-present class-wise imbalance. Additionally, most approaches require supplemental data such as point cloud data, building height information, and multi-band imagery, which have limited availability and are tedious to produce. In this paper, we propose a modified DeepLabV3+ module with a dilated ResNet backbone to generate masks of building footprints from three-channel RGB satellite imagery only. Furthermore, we introduce an F-Beta measure in our objective function to help the model account for skewed class distributions and prevent false-positive footprints. In addition to F-Beta, we incorporate an exponentially weighted boundary loss and use a cross-dataset training strategy to further increase the quality of predictions. As a result, we achieve state-of-the-art performance across three public benchmarks and demonstrate that our RGB-only method produces higher-quality visual results and is agnostic to the scale, resolution, and urban density of satellite imagery.
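To illustrate how an F-Beta measure can act as a training objective, the following is a minimal, dependency-free sketch of a "soft" F-Beta loss over per-pixel building probabilities. It is not the authors' implementation: the function name `soft_fbeta_loss`, the smoothing term `eps`, and the choice `beta=0.5` in the usage lines are assumptions made for illustration, and the abstract does not state which beta value the paper uses.

```python
def soft_fbeta_loss(probs, targets, beta=1.0, eps=1e-7):
    """Return 1 - soft F-beta for flat sequences of pixel values.

    probs   -- predicted building probabilities in [0, 1]
    targets -- ground-truth labels, 1 for building and 0 for background
    beta    -- beta < 1 weights precision more heavily (penalizes false
               positives); beta > 1 weights recall more heavily
    eps     -- hypothetical smoothing constant to avoid division by zero
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")

    # Soft counts: probabilities replace hard 0/1 predictions so the
    # measure stays differentiable when written in an autodiff framework.
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))

    b2 = beta * beta
    fbeta = ((1 + b2) * tp + eps) / ((1 + b2) * tp + b2 * fn + fp + eps)
    return 1.0 - fbeta


# Usage: with beta = 0.5, one false-positive pixel costs more than one
# false-negative pixel (illustrative values only).
loss_false_positive = soft_fbeta_loss([1.0, 1.0, 0.0], [1, 0, 0], beta=0.5)
loss_false_negative = soft_fbeta_loss([1.0, 0.0, 0.0], [1, 1, 0], beta=0.5)
```

Because the loss is computed from aggregate true-positive, false-positive, and false-negative mass rather than per-pixel accuracy, a large background class does not dominate it, which is one reason such measures are used for skewed class distributions.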

updated: Fri Nov 19 2021 04:11:44 GMT+0000 (UTC)

published: Fri Apr 02 2021 22:32:04 GMT+0000 (UTC)

arXiv
