SFNet: Faster, Accurate, and Domain Agnostic Semantic Segmentation via Semantic Flow

Xiangtai Li; Jiangning Zhang; Yibo Yang; Guangliang Cheng; Kuiyuan Yang; Yunhai Tong; Dacheng Tao


In this paper, we explore effective methods for faster, accurate, and domain-agnostic semantic segmentation. Inspired by optical flow, which is used for motion alignment between adjacent video frames, we propose a Flow Alignment Module (FAM) that learns Semantic Flow between feature maps of adjacent levels and broadcasts high-level features to high-resolution features effectively and efficiently. Integrating FAM into a common feature pyramid structure yields superior performance over other real-time methods, even on lightweight backbone networks such as ResNet-18 and DFNet. To further speed up inference, we also present a novel Gated Dual Flow Alignment Module that directly aligns high-resolution and low-resolution feature maps; we term the resulting improved network SFNet-Lite. Extensive experiments on several challenging datasets show the effectiveness of both SFNet and SFNet-Lite. In particular, the proposed SFNet-Lite series achieves 80.1 mIoU while running at 60 FPS with a ResNet-18 backbone and 78.8 mIoU while running at 120 FPS with an STDC backbone on an RTX-3090. Moreover, we unify four challenging driving datasets (i.e., Cityscapes, Mapillary, IDD, and BDD) into one large dataset, which we name the Unified Driving Segmentation (UDS) dataset. It contains diverse domain and style information. We benchmark several representative works on UDS. Both SFNet and SFNet-Lite still achieve the best speed and accuracy trade-off on UDS, serving as strong baselines in this new, challenging setting. All the code and models are publicly available at https://github.com/lxtGH/SFSegNets.
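The core idea behind flow-based alignment, as described in the abstract, is to upsample a low-resolution feature map by sampling it at positions shifted by a per-pixel offset field, rather than on a fixed grid. The following is a minimal sketch of that warping step only, written in plain Python for a single-channel map. It is not the authors' implementation: the function names `bilinear_sample` and `flow_warp`, the border clamping, the align-corners grid mapping, and the convention that offsets are given in low-resolution pixel units are all assumptions made for illustration. In the actual method the flow field is learned from the feature maps; here it is simply passed in.

```python
import math

def bilinear_sample(feat, y, x):
    """Sample a 2D grid (list of rows) at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy

def flow_warp(low_res, flow, out_h, out_w):
    """Upsample low_res to (out_h, out_w), shifting each sampling point by flow.

    flow[i][j] = (dy, dx) is an offset in low-resolution pixel units
    (an assumption of this sketch, not a convention taken from the paper).
    """
    in_h, in_w = len(low_res), len(low_res[0])
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Base position of this output pixel on the low-res grid.
            base_y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
            base_x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            dy, dx = flow[i][j]
            row.append(bilinear_sample(low_res, base_y + dy, base_x + dx))
        out.append(row)
    return out

# With zero flow the result is ordinary bilinear upsampling.
low = [[0.0, 1.0], [2.0, 3.0]]
zero_flow = [[(0.0, 0.0)] * 3 for _ in range(3)]
warped = flow_warp(low, zero_flow, 3, 3)
```

With an all-zero flow field this reduces to standard bilinear upsampling; non-zero offsets let each high-resolution position pull its value from a different low-resolution location, which is what allows misaligned features to be corrected.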

updated: Sun Jul 10 2022 08:25:47 GMT+0000 (UTC)

published: Sun Jul 10 2022 08:25:47 GMT+0000 (UTC)

arXiv

