SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow

Xiangtai Li; Jiangning Zhang; Yibo Yang; Guangliang Cheng; Kuiyuan Yang; Yunhai Tong; Dacheng Tao

In this paper, we explore effective methods for faster and more accurate semantic segmentation. A common way to improve performance is to obtain high-resolution feature maps with strong semantic representation. Two strategies are widely used for this, atrous convolutions and feature pyramid fusion, but they are either computationally intensive or ineffective. Inspired by optical flow, which is used for motion alignment between adjacent video frames, we propose a Flow Alignment Module (FAM) that learns Semantic Flow between feature maps of adjacent levels and broadcasts high-level features to high-resolution features effectively and efficiently. Integrating FAM into a standard feature pyramid structure yields superior performance over other real-time methods, even on lightweight backbone networks such as ResNet-18 and DFNet. To further speed up inference, we also present a novel Gated Dual Flow Alignment Module that directly aligns high-resolution and low-resolution feature maps; we term the resulting network SFNet-Lite. Extensive experiments on several challenging datasets show the effectiveness of both SFNet and SFNet-Lite. In particular, on the Cityscapes test set, the SFNet-Lite series achieves 80.1 mIoU while running at 60 FPS with a ResNet-18 backbone, and 78.8 mIoU while running at 120 FPS with an STDC backbone, on an RTX-3090. Moreover, we unify four challenging driving datasets into one large dataset, which we name the Unified Driving Segmentation (UDS) dataset. It contains diverse domain and style information. We benchmark several representative works on UDS. Both SFNet and SFNet-Lite still achieve the best speed and accuracy trade-off on UDS, making them strong baselines in such a challenging setting. The code and models are publicly available at https://github.com/lxtGH/SFSegNets.
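The flow-based alignment described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: in the paper the flow is learned by the network, whereas here the flow field is supplied by hand, and the function names (`bilinear_sample`, `warp_by_flow`), the offset convention (output-pixel units), and the border clamping are all assumptions made for illustration. The sketch only shows the warping step, in which a low-resolution map is upsampled by sampling it at flow-shifted positions.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D grid (list of rows) at fractional (y, x), clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp_by_flow(low_res, flow, out_h, out_w):
    """Upsample low_res to (out_h, out_w), shifting each sampling position by flow.

    flow[i][j] is a (dy, dx) offset in output-pixel units (assumed convention).
    """
    in_h, in_w = len(low_res), len(low_res[0])
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            dy, dx = flow[i][j]
            # Map the shifted output position into low-res coordinates
            # (corner pixels of both grids are aligned).
            sy = (i + dy) * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
            sx = (j + dx) * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            row.append(bilinear_sample(low_res, sy, sx))
        out.append(row)
    return out


low = [[0.0, 1.0],
       [2.0, 3.0]]

# Zero flow reduces to plain bilinear upsampling.
zero_flow = [[(0.0, 0.0)] * 3 for _ in range(3)]
plain = warp_by_flow(low, zero_flow, 3, 3)

# A constant flow of one output pixel to the right shifts where each value is read.
right_flow = [[(0.0, 1.0)] * 3 for _ in range(3)]
shifted = warp_by_flow(low, right_flow, 3, 3)
```

With zero flow the result is ordinary bilinear upsampling; a non-zero flow lets each high-resolution position read from a different low-resolution location, which is the sense in which features are "aligned" before fusion.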

updated: Fri Aug 04 2023 09:00:27 GMT+0000 (UTC)

published: Sun Jul 10 2022 08:25:47 GMT+0000 (UTC)

arXiv
