CSRNet: Cascaded Selective Resolution Network for Real-time Semantic Segmentation

Jingjing Xiong; Lai-Man Po; Wing-Yin Yu; Chang Zhou; Pengfei Xian; Weifeng Ou

Real-time semantic segmentation has received considerable attention due to growing demand in practical applications such as autonomous vehicles and robotics. Existing real-time segmentation approaches often use feature fusion to improve segmentation accuracy. However, they do not fully exploit feature information at different resolutions, and their receptive fields are relatively limited, which compromises performance. To tackle this problem, we propose a lightweight Cascaded Selective Resolution Network (CSRNet) that improves real-time segmentation performance through multiple context information embedding and enhanced feature aggregation. The proposed network builds a three-stage segmentation system that integrates feature information from low resolution to high resolution and refines features progressively. CSRNet contains two critical modules: the Shorted Pyramid Fusion Module (SPFM) and the Selective Resolution Module (SRM). The SPFM is a computationally efficient module that incorporates global context information and significantly enlarges the receptive field at each stage. The SRM fuses multi-resolution feature maps with various receptive fields by assigning soft channel attention across the feature maps, which helps remedy the problem caused by multi-scale objects. Comprehensive experiments on two well-known datasets demonstrate that the proposed CSRNet effectively improves the performance of real-time segmentation.
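The abstract gives no implementation details for the SRM, so the following is only a rough sketch of what "soft channel attention across multi-resolution feature maps" could look like. It assumes a selective-kernel-style scheme: the low-resolution map is upsampled, the branches are summarized by global average pooling, and a softmax across branches yields per-channel fusion weights. All function names, the nearest-neighbour upsampling, and the per-branch scalar weights (`branch_weights`, standing in for learned layers) are hypothetical and are not taken from the paper.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a [C][H][W] nested list by an integer factor."""
    return [
        [[row[x // factor] for x in range(len(row) * factor)]
         for row in channel for _ in range(factor)]
        for channel in fmap
    ]


def global_avg_pool(fmap):
    """Reduce a [C][H][W] map to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def softmax(values):
    """Numerically stable softmax over a short list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def selective_fuse(branches, branch_weights):
    """Fuse same-shaped [C][H][W] maps with per-channel soft attention.

    branch_weights[b][c] is a hypothetical scalar standing in for the
    learned layers that would produce attention logits in a real network.
    Returns the fused map and the attention weights per channel.
    """
    num_channels = len(branches[0])
    pooled = [global_avg_pool(b) for b in branches]
    # Channel descriptor of the summed branches (pooling is linear).
    descriptor = [sum(p[c] for p in pooled) for c in range(num_channels)]

    fused, attention = [], []
    for c in range(num_channels):
        logits = [w[c] * descriptor[c] for w in branch_weights]
        attn = softmax(logits)  # weights over branches sum to 1
        attention.append(attn)
        height, width = len(branches[0][c]), len(branches[0][c][0])
        fused.append([
            [sum(a * b[c][y][x] for a, b in zip(attn, branches))
             for x in range(width)]
            for y in range(height)
        ])
    return fused, attention
```

With all `branch_weights` set to zero, every branch receives equal attention and the fusion reduces to a plain average. Non-zero weights let each channel favour the resolution whose receptive field suits it, which is the intuition behind using soft attention for multi-scale objects.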

updated: Tue Jun 08 2021 14:22:09 GMT+0000 (UTC)

published: Tue Jun 08 2021 14:22:09 GMT+0000 (UTC)

arXiv
