Rethinking Lightweight Salient Object Detection via Network Depth-Width Tradeoff

Jia Li; Shengye Qiao; Zhirui Zhao; Chenxi Xie; Xiaowu Chen; Changqun Xia

Existing salient object detection (SOD) methods often adopt deeper and wider networks for better performance, resulting in a heavy computational burden and slow inference speed. This inspires us to rethink saliency detection to achieve a favorable balance between efficiency and accuracy. To this end, we design a lightweight framework that maintains competitive accuracy. Specifically, we propose a novel trilateral decoder framework by decoupling the U-shape structure into three complementary branches, which are devised to confront the dilution of semantic context, the loss of spatial structure, and the absence of boundary detail, respectively. As the three branches are fused, the coarse segmentation results are gradually refined in structural detail and boundary quality. We further propose a Scale-Adaptive Pooling Module that obtains multi-scale receptive fields without adding learnable parameters. In particular, on the premise of inheriting this framework, we rethink the relationship among accuracy, parameters, and speed via the network depth-width tradeoff. With these considerations, we design shallower and narrower models to explore the maximum potential of lightweight SOD. Our models target different application environments: 1) a tiny version, CTD-S (1.7M, 125FPS), for resource-constrained devices; 2) a fast version, CTD-M (12.6M, 158FPS), for speed-demanding scenarios; 3) a standard version, CTD-L (26.5M, 84FPS), for high-performance platforms. Extensive experiments validate the superiority of our method, which achieves a better efficiency-accuracy balance across five benchmarks.
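To make the depth-width tradeoff concrete, the following is a minimal sketch that counts learnable parameters in a plain stack of convolutions. It is not the CTD architecture: the function name `conv_stack_params`, the 3x3 kernel, and the depth and width values are all illustrative assumptions. It only shows the general fact that parameter count grows linearly with depth but roughly quadratically with width.

```python
def conv_stack_params(depth: int, width: int, kernel: int = 3) -> int:
    """Count learnable parameters in `depth` stacked kernel x kernel
    convolutions, each mapping `width` channels to `width` channels,
    with one bias per output channel.

    Hypothetical helper for illustration; not taken from the paper.
    """
    per_layer = kernel * kernel * width * width + width  # weights + biases
    return depth * per_layer


# Illustrative configurations (assumed values, not the paper's models).
baseline = conv_stack_params(depth=16, width=64)
shallower = conv_stack_params(depth=8, width=64)   # half the depth
narrower = conv_stack_params(depth=16, width=32)   # half the width

print(f"baseline : {baseline}")
print(f"shallower: {shallower} ({baseline / shallower:.2f}x fewer)")
print(f"narrower : {narrower} ({baseline / narrower:.2f}x fewer)")
```

In this toy setting, halving the depth halves the parameter count, while halving the width cuts it by close to a factor of four. Parameter count alone does not determine inference speed, which is consistent with the abstract's figures: the fast version CTD-M has more parameters than the tiny version CTD-S yet reports a higher frame rate.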

updated: Tue Jan 17 2023 03:43:25 GMT+0000 (UTC)

published: Tue Jan 17 2023 03:43:25 GMT+0000 (UTC)

arXiv
