Efficient Multi-order Gated Aggregation Network

Siyuan Li; Zedong Wang; Zicheng Liu; Cheng Tan; Haitao Lin; Di Wu; Zhiyuan Chen; Jiangbin Zheng; Stan Z. Li

Since the recent success of Vision Transformers (ViTs), explorations of ViT-style architectures have triggered a resurgence of ConvNets. In this work, we explore the representation ability of modern ConvNets from the novel view of multi-order game-theoretic interaction, which reflects inter-variable interaction effects with respect to contexts of different scales based on game theory. Within the modern ConvNet framework, we tailor the two feature mixers with conceptually simple yet effective depthwise convolutions to facilitate middle-order information across the spatial and channel spaces, respectively. In this light, we propose a new family of pure ConvNet architectures, dubbed MogaNet, which shows excellent scalability and attains competitive results among state-of-the-art models with more efficient use of parameters on ImageNet and a range of typical vision benchmarks, including COCO object detection, ADE20K semantic segmentation, 2D and 3D human pose estimation, and video prediction. In particular, MogaNet reaches 80.0% and 87.8% top-1 accuracy on ImageNet with 5.2M and 181M parameters, outperforming ParC-Net-S and ConvNeXt-L while saving 59% of FLOPs and 17M parameters, respectively. The source code is available at https://github.com/Westlake-AI/MogaNet.
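The abstract names depthwise convolutions and gated aggregation without spelling out the module design, so the following is only a generic illustration of those two ideas, not the MogaNet implementation (see the linked repository for that). It is a minimal pure-Python sketch, assuming zero padding, stride 1, and a hypothetical SiLU-based gate; the function names `depthwise_conv2d`, `silu`, and `gated_aggregation` are illustrative and do not come from the paper.

```python
import math


def depthwise_conv2d(x, kernels):
    """Depthwise 2D convolution (cross-correlation, as in deep learning usage).

    x:       list of C channels, each an H x W list of lists of floats.
    kernels: list of C square kernels with odd side length, one per channel.
    Each channel is filtered only by its own kernel, with zero padding and
    stride 1, so the output has the same shape as the input.
    """
    out = []
    for chan, ker in zip(x, kernels):
        h, w = len(chan), len(chan[0])
        k = len(ker)
        pad = k // 2
        res = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        # Positions outside the image count as zeros.
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += chan[ii][jj] * ker[di][dj]
                res[i][j] = acc
        out.append(res)
    return out


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def gated_aggregation(x, kernels):
    """Hypothetical gate (an assumption, not taken from the paper):
    elementwise product of SiLU(x) and SiLU(depthwise_conv2d(x))."""
    ctx = depthwise_conv2d(x, kernels)
    return [
        [[silu(a) * silu(b) for a, b in zip(row_x, row_c)]
         for row_x, row_c in zip(chan_x, chan_c)]
        for chan_x, chan_c in zip(x, ctx)
    ]


if __name__ == "__main__":
    ones = [[1.0] * 3 for _ in range(3)]
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    box = [[1] * 3 for _ in range(3)]
    print(depthwise_conv2d([ones, ones], [identity, box]))
    print(gated_aggregation([ones, ones], [identity, box]))
```

Because every channel has its own kernel, the parameter count of the convolution grows with C rather than with C squared, which is the usual reason depthwise convolutions are considered parameter-efficient.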

updated: Mon Mar 20 2023 01:44:37 GMT+0000 (UTC)

published: Mon Nov 07 2022 04:31:17 GMT+0000 (UTC)

arXiv
