Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang; Xiangtai Li; Jian Li; Liang Liu; Zhucun Xue; Boshen Zhang; Zhengkai Jiang; Tianxin Huang; Yabiao Wang; Chengjie Wang

This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Although CNN-based lightweight methods have achieved impressive results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept, the Meta-Mobile Block, and argues that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this observation, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of EMO over state-of-the-art methods; e.g., EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, respectively, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well. Code: https://github.com/zhangzjn/EMO
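The idea of one shared block framework with different instantiations can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a block of the form expand, operator, shrink, plus a residual connection, and all names (`meta_mobile_block`, `depthwise_conv_op`, `self_attention_op`, the fixed kernel, the toy sizes) are hypothetical choices made for illustration only.

```python
import math
import random


def linear(tokens, weight):
    """Apply one weight matrix to every token (the role a 1x1 convolution plays).

    `weight` is a list of output rows, each of length in_dim.
    """
    return [[sum(x * w for x, w in zip(tok, row)) for row in weight] for tok in tokens]


def make_weight(out_dim, in_dim, rng):
    """Random weights for the demo; a trained model would learn these."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def identity_op(tokens):
    """No token mixing: the block reduces to a residual two-layer projection."""
    return tokens


def depthwise_conv_op(tokens, kernel=(0.25, 0.5, 0.25)):
    """Toy depthwise 1D convolution along the token axis (zero padded).

    Each channel is filtered independently, so mixing is local (short-distance).
    The fixed kernel is an assumption for illustration.
    """
    n, dim, half = len(tokens), len(tokens[0]), len(kernel) // 2
    out = []
    for i in range(n):
        row = []
        for c in range(dim):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:
                    acc += w * tokens[j][c]
            row.append(acc)
        out.append(row)
    return out


def self_attention_op(tokens):
    """Toy single-head self-attention with Q = K = V = tokens (no learned projections).

    Every token attends to every other token, so mixing is global (long-distance).
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append([sum(e / total * v[c] for e, v in zip(exps, tokens)) for c in range(dim)])
    return out


def meta_mobile_block(tokens, w_expand, w_shrink, operator):
    """Assumed block layout: expand -> operator -> shrink -> residual add."""
    hidden = linear(tokens, w_expand)   # C -> ratio * C
    hidden = operator(hidden)           # the pluggable instantiation
    out = linear(hidden, w_shrink)      # ratio * C -> C
    return [[x + y for x, y in zip(tok, o)] for tok, o in zip(tokens, out)]


# Usage: the same framework with three different operators.
rng = random.Random(0)
dim, ratio, num_tokens = 4, 2, 6
tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]
w_expand = make_weight(dim * ratio, dim, rng)
w_shrink = make_weight(dim, dim * ratio, rng)

local_out = meta_mobile_block(tokens, w_expand, w_shrink, depthwise_conv_op)
global_out = meta_mobile_block(tokens, w_expand, w_shrink, self_attention_op)
hybrid_out = meta_mobile_block(
    tokens, w_expand, w_shrink,
    lambda h: depthwise_conv_op(self_attention_op(h)),  # attention, then local conv
)
```

Only the `operator` argument changes between the three calls, which is the point of the abstraction: a convolutional operator gives local mixing, an attention operator gives global mixing, and composing the two is one plausible way to obtain both. How the actual iRMB combines and parameterizes them is defined in the paper and code, not by this sketch.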

updated: Tue Jan 10 2023 08:34:50 GMT+0000 (UTC)

published: Tue Jan 03 2023 15:11:41 GMT+0000 (UTC)

arXiv
