MC-MLP: Multiple Coordinate Frames in all-MLP Architecture for Vision

Zhimin Zhu; Jianguo Zhao; Tong Mu; Yuliang Yang; Mengyu Zhu

In deep learning, multi-layer perceptrons (MLPs) have once again attracted researchers' attention. This paper introduces MC-MLP, a general MLP-like backbone for computer vision composed of a series of fully connected (FC) layers. We propose that the same semantic information can be easier or harder to learn depending on the coordinate frame in which the features are expressed. To address this, MC-MLP applies an orthogonal transform to the feature information, which is equivalent to changing the coordinate frame of the features. This design gives MC-MLP receptive fields in multiple coordinate frames and the ability to learn information across different coordinate frames. Experiments show that MC-MLP outperforms most MLPs on image classification tasks, achieving better performance at the same parameter level. The code will be available at: https://github.com/ZZM11/MC-MLP.
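To make the idea of an orthogonal transform as a change of coordinate frame concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not say which orthogonal transform MC-MLP uses; the orthonormal DCT-II matrix here is an assumption chosen only for illustration, and the helper names (`dct_matrix`, `matvec`, `transpose`, `norm`) and the sample feature vector are hypothetical.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix as a list of rows.

    Assumption: the DCT stands in for "an orthogonal transform";
    the abstract does not specify which transform MC-MLP applies.
    """
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def norm(v):
    """Euclidean length of vector v."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical feature vector, for illustration only.
features = [0.5, -1.0, 2.0, 0.25]

Q = dct_matrix(len(features))

# Express the same features in a different coordinate frame.
in_new_frame = matvec(Q, features)

# Because Q is orthogonal, its transpose is its inverse,
# so the original frame is recovered without loss.
restored = matvec(transpose(Q), in_new_frame)
```

Because the matrix is orthogonal, the transformed vector has the same length as the original and the transform can be undone exactly with the transpose. In that sense the information content is unchanged and only the frame in which a subsequent FC layer sees the features differs.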

updated: Sat Apr 08 2023 05:23:25 GMT+0000 (UTC)

published: Sat Apr 08 2023 05:23:25 GMT+0000 (UTC)

arXiv
