HDAM: Heuristic Difference Attention Module for Convolutional Neural Networks

Yu Xue; Ziming Yuan

The attention mechanism is one of the most important forms of prior knowledge for enhancing convolutional neural networks. Most attention mechanisms are bound to a convolutional layer and use local or global contextual information to recalibrate the input. This is a popular attention strategy design method. Global contextual information helps the network consider the overall distribution, while local contextual information is more general. The contextual information makes the network pay attention to the mean or maximum value of a particular receptive field. Departing from most attention mechanisms, this article proposes a novel attention mechanism, the heuristic difference attention module (HDAM). HDAM's input recalibration is based on the difference between the local and global contextual information instead of the mean and maximum values. At the same time, to give different layers a more suitable local receptive field size and to increase the flexibility of the local receptive field design, we use a genetic algorithm to heuristically produce local receptive fields. First, HDAM extracts the mean value of the global and local receptive fields as the corresponding contextual information. Then, the difference between the global and local contextual information is calculated. Finally, HDAM uses this difference to recalibrate the input. In addition, we use the heuristic ability of the genetic algorithm to search for the local receptive field size of each layer. Our experiments on CIFAR-10 and CIFAR-100 show that HDAM achieves higher accuracy with fewer parameters than other attention mechanisms. We implement HDAM with the Python library PyTorch, and the code and models will be publicly available.
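
The three steps named in the abstract (mean-based global and local context, their difference, recalibration of the input) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name DifferenceAttentionSketch, the local_size parameter, the sign of the difference (local minus global), and the sigmoid gate used for recalibration are all assumptions made for illustration, since the abstract does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DifferenceAttentionSketch(nn.Module):
    """Hypothetical difference-based attention block (illustration only)."""

    def __init__(self, local_size: int = 3):
        super().__init__()
        if local_size < 1 or local_size % 2 == 0:
            raise ValueError("local_size must be a positive odd integer")
        # Assumed hyperparameter: side length of the local receptive field.
        self.local_size = local_size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, height, width).
        # Step 1a: global context = per-channel mean over the whole map.
        global_ctx = x.mean(dim=(2, 3), keepdim=True)
        # Step 1b: local context = per-channel mean over a sliding window.
        # Padding keeps the spatial size; padded zeros are excluded from the mean.
        local_ctx = F.avg_pool2d(
            x,
            kernel_size=self.local_size,
            stride=1,
            padding=self.local_size // 2,
            count_include_pad=False,
        )
        # Step 2: difference between local and global context (assumed sign).
        diff = local_ctx - global_ctx
        # Step 3: recalibrate the input. The sigmoid gate is an assumption.
        gate = torch.sigmoid(diff)
        return x * gate


if __name__ == "__main__":
    features = torch.randn(2, 16, 32, 32)
    attention = DifferenceAttentionSketch(local_size=5)
    print(attention(features).shape)
```

In this sketch the block has no learnable parameters, and local_size is fixed by hand. According to the abstract, the local receptive field size of each layer is instead found by a genetic algorithm search, which is not reproduced here.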

updated: Sat Feb 19 2022 09:19:01 GMT+0000 (UTC)

published: Sat Feb 19 2022 09:19:01 GMT+0000 (UTC)

arXiv
