Deep Convolutional Tables: Deep Learning without Convolutions

Shay Dekel; Yosi Keller; Aharon Bar-Hillel

We propose a novel formulation of deep networks that does not use dot-product neurons and instead relies on a hierarchy of voting tables, termed Convolutional Tables (CT), to enable accelerated CPU-based inference. Convolutional layers are the most time-consuming bottleneck in contemporary deep learning techniques, severely limiting their use in Internet of Things and CPU-based devices. The proposed CT performs a fern operation at each image location: it encodes the local environment of that location into a binary index and uses the index to retrieve the desired local output from a table. The results of multiple tables are combined to derive the final output. The computational complexity of a CT transformation is independent of the patch (filter) size and grows gracefully with the number of channels, outperforming comparable convolutional layers. We show that CT has a better capacity:compute ratio than dot-product neurons, and that deep CT networks exhibit a universal approximation property similar to that of neural networks. As the transformation involves computing discrete indices, we derive a soft relaxation and a gradient-based approach for training the CT hierarchy. Deep CT networks are shown experimentally to have accuracy comparable to that of CNNs of similar architectures. In the low-compute regime, they enable an error:speed trade-off superior to that of alternative efficient CNN architectures.
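The fern operation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that each index bit is obtained by comparing two probe pixels in the local patch against a threshold, that borders are handled by edge padding, and that the outputs of the tables are combined by summation. The function name `ct_layer` and the parameter layouts (`offsets`, `thresholds`, `tables`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def ct_layer(x, offsets, thresholds, tables):
    """Illustrative (hypothetical) convolutional-table layer.

    x          : (H, W, C) input feature map
    offsets    : (M, K, 2, 3) ints; for fern m and bit k, two probes (dy, dx, channel)
    thresholds : (M, K) floats; bit k is 1 when probe_a - probe_b > threshold
    tables     : (M, 2**K, D) voting tables, one per fern
    returns    : (H, W, D) sum of the table entries selected at each location
    """
    H, W, _ = x.shape
    M, K = thresholds.shape
    # Pad so that every probe offset stays inside the array (assumed border rule).
    r = int(np.abs(offsets[..., :2]).max())
    xp = np.pad(x, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.zeros((H, W, tables.shape[-1]), dtype=tables.dtype)
    for m in range(M):
        index = np.zeros((H, W), dtype=np.int64)
        for k in range(K):
            (dy1, dx1, c1), (dy2, dx2, c2) = offsets[m, k]
            a = xp[r + dy1 : r + dy1 + H, r + dx1 : r + dx1 + W, c1]
            b = xp[r + dy2 : r + dy2 + H, r + dx2 : r + dx2 + W, c2]
            # One comparison per location yields bit k of the binary index.
            bit = (a - b > thresholds[m, k]).astype(np.int64)
            index |= bit << k
        # Table lookup: each location retrieves a D-dimensional vote.
        out += tables[m][index]
    return out
```

In this sketch the work per image location is K comparisons and one D-dimensional table read for each of the M ferns. The probe offsets may lie anywhere in the patch without changing that count, which is one way to read the claim that the cost is independent of the patch size.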

updated: Sun Apr 23 2023 17:49:21 GMT+0000 (UTC)

published: Sun Apr 23 2023 17:49:21 GMT+0000 (UTC)

arXiv
