Pushing the Limits of Capsule Networks

Prem Nair; Rohan Doshi; Stefan Keselj

Convolutional neural networks use pooling and other downscaling operations to achieve translational invariance when detecting features, but their architecture does not explicitly represent the locations of features relative to one another. As a result, unlike humans, they do not represent two instances of the same object in different orientations in the same way, so training them often requires extensive data augmentation and very deep networks. A team at Google Brain recently made news with an attempt to fix this problem: Capsule Networks (CapsNets). Whereas a conventional CNN works with scalar outputs representing feature presence, a CapsNet works with vector outputs representing entity presence. We want to stress test CapsNets in various incremental ways to better understand their performance and expressiveness. In broad terms, the goals of our investigation are to (1) test CapsNets on datasets that resemble MNIST but are harder in a specific way, and (2) explore the internal embedding space and sources of error of CapsNets.
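The abstract does not say how a vector output can encode "entity presence". As a rough illustration, the sketch below uses the squashing nonlinearity from the original capsule network formulation, where the length of a capsule's output vector is read as a presence probability and its direction carries the entity's properties. That this paper uses exactly this function is an assumption, and the names `squash` and `length` are hypothetical helpers written in plain Python for readability.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction.

    Assumed form (from the original capsule formulation, not stated in this abstract):
        v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps guards against division by zero for the all-zero vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]

def length(v):
    """Euclidean length of v, read here as the capsule's presence score."""
    return math.sqrt(sum(x * x for x in v))

if __name__ == "__main__":
    strong = squash([3.0, 4.0])   # long input vector: presence score close to 1
    weak = squash([0.1, 0.0])     # short input vector: presence score close to 0
    print(length(strong), length(weak))
```

A scalar CNN activation would only report the equivalent of `length(v)`; the capsule keeps the whole vector, so orientation-like information in its direction is preserved alongside the presence score.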

updated: Mon Mar 15 2021 00:30:34 GMT+0000 (UTC)

published: Mon Mar 15 2021 00:30:34 GMT+0000 (UTC)

arXiv
