Revisiting Sparse Convolutional Model for Visual Recognition

Xili Dai; Mingyang Li; Pengyuan Zhai; Shengbang Tong; Xingjian Gao; Shao-Lun Huang; Zhihui Zhu; Chong You; Yi Ma

Despite their strong empirical performance in image classification, deep neural networks are often regarded as "black boxes" that are difficult to interpret. Sparse convolutional models, by contrast, assume that a signal can be expressed as a linear combination of a few elements from a convolutional dictionary; they are powerful tools for analyzing natural images, with good theoretical interpretability and biological plausibility. However, such principled models have not demonstrated competitive performance when compared with empirically designed deep networks. This paper revisits sparse convolutional modeling for image classification and bridges the gap between good empirical performance (of deep learning) and good interpretability (of sparse convolutional models). Our method uses differentiable optimization layers, defined from convolutional sparse coding, as drop-in replacements for standard convolutional layers in conventional deep neural networks. We show that such models perform as well as conventional neural networks on the CIFAR-10, CIFAR-100, and ImageNet datasets. By leveraging the stable recovery property of sparse modeling, we further show that such models can be made much more robust to input corruptions and adversarial perturbations at test time through a simple adjustment of the trade-off between the sparse regularization and data reconstruction terms. Source code can be found at https://github.com/Delay-Xili/SDNet.
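To make the trade-off between the reconstruction and sparsity terms concrete, the following is a minimal sketch of convolutional sparse coding in its standard lasso form, min_z 0.5 * ||x - A z||^2 + lam * ||z||_1, where A z is a sum of convolutions of code maps with dictionary filters. It is an illustration under stated assumptions, not the implementation in the SDNet repository: it uses 1-D signals, NumPy, and plain ISTA, and the names `synth`, `adjoint`, `csc_ista`, and `lam` are hypothetical. The layer described in the abstract operates on multi-channel image feature maps and may use a different solver.

```python
import numpy as np

def synth(z, d):
    """A z: sum of full 1-D convolutions of each code map with its filter."""
    return sum(np.convolve(zk, dk, mode="full") for zk, dk in zip(z, d))

def adjoint(r, d):
    """A^T r: valid-mode correlation of the residual with each filter."""
    return np.stack([np.correlate(r, dk, mode="valid") for dk in d])

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(x, z, d, lam):
    """Reconstruction term plus lam times the sparsity term."""
    return 0.5 * np.sum((x - synth(z, d)) ** 2) + lam * np.sum(np.abs(z))

def csc_ista(x, d, lam, n_iter=200):
    """Approximate the sparse code of x under filters d with ISTA.

    x: signal of length n; d: array of K filters, shape (K, m).
    Returns codes of shape (K, n - m + 1).
    """
    K, m = d.shape
    z = np.zeros((K, x.size - m + 1))
    # Safe step size: sum_k ||d_k||_1^2 upper-bounds ||A||^2.
    L = np.sum(np.sum(np.abs(d), axis=1) ** 2)
    for _ in range(n_iter):
        grad = adjoint(synth(z, d) - x, d)  # gradient of the reconstruction term
        z = soft_threshold(z - grad / L, lam / L)
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = rng.standard_normal((4, 5))   # hypothetical dictionary: 4 filters of length 5
    x = rng.standard_normal(32)       # hypothetical input signal
    for lam in (0.01, 0.5):
        z = csc_ista(x, d, lam)
        print(lam, np.count_nonzero(z), objective(x, z, d, lam))
```

In this sketch, `lam` is the single knob that weights sparsity against reconstruction fidelity: raising it shrinks more coefficients to exactly zero, which is the kind of adjustment the abstract associates with robustness to corrupted inputs.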

updated: Mon Oct 24 2022 04:29:21 GMT+0000 (UTC)

published: Mon Oct 24 2022 04:29:21 GMT+0000 (UTC)

arXiv
