FSCNN: A Fast Sparse Convolution Neural Network Inference System

Bo Ji; Tianyi Chen

Convolutional neural networks (CNNs) have achieved remarkable success, but they typically come with high computational cost and many redundant weight parameters. To reduce FLOPs, structured pruning is a popular approach that removes entire hidden structures by introducing coarse-grained sparsity. Meanwhile, many pruning works instead leverage fine-grained sparsity (where the zeros are randomly distributed), yet the resulting sparse models lack a specially designed computing library to realize the potential speedup. In this technical report, we study and present an efficient convolutional neural network inference system that accelerates the forward pass by exploiting the fine-grained sparsity of compressed CNNs. Our system, FSCNN, is built on a set of specially designed sparse data structures, operators, and associated algorithms. Experimentally, we validate that FSCNN outperforms the standard deep learning library PyTorch on popular CNN architectures such as VGG16 when the sparsity is sufficiently high. However, due to the contiguity issue of sparse operators, FSCNN typically cannot match highly optimized dense operators. Therefore, we recommend coarse-grained (structured) sparsity for generic model compression.
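To make the idea of exploiting fine-grained sparsity concrete, the following is a minimal sketch, not FSCNN's actual data structures or operators, which the abstract does not describe. It assumes the standard compressed sparse row (CSR) format for the pruned weights and assumes the convolution has already been lowered to a matrix product between a filter matrix and a matrix of flattened input patches. The names `dense_to_csr`, `csr_matmul`, `weights`, and `patches` are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero entries, their column
    indices, and the offsets at which each row starts in `values`.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matmul(values, col_idx, row_ptr, dense, n_cols):
    """Multiply a CSR matrix by a dense matrix with `n_cols` columns.

    Only stored nonzeros are visited, so the multiply-add count scales
    with the number of nonzero weights rather than the full matrix size.
    """
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0.0] * n_cols
        for k in range(row_ptr[r], row_ptr[r + 1]):
            w = values[k]
            src = dense[col_idx[k]]  # row selected by a data-dependent index
            for c in range(n_cols):
                acc[c] += w * src[c]
        out.append(acc)
    return out


# Hypothetical toy data: 2 filters, 4 flattened kernel positions, 75% zeros.
weights = [[0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, -1.0]]
# Hypothetical flattened input patches: 4 kernel positions x 3 output pixels.
patches = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0],
           [7.0, 8.0, 9.0],
           [10.0, 11.0, 12.0]]

values, col_idx, row_ptr = dense_to_csr(weights)
result = csr_matmul(values, col_idx, row_ptr, patches, 3)
print(result)
```

In this sketch the inner loop touches only two of the eight weights, which is where the potential speedup at high sparsity comes from. The rows of `patches` are selected through `col_idx`, so memory access follows the sparsity pattern rather than a regular stride; this is one plausible illustration of why sparse operators can struggle against contiguous, highly optimized dense kernels.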

updated: Sat Dec 17 2022 06:44:58 GMT+0000 (UTC)

published: Sat Dec 17 2022 06:44:58 GMT+0000 (UTC)

arXiv
