FFCNN: Fast FPGA based Acceleration for Convolution neural network inference

F. Keddous; H-N. Nguyen; A. Nakib

We present a new, efficient OpenCL-based accelerator for large-scale convolutional neural networks, called Fast Inference on FPGAs for Convolution Neural Network (FFCNN). FFCNN is based on a deeply pipelined OpenCL kernel architecture. As has been pointed out previously, high-level synthesis tools such as the OpenCL framework make it easy to port code originally designed for CPUs and GPUs to FPGAs, but it is still difficult to make OpenCL code run efficiently on FPGAs. This work aims to propose an efficient FPGA implementation of OpenCL high-performance computing applications. To that end, data reuse and task mapping techniques are also presented to improve design efficiency. In addition, the following motivations were taken into account when developing FFCNN: 1) FFCNN has been designed to be easily implemented with the Intel OpenCL SDK based FPGA design flow. 2) In FFCNN, different techniques have been integrated to improve memory bandwidth and throughput. A performance analysis is conducted on two deep CNNs for large-scale image classification. The obtained results, and the comparison with other works designed to accelerate the same types of architectures, demonstrate the efficiency and competitiveness of the proposed accelerator design, with significantly improved performance and resource utilization.

updated: Sun Aug 28 2022 16:55:25 GMT+0000 (UTC)

published: Sun Aug 28 2022 16:55:25 GMT+0000 (UTC)

arXiv
