A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Asifullah Khan; Anabia Sohail; Umme Zahoora; Aqsa Saeed Qureshi

深い畳み込みニューラルネットワークの最近のアーキテクチャの調査

ディープたたみ込みニューラルネットワーク（CNN）は特別なタイプのニューラルネットワークで、コンピュータービジョンと画像処理に関連するいくつかの競争で模範的なパフォーマンスを示しています。 CNNのエキサイティングなアプリケーション領域には、画像分類とセグメンテーション、オブジェクト検出、ビデオ処理、自然言語処理、音声認識などがあります。ディープCNNの強力な学習能力は、主にデータから表現を自動的に学習できる複数の特徴抽出ステージを使用しているためです。大量のデータの可用性とハードウェアテクノロジーの改善により、CNNの研究が加速され、最近興味深い興味深いCNNアーキテクチャが報告されています。さまざまなアクティベーションおよびロス関数の使用、パラメーターの最適化、正則化、およびアーキテクチャーの革新など、CNNに進歩をもたらすいくつかの刺激的なアイデアが探究されてきました。ただし、ディープCNNの表現能力の大幅な改善は、アーキテクチャーの革新により実現されています。特に、空間情報とチャネル情報、アーキテクチャの深さと幅、およびマルチパス情報処理を活用するアイデアは、かなりの注目を集めています。同様に、レイヤーのブロックを構造単位として使用するアイデアも人気を集めています。したがって、この調査では、最近報告されたディープCNNアーキテクチャに存在する固有の分類法に焦点を当て、その結果、CNNアーキテクチャの最近の革新を7つの異なるカテゴリに分類します。これらの7つのカテゴリは、空間の活用、深度、マルチパス、幅、機能マップの活用、チャネルのブースト、および注意に基づいています。さらに、CNNコンポーネントの基本的な理解、現在の課題、およびCNNのアプリケーションも提供されます。

Deep Convolutional Neural Network (CNN) is a special type of Neural Networks, which has shown exemplary performance on several competitions related to Computer Vision and Image Processing. Some of the exciting application areas of CNN include Image Classification and Segmentation, Object Detection, Video Processing, Natural Language Processing, and Speech Recognition. The powerful learning ability of deep CNN is primarily due to the use of multiple feature extraction stages that can automatically learn representations from the data. The availability of a large amount of data and improvement in the hardware technology has accelerated the research in CNNs, and recently interesting deep CNN architectures have been reported. Several inspiring ideas to bring advancements in CNNs have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the significant improvement in the representational capacity of the deep CNN is achieved through architectural innovations. Notably, the ideas of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing have gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey thus focuses on the intrinsic taxonomy present in the recently reported deep CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, the elementary understanding of CNN components, current challenges, and applications of CNN are also provided.

updated: Sun May 10 2020 12:43:40 GMT+0000 (UTC)

published: Thu Jan 17 2019 23:20:23 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト