A Framework for Fast Scalable BNN Inference using Googlenet and Transfer Learning

Karthik E



Efficient and accurate object detection in video and image analysis is one of the major beneficiaries of advances in computer vision driven by deep learning. Deep learning has produced more powerful tools that can learn high-level, deeper features and thus overcome problems in traditional object detection architectures. The work in this thesis aims to achieve high object detection accuracy with good real-time performance. In computer vision, much current research on detecting and processing visual information focuses on improving existing algorithms. Binarized neural networks (BNNs) have shown high performance in various vision tasks such as image classification, object detection, and semantic segmentation. The Modified National Institute of Standards and Technology (MNIST), Canadian Institute for Advanced Research (CIFAR), and Street View House Numbers (SVHN) datasets are used, and the approach is implemented with a pre-trained convolutional neural network (CNN) that is 22 layers deep. Supervised learning is used to classify each dataset with an appropriately structured model. For still images, Googlenet is used to improve accuracy. The final layer of Googlenet is replaced using transfer learning to improve its accuracy. Transfer learning also maintains accuracy on moving images. Hardware is the backbone that lets any model obtain faster results on a large number of datasets. Here, the Nvidia Jetson Nano is used; its graphics processing unit (GPU) can handle the large number of computations involved in object detection. Results show that the transfer learning method detects objects with higher accuracy than existing methods.
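The abstract does not describe how binarized inference is computed, so the following is only a generic illustration of why BNNs can be fast, not the author's implementation. It assumes weights and activations are binarized to +1/-1 with the sign function, which lets a dot product be computed with XOR and a bit count instead of multiplications. The function names (`binarize`, `pack_bits`, `binary_dot`) and the sample values are hypothetical.

```python
def binarize(values):
    """Map real values to +1/-1 with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(packed_a, packed_b, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XOR marks the positions where the signs differ. Each matching position
    contributes +1 and each mismatch contributes -1, so the result is
    n - 2 * (number of mismatches).
    """
    mismatches = bin(packed_a ^ packed_b).count("1")
    return n - 2 * mismatches


# Hypothetical real-valued weights and activations, for illustration only.
weights = [0.7, -1.2, 0.05, -0.3, 2.1, -0.9]
activations = [-0.4, -0.8, 1.5, 0.2, 0.6, -2.0]

w_bin = binarize(weights)
a_bin = binarize(activations)

# Reference result using ordinary multiply-accumulate on the +1/-1 values.
reference = sum(w * a for w, a in zip(w_bin, a_bin))

# Same result using only bitwise operations on the packed vectors.
fast = binary_dot(pack_bits(w_bin), pack_bits(a_bin), len(w_bin))

print(reference, fast)
```

Both computations agree on the binarized vectors; on hardware, the bitwise form can process many weights per machine word, which is one common reason binarized layers are attractive on small devices.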

updated: Tue Jan 05 2021 07:28:38 GMT+0000 (UTC)

published: Mon Jan 04 2021 06:16:52 GMT+0000 (UTC)

arXiv
