Efficient ConvNet-based Object Detection for Unmanned Aerial Vehicles by Selective Tile Processing

George Plastiras; Christos Kyrkou; Theocharis Theocharides

Many applications that use Unmanned Aerial Vehicles (UAVs) rely on computer vision algorithms to analyze the information captured by the on-board camera. Recent advances in deep learning have made it possible to use single-shot Convolutional Neural Network (CNN) detection algorithms that process the input image to detect various objects of interest. To keep computational demands low, these networks typically operate on small image sizes, which makes it difficult to detect small objects. The problem is more pronounced for camera-equipped UAVs, where objects tend to appear relatively small because of the viewing range. This paper therefore explores the trade-offs involved in maintaining the resolution of the objects of interest by extracting smaller patches (tiles) from the larger input image and processing them with a neural network. Specifically, we introduce an attention mechanism that focuses detection on only some of the tiles and a memory mechanism that keeps track of information for the tiles that are not processed. Through the analysis of different methods and experiments, we show that by carefully selecting which tiles to process we can considerably improve detection accuracy while maintaining performance comparable to CNNs that resize and process a single image, which makes the proposed approach suitable for UAV applications.
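
The abstract does not spell out how tiles are chosen or what the memory stores, so the following is only a minimal sketch of the general idea under stated assumptions. The names `tile_grid` and `TileSelector`, the per-frame `budget`, and the scoring rule (remembered detection count plus a staleness term weighted by `age_weight`) are all hypothetical and are not taken from the paper.

```python
def tile_grid(height, width, tile, overlap=0):
    """Return (x0, y0, x1, y1) boxes that cover an image with square tiles."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile size")

    def starts(length):
        s = list(range(0, max(length - tile, 0) + 1, step))
        # Add a final tile flush with the border if the grid stops short of it.
        if s[-1] + tile < length:
            s.append(length - tile)
        return s

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


class TileSelector:
    """Assumed selection policy: favor tiles with remembered detections or stale tiles."""

    def __init__(self, n_tiles, budget, age_weight=0.5):
        self.budget = budget                # tiles processed per frame (assumption)
        self.age_weight = age_weight        # weight of the staleness term (assumption)
        self.last_count = [0.0] * n_tiles   # memory: detections last seen in each tile
        self.age = [0] * n_tiles            # frames since each tile was last processed

    def select(self):
        scores = [c + self.age_weight * a
                  for c, a in zip(self.last_count, self.age)]
        # sorted() is stable, so ties are broken by tile index.
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        return order[:self.budget]

    def update(self, selected, counts):
        """Record detector output for processed tiles and age the rest."""
        self.age = [a + 1 for a in self.age]
        for i, c in zip(selected, counts):
            self.last_count[i] = float(c)
            self.age[i] = 0
```

In use, each frame would call `select()`, crop the chosen boxes from `tile_grid(...)`, run a detector of one's choice on those crops, and pass the per-tile detection counts to `update()`. Tiles that are skipped keep their remembered count, which is one simple way to stand in for the memory mechanism the abstract mentions.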

updated: Thu Nov 14 2019 12:50:27 GMT+0000 (UTC)

published: Thu Nov 14 2019 12:50:27 GMT+0000 (UTC)

arXiv

