Borrow from Anywhere: Pseudo Multi-modal Object Detection in Thermal Imagery

Chaitanya Devaguptapu; Ninad Akolekar; Manuj M Sharma; Vineeth N Balasubramanian

Can we improve detection in the thermal domain by borrowing features from rich domains like visual RGB? In this paper, we propose a pseudo-multimodal object detector trained on natural image domain data to help improve the performance of object detection in thermal images. We assume access to a large-scale dataset in the visual RGB domain and a relatively smaller dataset (in terms of instances) in the thermal domain, as is common today. We propose using well-known image-to-image translation frameworks to generate pseudo-RGB equivalents of a given thermal image, and then using a multi-modal architecture for object detection in the thermal image. We show that our framework outperforms existing benchmarks without the explicit need for paired training examples from the two domains. We also show that our framework can learn with less data from the thermal domain. Our code and pre-trained models are made available at https://github.com/tdchaitanya/MMTOD
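The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function name here is hypothetical, the "translator" is a toy stand-in for a learned image-to-image translation model, and fusing the two branches by concatenating their features is an assumption, since the abstract does not specify the fusion step.

```python
from typing import Callable, Dict, List, Tuple

ThermalImage = List[List[float]]                    # H x W, one channel
RGBImage = List[List[Tuple[float, float, float]]]   # H x W, three channels
Features = List[float]


def pseudo_multimodal_detect(
    thermal: ThermalImage,
    translate: Callable[[ThermalImage], RGBImage],
    thermal_branch: Callable[[ThermalImage], Features],
    rgb_branch: Callable[[RGBImage], Features],
    head: Callable[[Features], Dict[str, float]],
) -> Dict[str, float]:
    """Run both branches on one thermal input and fuse their features."""
    # Step 1: generate a pseudo-RGB equivalent of the thermal image.
    pseudo_rgb = translate(thermal)
    # Step 2: extract features from each modality separately.
    thermal_feats = thermal_branch(thermal)
    rgb_feats = rgb_branch(pseudo_rgb)
    # Step 3 (assumption): fuse by simple concatenation of the feature lists.
    fused = thermal_feats + rgb_feats
    # Step 4: the detection head only ever sees the fused representation.
    return head(fused)


# Toy stand-ins so the sketch runs end to end. None of these are learned.
def toy_translate(thermal: ThermalImage) -> RGBImage:
    # Copies each intensity into three channels; a real translator is a trained model.
    return [[(v, v, v) for v in row] for row in thermal]


def toy_thermal_branch(thermal: ThermalImage) -> Features:
    pixels = [v for row in thermal for v in row]
    return [sum(pixels) / len(pixels)]  # mean intensity as a 1-d feature


def toy_rgb_branch(rgb: RGBImage) -> Features:
    pixels = [px for row in rgb for px in row]
    return [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]


def toy_head(fused: Features) -> Dict[str, float]:
    # Placeholder "detector": summarizes the fused features as a single score.
    return {"score": sum(fused) / len(fused), "num_features": float(len(fused))}


thermal_image = [[0.0, 1.0], [1.0, 0.0]]
result = pseudo_multimodal_detect(
    thermal_image, toy_translate, toy_thermal_branch, toy_rgb_branch, toy_head
)
```

Note that only a thermal image is needed at inference time; the RGB input is synthesized from it, which is why no paired thermal/RGB examples are required.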

updated: Wed Jul 15 2020 15:50:35 GMT+0000 (UTC)

published: Tue May 21 2019 12:24:40 GMT+0000 (UTC)

arXiv
