Learning to search for and detect objects in foveal images using deep learning

Beatriz Paula; Plinio Moreno

The human visual system processes images at varying resolution: the fovea, a small region of the retina, captures the area of highest acuity, and acuity declines gradually toward the periphery of the field of view. However, most existing object localization methods ignore biological attention mechanisms and rely on images acquired by sensors with space-invariant resolution. As a form of region-of-interest pooling, this study employs a fixation prediction model that emulates human goal-directed attention when searching for a given class in an image. The foveated images at each fixation point are then classified to determine whether the target is present or absent in the scene. Within this two-stage pipeline, we investigate how results vary when using high-level or panoptic features, and we provide a smoother ground-truth label function for fixation sequences that better accounts for the spatial structure of the problem. Finally, we present a novel dual-task model that performs fixation prediction and detection simultaneously, allowing knowledge transfer between the two tasks. We conclude that, because the two tasks are complementary, training benefits from the shared knowledge, improving performance over the baseline scores of the previous approach.
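To make the idea of a "smoother" ground-truth label for fixations concrete, the following is a minimal sketch, not the authors' label function, which the abstract does not specify. It assumes the image is discretized into a grid of cells and replaces a one-hot label at the fixated cell with a normalized Gaussian, so that cells near the true fixation receive partial credit. The function name `smooth_fixation_label`, the grid size, and the `sigma` parameter are all hypothetical choices made for illustration.

```python
import math


def smooth_fixation_label(rows, cols, fix_row, fix_col, sigma=1.0):
    """Return a rows x cols grid of label weights that sum to 1.

    Hypothetical illustration: a Gaussian centered on the fixated cell,
    used in place of a one-hot target. Not taken from the paper.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Unnormalized Gaussian weight for each cell, based on its squared
    # grid distance to the fixated cell.
    weights = [
        [
            math.exp(-((r - fix_row) ** 2 + (c - fix_col) ** 2) / (2.0 * sigma ** 2))
            for c in range(cols)
        ]
        for r in range(rows)
    ]
    # Normalize so the grid can be used as a target distribution.
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


# Example: an assumed 10 x 16 grid with the fixation at row 4, column 7.
label = smooth_fixation_label(rows=10, cols=16, fix_row=4, fix_col=7, sigma=1.0)
```

With a target of this kind, a prediction that lands one cell away from the true fixation is penalized less than one that lands far away, which is one way a label can reflect the spatial structure of the problem. A smaller `sigma` approaches the one-hot label.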

updated: Wed Apr 12 2023 09:50:25 GMT+0000 (UTC)

published: Wed Apr 12 2023 09:50:25 GMT+0000 (UTC)

arXiv
