Now You See Me: Robust approach to Partial Occlusions

Karthick Prasad Gunasekaran; Nikita Jaiman

Occlusion of objects is one of the central problems in computer vision. While Convolutional Neural Networks (CNNs) provide various state-of-the-art approaches for regular image classification, they prove to be less effective at classifying images with partial occlusions. Partial occlusion is a scenario in which an object is partially occluded by some other object or space. Solving this problem holds tremendous potential across various scenarios. We are particularly interested in the autonomous driving scenario and its implications. Autonomous vehicle research is one of the hot topics of this decade, and there are ample situations in which a driving sign, a person, or another object is partially occluded at different angles. Given its importance in situations that can be extended further to video analytics of traffic data, for example to handle crimes or anticipate income levels of various groups, it holds the potential to be exploited in many ways. In this paper, we introduce our own synthetically created dataset, built by taking the Stanford Car Dataset and adding occlusions of various sizes and nature to it. On this dataset, we conduct a comprehensive analysis using various state-of-the-art CNN models such as VGG-19, ResNet 50/101, GoogleNet, and DenseNet 121. We further study in depth the effect of varying occlusion proportions and nature on the performance of these models, both when fine-tuning them and when training them from scratch on the dataset, and how they are likely to perform when trained in different scenarios, i.e., performance when training with occluded images versus unoccluded images, which model is more robust to partial occlusions, and so on.
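The abstract does not specify how the occlusions are generated, so the following is only a minimal sketch of one plausible approach: masking a single randomly placed rectangular patch whose area is a chosen proportion of the image. The function name `add_occlusion`, its parameters, and the constant fill value are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def add_occlusion(image, proportion, rng=None, fill=0):
    """Return a copy of `image` with one random rectangular patch set to `fill`.

    Hypothetical helper (not from the paper). `image` is an H x W or
    H x W x C array; `proportion` is the approximate fraction of the image
    area to cover, between 0 and 1.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = np.random.default_rng() if rng is None else rng

    height, width = image.shape[:2]
    # Scale both sides by sqrt(proportion) so the patch keeps the image's
    # aspect ratio and its area is roughly proportion * height * width.
    scale = np.sqrt(proportion)
    patch_h = int(round(height * scale))
    patch_w = int(round(width * scale))

    # Pick a top-left corner such that the patch lies fully inside the image.
    top = int(rng.integers(0, height - patch_h + 1))
    left = int(rng.integers(0, width - patch_w + 1))

    occluded = image.copy()  # leave the input image untouched
    occluded[top:top + patch_h, left:left + patch_w] = fill
    return occluded


if __name__ == "__main__":
    # Stand-in for a car image: a white 100 x 100 RGB array.
    img = np.full((100, 100, 3), 255, dtype=np.uint8)
    out = add_occlusion(img, 0.25, rng=np.random.default_rng(0))
    covered = (out == 0).all(axis=-1).mean()
    print(f"occluded fraction: {covered:.2f}")
```

Sweeping `proportion` over several values would give a family of datasets with increasing occlusion. Occlusions of a different "nature" (for example, pasting another object instead of a constant patch) would need a different fill strategy than the one assumed here.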

updated: Tue Apr 25 2023 11:45:50 GMT+0000 (UTC)

published: Mon Apr 24 2023 00:31:49 GMT+0000 (UTC)

arXiv
