Assessing The Importance Of Colours For CNNs In Object Recognition

Aditya Singh; Alessandro Bay; Andrea Mirabile

Humans rely heavily on shape as the primary cue for object recognition, with colour and texture serving as useful secondary cues. Convolutional neural networks (CNNs), an imitation of biological neural networks, have been shown to exhibit conflicting properties. Some studies indicate that CNNs are biased towards textures, whereas another set of studies suggests a shape bias in classification tasks. However, these studies do not discuss the role of colours, which implies that colour may play only a minor role in object recognition. In this paper, we empirically investigate the importance of colours in object recognition for CNNs. We demonstrate that CNNs often rely heavily on colour information when making a prediction. Our results show that the degree of dependency on colours tends to vary from one dataset to another. Moreover, networks tend to rely more on colours if trained from scratch, and pre-training can make a model less colour dependent. To arrive at these findings, we follow the framework often deployed in understanding the role of colours in object recognition for humans. We evaluate a model trained with congruent images (images in original colours, e.g. red strawberries) on congruent, greyscale, and incongruent images (images in unnatural colours, e.g. blue strawberries). We measure and analyse the network's predictive performance (top-1 accuracy) under these different stylisations. We utilise standard datasets of supervised image classification and fine-grained image classification in our experiments.
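The evaluation protocol described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how incongruent images are produced, so the red/blue channel swap in `to_incongruent` is an assumption, as are the BT.601 greyscale weights, all function names, and the `predict` callable (a hypothetical stand-in for a model trained on congruent images that returns class scores).

```python
import numpy as np


def to_greyscale(images):
    """Remove colour while keeping three identical channels.

    `images` is a float array of shape (N, H, W, 3) in RGB order.
    The BT.601 luma weights are an assumption, not taken from the paper.
    """
    weights = np.array([0.299, 0.587, 0.114])
    luma = images @ weights  # shape (N, H, W)
    return np.repeat(luma[..., None], 3, axis=-1)


def to_incongruent(images):
    """Assumed stand-in for unnatural colours: swap red and blue channels."""
    return images[..., ::-1]


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    return float(np.mean(np.argmax(scores, axis=1) == labels))


def evaluate_conditions(predict, images, labels):
    """Score one trained model on congruent, greyscale and incongruent inputs.

    `predict` is a hypothetical callable mapping an image batch to an
    (N, num_classes) array of class scores.
    """
    conditions = {
        "congruent": images,
        "greyscale": to_greyscale(images),
        "incongruent": to_incongruent(images),
    }
    return {
        name: top1_accuracy(predict(batch), labels)
        for name, batch in conditions.items()
    }
```

Under this sketch, a large drop in accuracy from the congruent condition to the greyscale or incongruent condition would suggest that the model leans on colour, while similar accuracy across all three would suggest it relies mainly on other cues such as shape.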

updated: Sat Dec 12 2020 22:55:06 GMT+0000 (UTC)

published: Sat Dec 12 2020 22:55:06 GMT+0000 (UTC)

arXiv
