Three approaches to facilitate DNN generalization to objects in out-of-distribution orientations and illuminations: late-stopping, tuning batch normalization and invariance loss

Akira Sakai; Taro Sunagawa; Spandan Madan; Kanata Suzuki; Takashi Katoh; Hiromichi Kobashi; Hanspeter Pfister; Pawan Sinha; Xavier Boix; Tomotake Sasaki

The training data distribution is often biased towards objects in certain orientations and illumination conditions. While humans have a remarkable capability to recognize objects in out-of-distribution (OoD) orientations and illuminations, Deep Neural Networks (DNNs) suffer severely in this case, even when large amounts of training examples are available. In this paper, we investigate three different approaches to improve DNNs in recognizing objects in OoD orientations and illuminations: (i) training much longer after convergence of the in-distribution (InD) validation accuracy, i.e., late-stopping, (ii) tuning the momentum parameter of the batch normalization layers, and (iii) enforcing invariance of the neural activity in an intermediate layer to orientation and illumination conditions. Each of these approaches substantially improves the DNN's OoD accuracy (more than 20% in some cases). We report results on four datasets: two are modified from the MNIST and iLab datasets, and the other two are novel (one of 3D rendered cars and another of objects captured under various controlled orientations and illumination conditions). These datasets allow us to study the effects of different amounts of bias and are challenging because DNNs perform poorly in OoD conditions. Finally, we demonstrate that even though the three approaches focus on different aspects of DNNs, they all tend to lead to the same underlying neural mechanism that enables OoD accuracy gains: individual neurons in the intermediate layers become more selective to a category and also invariant to OoD orientations and illuminations.
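To make the three approaches concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The abstract does not give formulas, so everything here is an assumption chosen for illustration: the function names, the convergence rule (`tol`, `patience`), the `extra_factor` training budget, the exponential-moving-average form of the batch normalization update (the convention used by common frameworks such as PyTorch), and the variance-style invariance penalty.

```python
from statistics import fmean


def convergence_epoch(val_acc, tol=1e-3, patience=3):
    """Hypothetical convergence rule: return the epoch of the last
    improvement larger than `tol` once `patience` epochs pass without one."""
    best, best_epoch = val_acc[0], 0
    for epoch, acc in enumerate(val_acc):
        if acc > best + tol:
            best, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None  # InD validation accuracy has not converged yet


def late_stopping_budget(conv_epoch, extra_factor=4):
    """(i) Late-stopping: keep training well past InD convergence.
    `extra_factor` is an arbitrary illustrative multiplier."""
    return (conv_epoch + 1) * extra_factor


def bn_running_update(running, batch_stat, momentum):
    """(ii) Batch-norm running statistic as an exponential moving average.
    A smaller `momentum` makes the running statistic change more slowly."""
    return (1.0 - momentum) * running + momentum * batch_stat


def invariance_penalty(activations):
    """(iii) Assumed invariance term: mean squared deviation of each unit's
    activation from its mean across conditions (orientation/illumination)
    for inputs of the same category. Zero means perfectly invariant."""
    n_units = len(activations[0])
    means = [fmean(a[j] for a in activations) for j in range(n_units)]
    return fmean(
        (a[j] - means[j]) ** 2 for a in activations for j in range(n_units)
    )


if __name__ == "__main__":
    acc = [0.5, 0.7, 0.9, 0.9, 0.9, 0.9]
    e = convergence_epoch(acc)
    print("converged at epoch", e, "-> train for", late_stopping_budget(e))
    print("running mean:", bn_running_update(0.0, 1.0, momentum=0.1))
    print("penalty:", invariance_penalty([[0.0, 1.0], [2.0, 1.0]]))
```

In an actual training setup, a penalty of this kind would be added to the classification loss with a weighting coefficient, and the activations would come from an intermediate layer of the network; both details are left out of this sketch.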

updated: Sat Oct 30 2021 00:31:13 GMT+0000 (UTC)

published: Sat Oct 30 2021 00:31:13 GMT+0000 (UTC)

arXiv
