Three approaches to facilitate DNN generalization to objects in out-of-distribution orientations and illuminations

Akira Sakai; Taro Sunagawa; Spandan Madan; Kanata Suzuki; Takashi Katoh; Hiromichi Kobashi; Hanspeter Pfister; Pawan Sinha; Xavier Boix; Tomotake Sasaki

The training data distribution is often biased towards objects in certain orientations and illumination conditions. While humans have a remarkable capability of recognizing objects in out-of-distribution (OoD) orientations and illuminations, Deep Neural Networks (DNNs) struggle severely in this case, even when large amounts of training examples are available. In this paper, we investigate three different approaches to improve DNNs' recognition of objects in OoD orientations and illuminations. Namely, these are (i) training much longer after convergence of the in-distribution (InD) validation accuracy, i.e., late-stopping, (ii) tuning the momentum parameter of the batch normalization layers, and (iii) enforcing invariance of the neural activity in an intermediate layer to orientation and illumination conditions. Each of these approaches substantially improves the DNN's OoD accuracy (by more than 20% in some cases). We report results on four datasets: two are modified from the MNIST and iLab datasets, and the other two are novel (one of 3D rendered cars and another of objects captured under various controlled orientations and illumination conditions). These datasets allow us to study the effects of different amounts of bias, and they are challenging because DNNs perform poorly in OoD conditions. Finally, we demonstrate that even though the three approaches focus on different aspects of DNNs, they all tend to lead to the same underlying neural mechanism that enables the OoD accuracy gains: individual neurons in the intermediate layers become more selective to a category and also invariant to OoD orientations and illuminations. We anticipate this study to be a basis for further improvement of deep neural networks' OoD generalization performance, which is in high demand for achieving safe and fair AI applications.
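As a rough illustration of approach (ii), the sketch below shows what the momentum parameter of a batch normalization layer controls: how quickly the running statistics used at inference time follow the per-batch statistics seen during training. It assumes the PyTorch-style update rule, running = (1 - momentum) * running + momentum * batch; other frameworks (Keras, for example) use the opposite convention, and the abstract does not say which one the authors use. The function names and the batch values are hypothetical and are not taken from the paper.

```python
def update_running_mean(running_mean, batch_mean, momentum):
    """One exponential-moving-average step.

    Assumption: PyTorch-style convention, where `momentum` weights the
    NEW batch statistic. This is not confirmed by the abstract.
    """
    return (1.0 - momentum) * running_mean + momentum * batch_mean


def track(batch_means, momentum, initial=0.0):
    """Apply the update over a sequence of batch means and return the
    running mean after each step."""
    running = initial
    history = []
    for batch_mean in batch_means:
        running = update_running_mean(running, batch_mean, momentum)
        history.append(running)
    return history


# Hypothetical per-batch means of a single activation channel.
batch_means = [1.0, 1.0, 1.0, 1.0]

# Small momentum: the running mean moves slowly towards the batch means.
slow = track(batch_means, momentum=0.1)

# Large momentum: the running mean follows the latest batches closely.
fast = track(batch_means, momentum=0.9)
```

With these toy values, `slow` is still far from 1.0 after four batches while `fast` is nearly there, which is the kind of difference in inference-time statistics that tuning this parameter changes.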

updated: Wed Jan 26 2022 04:59:15 GMT+0000 (UTC)

published: Sat Oct 30 2021 00:31:13 GMT+0000 (UTC)

arXiv
