Occlusions for Effective Data Augmentation in Image Classification

Ruth Fong; Andrea Vedaldi

Deep networks for visual recognition are known to rely on "easy to recognise" parts of objects, such as faces and distinctive texture patterns. This lack of a holistic understanding of objects may increase fragility and overfitting. In recent years, several papers have proposed addressing the issue by using occlusions as a form of data augmentation. However, successes have been limited to tasks such as weak localization and model interpretation; no benefit has been demonstrated for image classification on large-scale datasets. In this paper, we show that, with a simple technique based on batch augmentation, occlusions as data augmentation can improve performance on ImageNet for high-capacity models (e.g., ResNet50). We also show that varying the amount of occlusion used during training makes it possible to study the robustness of different neural network architectures.
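
To make the idea concrete, the following is a minimal sketch of occlusion used as data augmentation inside a batch. It is not the authors' implementation: the abstract does not give the details, so the rectangular patch shape, the zero fill value, and the reading of "batch augmentation" as adding several differently occluded copies of each sample to the same batch are all assumptions. The function names and the `copies` and `area_fraction` parameters are hypothetical and introduced only for illustration.

```python
import numpy as np


def occlude(image, area_fraction, rng):
    """Return a copy of `image` (H, W, C) with one random rectangle set to zero.

    Assumption: a single patch with the image's aspect ratio, zero-filled.
    """
    h, w = image.shape[:2]
    side = np.sqrt(area_fraction)  # scale each side so the area matches
    ph = max(1, int(round(h * side)))
    pw = max(1, int(round(w * side)))
    top = rng.integers(0, h - ph + 1)   # upper bound is exclusive
    left = rng.integers(0, w - pw + 1)
    out = image.copy()
    out[top:top + ph, left:left + pw] = 0
    return out


def batch_augment_with_occlusions(images, labels, copies=2,
                                  area_fraction=0.25, seed=0):
    """Extend a batch with `copies` occluded versions of every image.

    images: array of shape (N, H, W, C); labels: array of shape (N,).
    Returns arrays of shape ((copies + 1) * N, H, W, C) and ((copies + 1) * N,).
    """
    rng = np.random.default_rng(seed)
    aug_images = [images]  # keep the clean images in the batch
    aug_labels = [labels]
    for _ in range(copies):
        occluded = np.stack([occlude(img, area_fraction, rng) for img in images])
        aug_images.append(occluded)
        aug_labels.append(labels)  # occlusion does not change the label
    return np.concatenate(aug_images), np.concatenate(aug_labels)
```

Varying `area_fraction` across training runs would be one way to probe how different architectures respond to different amounts of occlusion, in the spirit of the robustness study mentioned above.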

updated: Fri Oct 25 2019 15:25:57 GMT+0000 (UTC)

published: Wed Oct 23 2019 16:19:22 GMT+0000 (UTC)

arXiv
