Augmentation Pathways Network for Visual Recognition

Yalong Bai; Mohan Zhou; Yuxiang Chen; Wei Zhang; Bowen Zhou; Tao Mei

Data augmentation is helpful in practice for visual recognition, especially when data is scarce. However, this success is limited to a handful of light augmentations (e.g., random crop, flip). Heavy augmentations (e.g., gray, grid shuffle) are either unstable or show adverse effects during training, owing to the large gap between the original and augmented images. This paper introduces a novel network design, denoted Augmentation Pathways (AP), to systematically stabilize training on a much wider range of augmentation policies. Notably, AP tames heavy data augmentations and stably boosts performance without careful selection among augmentation policies. Unlike the traditional single-pathway design, augmented images are processed in different neural paths. The main pathway handles light augmentations, while the other pathways focus on heavy augmentations. By interacting with multiple paths in a dependent manner, the backbone network robustly learns from visual patterns shared among augmentations and suppresses noisy patterns at the same time. Furthermore, we extend AP to a homogeneous version and a heterogeneous version for high-order scenarios, demonstrating its robustness and flexibility in practical usage. Experimental results on ImageNet benchmarks demonstrate compatibility and effectiveness across a much wider range of augmentations (e.g., Crop, Gray, Grid Shuffle, RandAugment), while using fewer parameters and incurring lower computational cost at inference time. Source code: https://github.com/ap-conv/ap-net.
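The routing idea in the abstract can be illustrated with a toy example. The following is a minimal sketch and not the authors' implementation: the names `TwoPathwayLayer` and `linear`, the layer sizes, and the specific wiring (the heavy pathway reads the input together with the main pathway's output, and only the main pathway is used for light augmentations and at inference) are all assumptions made for illustration.

```python
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class TwoPathwayLayer:
    """Toy layer with a main pathway and one heavy-augmentation pathway.

    Assumption (not taken from the abstract): the heavy pathway depends on
    the main pathway's output, while the main pathway never reads
    heavy-pathway units.
    """

    def __init__(self, in_dim, main_dim, heavy_dim, seed=0):
        rng = random.Random(seed)
        # Main pathway weights: used by every input.
        self.w_main = [[rng.uniform(-1, 1) for _ in range(in_dim)]
                       for _ in range(main_dim)]
        # Heavy pathway weights: read the input plus the main output.
        self.w_heavy = [[rng.uniform(-1, 1) for _ in range(in_dim + main_dim)]
                        for _ in range(heavy_dim)]

    def forward(self, x, heavy):
        main = linear(self.w_main, x)
        if not heavy:
            # Light augmentations (and inference) use the main pathway only.
            return main
        extra = linear(self.w_heavy, x + main)
        # Heavily augmented inputs get main features plus pathway-specific ones.
        return main + extra


layer = TwoPathwayLayer(in_dim=4, main_dim=3, heavy_dim=2)
x = [0.5, -1.0, 0.25, 2.0]
light_out = layer.forward(x, heavy=False)
heavy_out = layer.forward(x, heavy=True)
```

In this sketch the main-pathway features are identical for both kinds of input, so dropping the heavy pathway after training leaves a smaller network, which is one way to read the abstract's claim about lower inference cost.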

updated: Mon Jul 26 2021 06:54:53 GMT+0000 (UTC)

published: Mon Jul 26 2021 06:54:53 GMT+0000 (UTC)

arXiv
