Augmentation Pathways Network for Visual Recognition

Yalong Bai; Mohan Zhou; Wei Zhang; Bowen Zhou; Tao Mei

Data augmentation is practically helpful for visual recognition, especially when data is scarce. However, this success is limited to a small set of light augmentations (e.g., random crop, flip). Heavy augmentations are either unstable or show adverse effects during training, owing to the large gap between the original and augmented images. This paper introduces a novel network design, denoted Augmentation Pathways (AP), to systematically stabilize training on a much wider range of augmentation policies. Notably, AP tames various heavy data augmentations and stably boosts performance without careful selection among augmentation policies. Unlike the traditional single-pathway design, AP processes augmented images in different neural paths. The main pathway handles the light augmentations, while the other pathways focus on the heavier augmentations. By interacting with multiple paths in a dependent manner, the backbone network robustly learns from the visual patterns shared among augmentations while suppressing the side effects of heavy augmentations. Furthermore, we extend AP to high-order versions for high-order scenarios, demonstrating its robustness and flexibility in practical usage. Experimental results on ImageNet demonstrate compatibility and effectiveness across a much wider range of augmentations, while consuming fewer parameters and incurring lower computational costs at inference time.
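The abstract does not spell out how the pathways are wired, so the following is only a toy sketch of one way a "dependent" two-pathway layer could look. It assumes that main-pathway outputs read only main-pathway inputs, while heavy-pathway outputs read both feature groups; under that assumption the heavy-pathway weights can be dropped at inference. The class `TwoPathwayLayer` and all of its parameters are hypothetical names chosen for illustration, not the authors' implementation.

```python
import random


def matvec(weights, x):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class TwoPathwayLayer:
    """Toy fully connected layer with a main and a heavy-augmentation pathway.

    Assumption (not stated in the abstract): main outputs depend only on the
    main input features; heavy-pathway outputs depend on all input features.
    """

    def __init__(self, in_main, in_heavy, out_main, out_heavy, seed=0):
        rng = random.Random(seed)
        self.in_main = in_main
        # Weights used by every input, light or heavy.
        self.w_main = [[rng.uniform(-1, 1) for _ in range(in_main)]
                       for _ in range(out_main)]
        # Extra weights used only for heavily augmented inputs.
        self.w_heavy = [[rng.uniform(-1, 1) for _ in range(in_main + in_heavy)]
                        for _ in range(out_heavy)]

    def forward_light(self, x):
        # Light augmentations (and inference) use the main pathway only.
        return matvec(self.w_main, x[:self.in_main])

    def forward_heavy(self, x):
        # Heavy augmentations reuse the main pathway and add the extra one.
        return matvec(self.w_main, x[:self.in_main]) + matvec(self.w_heavy, x)


layer = TwoPathwayLayer(in_main=4, in_heavy=2, out_main=3, out_heavy=2)
x = [0.5, -0.25, 1.0, 0.0, 2.0, -1.0]
light_out = layer.forward_light(x)   # 3 main features
heavy_out = layer.forward_heavy(x)   # 3 main features + 2 heavy features
```

In this sketch the first three entries of `heavy_out` equal `light_out`, so both kinds of input share the main weights, and the main output is unaffected by the heavy-only input features.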

updated: Thu Mar 16 2023 05:23:18 GMT+0000 (UTC)

published: Mon Jul 26 2021 06:54:53 GMT+0000 (UTC)

arXiv
