StackMix: A complementary Mix algorithm

John Chen; Samarth Sinha; Anastasios Kyrillidis

Techniques that combine multiple images as input/output have proven to be effective data augmentations for training convolutional neural networks. In this paper, we present StackMix: each input is presented as a concatenation of two images, and the label is the mean of the two one-hot labels. On its own, StackMix rivals other widely used methods in the "Mix" line of work. More importantly, unlike previous work, StackMix can be combined with existing Mix augmentations, effectively mixing more than two images, and this yields significant gains across a variety of benchmarks. For example, combining StackMix with CutMix improves test error in the supervised setting over CutMix alone across a variety of settings, including 0.8% on ImageNet, 3% on Tiny ImageNet, 2% on CIFAR-100, 0.5% on CIFAR-10, and 1.5% on STL-10. Similar results are achieved with Mixup. We further show that the gains hold for robustness to common input corruptions and perturbations at varying severities: combining StackMix with AugMix gives a 0.7% improvement on CIFAR-100-C over AugMix alone. On its own, StackMix improves results across different numbers of labeled samples on CIFAR-100, maintaining a gap of approximately 2% in test accuracy down to using only 5% of the whole dataset. It is also effective in the semi-supervised setting, with a 2% improvement with the standard benchmark Π-model. Finally, we perform an extensive ablation study to better understand the proposed algorithm.
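
The core operation described in the abstract (concatenate two images, average their one-hot labels) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It uses plain Python lists so that it runs without dependencies. The function names `one_hot` and `stackmix`, the choice to stack along the height axis, and the toy 2x2 images are all assumptions made for the example; the abstract only says "concatenation".

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def stackmix(img_a, label_a, img_b, label_b, num_classes):
    """Stack two images (lists of pixel rows) and average their one-hot labels.

    Assumption: the images are stacked along the height axis, so they must
    share the same width. The abstract does not specify the axis.
    """
    if len(img_a[0]) != len(img_b[0]):
        raise ValueError("images must have the same width to be stacked")
    # List concatenation places the rows of img_b below the rows of img_a.
    stacked = img_a + img_b
    oh_a = one_hot(label_a, num_classes)
    oh_b = one_hot(label_b, num_classes)
    # The target is the element-wise mean of the two one-hot labels.
    soft_label = [(a + b) / 2 for a, b in zip(oh_a, oh_b)]
    return stacked, soft_label


# Two hypothetical 2x2 grayscale images with classes 0 and 2 (out of 3).
img_a = [[0, 1], [2, 3]]
img_b = [[4, 5], [6, 7]]
stacked, soft_label = stackmix(img_a, 0, img_b, 2, num_classes=3)
```

In this sketch the stacked input has twice the height of each original image, and the target puts weight 0.5 on each of the two source classes (or weight 1.0 on a single class when both images share a label).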

updated: Wed Mar 17 2021 16:49:41 GMT+0000 (UTC)

published: Wed Nov 25 2020 10:15:24 GMT+0000 (UTC)

arXiv
