MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Yingtian Zou; Vikas Verma; Sarthak Mittal; Wai Hoh Tang; Hieu Pham; Juho Kannala; Yoshua Bengio; Arno Solin; Kenji Kawaguchi

Mixup is a popular data augmentation technique for training deep neural networks in which additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves on Mixup across multiple datasets and a variety of architectures, for instance improving ImageNet top-1 accuracy over Mixup by 0.8%.
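The interpolation step that the abstract describes for vanilla Mixup can be sketched as follows. This is a minimal illustration and not the proposed MixupE method, whose details are not given in the abstract. The function name `mixup_batch`, the `alpha` parameter, the Beta(alpha, alpha) sampling of the mixing coefficient, and the pairing of samples by a random permutation are assumptions based on common Mixup practice, not something stated here.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Return a vanilla-Mixup version of a batch (illustrative sketch).

    x        : array of shape (batch, ...) holding the inputs
    y_onehot : array of shape (batch, num_classes) holding one-hot labels
    alpha    : assumed Beta(alpha, alpha) concentration for the mixing weight
    """
    rng = np.random.default_rng() if rng is None else rng

    # Mixing coefficient in [0, 1]; one value shared by the whole batch (assumption).
    lam = rng.beta(alpha, alpha)

    # Pair each sample with a randomly chosen partner from the same batch.
    perm = rng.permutation(len(x))

    # Linearly interpolate inputs and labels with the same coefficient.
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam


if __name__ == "__main__":
    x = np.arange(12, dtype=float).reshape(4, 3)   # 4 toy inputs with 3 features
    y = np.eye(2)[[0, 1, 0, 1]]                    # one-hot labels for 2 classes
    x_mix, y_mix, lam = mixup_batch(x, y, alpha=1.0, rng=np.random.default_rng(0))
    print(lam)
    print(x_mix)
    print(y_mix)
```

Because inputs and labels are mixed with the same coefficient, each mixed label row remains a valid probability vector, which is what lets the mixed pair be used with a standard cross-entropy loss.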

updated: Wed Jun 14 2023 00:01:33 GMT+0000 (UTC)

published: Tue Dec 27 2022 07:03:52 GMT+0000 (UTC)

arXiv
