Improving Model Generalization by Agreement of Learned Representations from Data Augmentation

Rowel Atienza

Data augmentation reduces the generalization error by forcing a model to learn invariant representations given different transformations of the input image. In computer vision, on top of the standard image processing functions, data augmentation techniques based on regional dropout, such as CutOut, MixUp, and CutMix, and on policy-based selection, such as AutoAugment, have demonstrated state-of-the-art (SOTA) results. As more data augmentation algorithms are proposed, the focus remains on optimizing the input-output mapping, overlooking the possibly untapped value in transformed images that share the same label. We hypothesize that by forcing the representations of two transformations to agree, we can further reduce the model generalization error. We call our proposed method Agreement Maximization, or simply AgMax. With this simple constraint applied during training, empirical results show that data augmentation algorithms can further improve the classification accuracy of ResNet50 on ImageNet by up to 1.5%, WideResNet40-2 on CIFAR10 by up to 0.7%, WideResNet40-2 on CIFAR100 by up to 1.6%, and LeNet5 on the Speech Commands Dataset by up to 1.4%. Experimental results further show that, unlike other regularization terms such as label smoothing, AgMax can take advantage of data augmentation to consistently improve model generalization by a significant margin. On downstream tasks such as object detection and segmentation on PascalVOC and COCO, AgMax pre-trained models outperform other data augmentation methods by as much as 1.0 mAP (box) and 0.5 mAP (mask). Code is available at https://github.com/roatienza/agmax.
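To make the hypothesis concrete, the following is a minimal sketch of what an agreement constraint on two augmented views could look like. The abstract does not state how agreement is measured, so this sketch assumes a symmetric KL divergence between the two predicted class distributions as a stand-in. The function names, the `weight` hyperparameter, and the example logits are all hypothetical and are not taken from the AgMax repository.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def agreement_loss(logits_a, logits_b, label, weight=1.0):
    """Supervised loss on both views plus a penalty when the views disagree.

    Assumption: symmetric KL is used here as the agreement measure, and
    `weight` is a hypothetical hyperparameter balancing the two terms.
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    supervised = cross_entropy(p_a, label) + cross_entropy(p_b, label)
    disagreement = 0.5 * (kl_divergence(p_a, p_b) + kl_divergence(p_b, p_a))
    return supervised + weight * disagreement


# Made-up logits for two differently augmented views of one image with label 0.
view_a = [2.0, 0.5, -1.0]
view_b = [1.2, 1.0, -0.5]
print(agreement_loss(view_a, view_b, label=0))
print(agreement_loss(view_a, view_a, label=0))  # identical views: no penalty
```

In this sketch, both views share the same label, so each contributes an ordinary classification term, and the extra term only vanishes when the two predicted distributions coincide. Minimizing it therefore pushes the model toward the same output for both transformations, which is the intuition the abstract describes.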

updated: Wed Oct 20 2021 12:44:52 GMT+0000 (UTC)

published: Wed Oct 20 2021 12:44:52 GMT+0000 (UTC)

arXiv
