PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization

Prithvijit Chattopadhyay; Kartik Sarangmath; Vivek Vijaykumar; Judy Hoffman

Synthetic data offers the promise of cheap and bountiful training data for settings where labeled real-world data is scarce. However, models trained on synthetic data significantly underperform when evaluated on real-world data. In this paper, we propose Proportional Amplitude Spectrum Training Augmentation (PASTA), a simple and effective augmentation strategy to improve out-of-the-box synthetic-to-real (syn-to-real) generalization performance. PASTA perturbs the amplitude spectra of synthetic images in the Fourier domain to generate augmented views. Specifically, with PASTA we propose a structured perturbation strategy in which high-frequency components are perturbed relatively more than low-frequency ones. Across a total of 5 syn-to-real shifts spanning the tasks of semantic segmentation (GTAV-to-Real), object detection (Sim10K-to-Real), and object recognition (VisDA-C Syn-to-Real), we find that PASTA outperforms more complex state-of-the-art generalization methods while also being complementary to them.
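The abstract describes the augmentation only at a high level, so the following is a minimal sketch of one way such a frequency-proportional amplitude perturbation could look, not the authors' implementation. The function names `frequency_sigma` and `pasta_like_augment`, the parameters `alpha`, `k`, and `beta`, their default values, and the specific rule (multiplicative Gaussian noise whose standard deviation grows polynomially with normalized spatial frequency) are all assumptions made for illustration. The sketch uses NumPy's real-input FFT so that the inverse transform is guaranteed to return a real image.

```python
import numpy as np


def frequency_sigma(h, w, alpha, k, beta):
    """Noise scale per frequency bin for an (h, w) image (hypothetical rule).

    Returns an array of shape (h, w // 2 + 1), matching np.fft.rfft2 output.
    The scale is `beta` at the DC bin and grows with spatial frequency.
    """
    fy = np.fft.fftfreq(h)[:, None]    # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.rfftfreq(w)[None, :]   # horizontal frequencies in [0, 0.5]
    # Radial frequency, normalized so the largest possible value is 1.
    radius = np.sqrt(fx**2 + fy**2) / np.sqrt(0.5)
    return (alpha * radius) ** k + beta


def pasta_like_augment(image, alpha=3.0, k=2.0, beta=0.25, rng=None):
    """Perturb the amplitude spectrum of an (H, W) or (H, W, C) float image.

    Assumed scheme: each amplitude is multiplied by noise drawn from
    N(1, sigma^2), where sigma is larger for higher frequencies.
    The phase spectrum is left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape[:2]

    # Forward FFT over the two spatial axes only.
    spectrum = np.fft.rfft2(image, axes=(0, 1))
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    sigma = frequency_sigma(h, w, alpha, k, beta)
    if image.ndim == 3:
        sigma = sigma[..., None]       # broadcast over channels

    # Multiplicative noise centered at 1, stronger at high frequencies.
    eps = rng.normal(loc=1.0, scale=sigma, size=amplitude.shape)
    perturbed = (eps * amplitude) * np.exp(1j * phase)

    # Inverse real FFT returns a real-valued image of the original size.
    return np.fft.irfft2(perturbed, s=(h, w), axes=(0, 1))
```

In a training pipeline, a function like this would typically be applied to each synthetic image before the usual normalization, with the output clipped back to the valid pixel range if needed. Setting `alpha=0` and `beta=0` in this sketch disables the noise and returns the input unchanged, which is a convenient sanity check.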

Updated: 2023-09-22 19:38:35 UTC

Published: 2022-12-02 05:18:54 UTC

arXiv
