arXiv reaDer
SYNAuG: Exploiting Synthetic Data for Data Imbalance Problems
We live in an era of data floods, and deep neural networks play a pivotal role in this moment. Natural data inherently exhibits several challenges such as long-tailed distribution and model fairness, where data imbalance is at the center of fundamental issues. This imbalance poses a risk of deep neural networks producing biased predictions, leading to potentially severe ethical and social problems. To address these problems, we leverage the recent generative models advanced in generating high-quality images. In this work, we propose SYNAuG, which utilizes synthetic data to uniformize the given imbalance distribution followed by a simple post-calibration step considering the domain gap between real and synthetic data. This straightforward approach yields impressive performance on datasets for distinctive data imbalance problems such as CIFAR100-LT, ImageNet100-LT, UTKFace, and Waterbirds, surpassing the performance of existing task-specific methods. While we do not claim that our approach serves as a complete solution to the problem of data imbalance, we argue that supplementing the existing data with synthetic data proves to be an effective and crucial step in addressing data imbalance concerns.
updated: Mon Sep 11 2023 05:06:38 GMT+0000 (UTC)
published: Wed Aug 02 2023 07:59:25 GMT+0000 (UTC)
参考文献 (このサイトで利用可能なもの) / References (only if available on this site)
被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)
Amazon.co.jpアソシエイト