Eliminate Deviation with Deviation for Data Augmentation and a General Multi-modal Data Learning Method

Yunpeng Gong; Liqing Huang; Lifei Chen

データ拡張と一般的なマルチモーダルデータ学習方法のための偏差で偏差を排除します

コンピュータビジョンの課題の1つは、変化する環境での色の偏差に適応する必要があることです。したがって、予測に対する色の偏差の悪影響を最小限に抑えることは、視覚タスクの主な目標の1つです。現在のソリューションは、生成モデルを使用してトレーニングデータを拡張し、入力変動の不変性を強化することに重点を置いています。ただし、このような方法では新しいノイズが発生することが多く、生成されたデータからのゲインが制限されます。この目的のために、この論文は、ランダムカラードロップアウト（RCD）と呼ばれる、偏差を伴う偏差を排除する戦略を提案します。クエリ画像とギャラリー画像の間に色のずれがある場合、色情報を無視すると、いくつかの例の検索結果が良くなるという仮説が立てられています。具体的には、この戦略は、色の逸脱の影響を克服するために、トレーニングデータの部分的な色情報をドロップアウトすることにより、ニューラルネットワークの色の特徴と色に依存しない特徴の間の重みのバランスを取ります。提案されたRCDは、学習戦略を変更することなく、さまざまな既存のReIDモデルと組み合わせることができ、オブジェクト検出などの他のコンピュータービジョン分野に適用できます。いくつかのReIDベースラインと、Market1501、DukeMTMC、MSMT17などの3つの一般的な大規模データセットでの実験により、この方法の有効性が検証されました。クロスドメインテストの実験では、この戦略がドメインギャップを排除するのに重要であることが示されています。さらに、RCDの動作メカニズムを理解するために、分類の観点からこの戦略の有効性を分析しました。これにより、ドメインのバリエーションが強い視覚タスクでは、すべての色情報ではなく、多くの色情報を利用する方がよい場合があります。

One of the challenges of computer vision is that it needs to adapt to color deviations in changeable environments. Therefore, minimizing the adverse effects of color deviation on the prediction is one of the main goals of vision task. Current solutions focus on using generative models to augment training data to enhance the invariance of input variation. However, such methods often introduce new noise, which limits the gain from generated data. To this end, this paper proposes a strategy eliminate deviation with deviation, which is named Random Color Dropout (RCD). Our hypothesis is that if there are color deviation between the query image and the gallery image, the retrieval results of some examples will be better after ignoring the color information. Specifically, this strategy balances the weights between color features and color-independent features in the neural network by dropouting partial color information in the training data, so as to overcome the effect of color devitaion. The proposed RCD can be combined with various existing ReID models without changing the learning strategy, and can be applied to other computer vision fields, such as object detection. Experiments on several ReID baselines and three common large-scale datasets such as Market1501, DukeMTMC, and MSMT17 have verified the effectiveness of this method. Experiments on Cross-domain tests have shown that this strategy is significant eliminating the domain gap. Furthermore, in order to understand the working mechanism of RCD, we analyzed the effectiveness of this strategy from the perspective of classification, which reveals that it may be better to utilize many instead of all of color information in visual tasks with strong domain variations.

updated: Mon Jun 13 2022 14:16:14 GMT+0000 (UTC)

published: Thu Jan 21 2021 10:33:02 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト