Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

Jinpeng Wang; Yuting Gao; Ke Li; Yiqi Lin; Andy J. Ma; Hao Cheng; Pai Peng; Rongrong Ji; Xing Sun

Self-supervised learning has shown great potential for improving the video representation ability of deep neural networks by obtaining supervision from the data itself. However, some current methods tend to cheat by exploiting the background, i.e., the prediction depends heavily on the video background rather than on the motion, which makes the model vulnerable to background changes. To mitigate the model's reliance on the background, we propose to remove the background's impact by adding the background. That is, given a video, we randomly select a static frame and add it to all the other frames to construct a distracting video sample. We then force the model to pull the feature of the distracting video and the feature of the original video closer together, so that the model is explicitly constrained to resist the background influence and to focus more on motion changes. We term our method Background Erasing (BE). It is worth noting that our method is simple and neat to implement and can be added to most SOTA methods with little effort. Specifically, BE brings 16.4% and 19.1% improvements with MoCo on the severely biased datasets UCF101 and HMDB51, and a 14.5% improvement on the less biased dataset Diving48.
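The construction of the distracting video described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the blending weight `lam`, and the use of a convex combination (rather than a plain sum) are assumptions made here for illustration, since the abstract only says that a randomly selected static frame is added to the other frames. The cosine distance is likewise only a stand-in for whatever objective is used to pull the two features together.

```python
import numpy as np


def make_distracting_video(video, lam=0.3, rng=None):
    """Blend one randomly chosen frame into every frame of `video`.

    video: float array of shape (T, H, W, C).
    lam:   hypothetical blending weight in [0, 1); not given in the abstract.
    Returns the distracting video and the index of the chosen static frame.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = int(rng.integers(video.shape[0]))  # index of the static frame
    static = video[k]                      # shape (H, W, C)
    # Convex combination (an assumption) keeps values in the original range.
    distracting = (1.0 - lam) * video + lam * static[None]
    return distracting, k


def cosine_distance(a, b):
    """1 - cosine similarity; a stand-in for the feature-pulling objective."""
    a, b = np.ravel(a), np.ravel(b)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((8, 4, 4, 3))       # toy clip: 8 frames of 4x4 RGB
    distracting, k = make_distracting_video(video, lam=0.3, rng=rng)
    # Frame-to-frame differences are only rescaled by (1 - lam), so the
    # added static frame changes appearance without adding any motion.
    print(np.allclose(np.diff(distracting, axis=0),
                      0.7 * np.diff(video, axis=0)))
```

Under these assumptions, the temporal differences of the distracting clip equal those of the original clip scaled by `1 - lam`, which is why a model that attends to motion rather than background should map both clips to nearby features. In training, `cosine_distance` would be applied to the encoder outputs for the original and distracting clips, not to raw pixels.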

updated: Wed Mar 03 2021 11:52:16 GMT+0000 (UTC)

published: Sat Sep 12 2020 11:25:13 GMT+0000 (UTC)

arXiv

