Improve Video Representation with Temporal Adversarial Augmentation

Jinhao Duan; Quanfu Fan; Hao Cheng; Xiaoshuang Shi; Kaidi Xu

Recent work shows that adversarial augmentation can improve the generalization of neural networks (NNs) when used appropriately. In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention. Unlike conventional adversarial augmentation, TA is specifically designed to shift the attention distributions of neural networks with respect to video clips by maximizing a temporal-related loss function. We demonstrate that TA produces diverse temporal views that significantly affect the focus of neural networks. Training with these examples remedies the flaw of unbalanced temporal information perception and enhances the ability to defend against temporal shifts, ultimately leading to better generalization. To leverage TA, we propose the Temporal Video Adversarial Fine-tuning (TAF) framework for improving video representations. TAF is a model-agnostic, generic, and interpretability-friendly training strategy. We evaluate TAF with four powerful models (TSM, GST, TAM, and TPN) on three challenging temporal-related benchmarks (Something-Something V1 and V2, and Diving48). Experimental results demonstrate that TAF improves the test accuracy of these models by notable margins without introducing additional parameters or computational costs. As a byproduct, TAF also improves robustness under out-of-distribution (OOD) settings. Code is available at https://github.com/jinhaoduan/TAF.
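
To make the idea of "maximizing a temporal-related loss" concrete, the following is a minimal toy sketch, not the authors' implementation. The abstract does not specify the loss or the attention mechanism, so everything here is an assumption for illustration: attention is a softmax over a linear per-frame score (hypothetical weights `w`), the temporal loss is the negative attention mass on the originally dominant frame, and the perturbation is a sign-gradient ascent kept inside an L-infinity ball of radius `eps`. All function and parameter names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(clip, w):
    """Toy attention over frames: softmax of a linear per-frame score.

    Assumption: a real video model would derive attention from its own
    activations; a linear score is used here only to keep the sketch small.
    """
    return softmax([sum(wi * xi for wi, xi in zip(w, frame)) for frame in clip])


def sign(v):
    return (v > 0) - (v < 0)


def temporal_adversarial_augment(clip, w, eps=0.1, step=0.02, iters=10):
    """Perturb a clip so attention moves away from the dominant frame.

    Assumed loss: L = -attention[k], where k is the frame with the highest
    attention on the clean clip. The loss is maximized by sign-gradient
    ascent, and each value stays within eps of its original.
    """
    attn0 = temporal_attention(clip, w)
    k = max(range(len(clip)), key=attn0.__getitem__)  # dominant frame index
    adv = [list(frame) for frame in clip]
    for _ in range(iters):
        a = temporal_attention(adv, w)
        for t, frame in enumerate(adv):
            # dL/dx[t][d] = -a[k] * (1[t == k] - a[t]) * w[d]
            coef = -a[k] * ((1.0 if t == k else 0.0) - a[t])
            for d in range(len(frame)):
                moved = frame[d] + step * sign(coef * w[d])
                lo, hi = clip[t][d] - eps, clip[t][d] + eps
                frame[d] = min(max(moved, lo), hi)  # project into the eps-ball
    return adv


if __name__ == "__main__":
    clip = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 frames, 2 features each
    w = [2.0, -1.0]
    print("clean attention:    ", temporal_attention(clip, w))
    adv_clip = temporal_adversarial_augment(clip, w)
    print("augmented attention:", temporal_attention(adv_clip, w))
```

In a fine-tuning setup along the lines the abstract describes, clips perturbed this way would be used as additional training examples; how they are mixed with clean clips is not stated in the abstract and is left out of the sketch.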

updated: Fri Apr 28 2023 03:06:37 GMT+0000 (UTC)

published: Fri Apr 28 2023 03:06:37 GMT+0000 (UTC)

arXiv
