Aligning Correlation Information for Domain Adaptation in Action Recognition

Yuecong Xu; Jianfei Yang; Haozhi Cao; Kezhi Mao; Jianxiong Yin; Simon See


Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, research on video DA remains limited. This is partly due to the complexity of adapting the different modalities of features in videos, which include the correlation features extracted as long-term dependencies of pixels across spatiotemporal dimensions. Correlation features are highly associated with action classes and have proven effective for accurate video feature extraction in the supervised action recognition task. Yet correlation features of the same action differ across domains due to domain shift. We therefore propose a novel Adversarial Correlation Adaptation Network (ACAN) that aligns action videos by aligning pixel correlations. ACAN aims to minimize the discrepancy between the distributions of correlation information, termed Pixel Correlation Discrepancy (PCD). Video DA research is also limited by the lack of cross-domain video datasets with larger domain shifts. We therefore introduce a novel HMDB-ARID dataset with a larger domain shift caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN on both the existing and the new video DA datasets.
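To make the idea of correlation features concrete, the following is a minimal sketch and not the ACAN implementation. It assumes each video is already a flattened list of spatiotemporal positions with a small channel vector per position, computes softmax-normalized dot-product affinities between all pairs of positions, and compares the pooled correlation features of two domains. The abstract does not define PCD, and ACAN aligns correlations adversarially; the squared difference of domain means used here is a swapped-in stand-in chosen for simplicity. All function names (`pixel_correlation`, `correlation_features`, `correlation_discrepancy`) are hypothetical.

```python
import math

def pixel_correlation(video):
    """Pairwise affinities between positions of one video.

    `video` is a list of N positions (flattened T*H*W), each a list of C
    channel values. Returns an N x N matrix whose rows sum to 1.
    Assumption: dot-product similarity with a softmax over each row.
    """
    n = len(video)
    corr = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(video[i], video[j])) for j in range(n)]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        corr.append([e / total for e in exps])
    return corr

def correlation_features(video):
    """Each position becomes the affinity-weighted sum of all positions."""
    corr = pixel_correlation(video)
    n, c = len(video), len(video[0])
    return [[sum(corr[i][j] * video[j][k] for j in range(n)) for k in range(c)]
            for i in range(n)]

def domain_mean(videos):
    """Average-pool correlation features per video, then average over videos."""
    pooled = []
    for video in videos:
        feats = correlation_features(video)
        c = len(feats[0])
        pooled.append([sum(f[k] for f in feats) / len(feats) for k in range(c)])
    c = len(pooled[0])
    return [sum(p[k] for p in pooled) / len(pooled) for k in range(c)]

def correlation_discrepancy(source_videos, target_videos):
    """Squared Euclidean distance between the two domain means.

    Stand-in measure only; it is not the paper's PCD or its adversarial loss.
    """
    mean_s = domain_mean(source_videos)
    mean_t = domain_mean(target_videos)
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

# Toy data: one two-position video per domain; the target is a dimmer copy.
source = [[[1.0, 0.0], [0.0, 1.0]]]
target = [[[0.1, 0.0], [0.0, 0.1]]]
gap = correlation_discrepancy(source, target)
```

In this toy setup the target video is a scaled-down copy of the source, loosely mimicking a dark-video domain, so `gap` is positive, while comparing a domain with itself gives zero. A trainable model would minimize such a gap, or an adversarial surrogate of it, alongside the classification loss.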

updated: Sun Jul 11 2021 00:13:36 GMT+0000 (UTC)

published: Sun Jul 11 2021 00:13:36 GMT+0000 (UTC)

arXiv
