Leaping Into Memories: Space-Time Deep Feature Synthesis

Alexandros Stergiou; Nikos Deligiannis

The success of deep learning models has led to their adaptation and adoption by prominent video understanding methods. The majority of these approaches encode features in a joint space-time modality for which the inner workings and learned representations are difficult to visually interpret. We propose LEArned Preconscious Synthesis (LEAPS), an architecture-agnostic method for synthesizing videos from the internal spatiotemporal representations of models. Using a stimulus video and a target class, we prime a fixed space-time model and iteratively optimize a video initialized with random noise. We incorporate additional regularizers to improve the feature diversity of the synthesized videos as well as the cross-frame temporal coherence of motions. We quantitatively and qualitatively evaluate the applicability of LEAPS by inverting a range of spatiotemporal convolutional and attention-based architectures trained on Kinetics-400, which to the best of our knowledge has not been previously accomplished.
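The optimization loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "model" is a hypothetical frame-averaging linear classifier, the function names and hyperparameters (`synthesize`, `lam`, `lr`) are assumptions made for illustration, and the stimulus priming and feature-diversity regularizer are left out. It shows only a noise-initialized video being iteratively updated toward a target class under a fixed model, with a cross-frame temporal coherence penalty.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(video, weights):
    # Toy fixed "model" (an assumption): average the frames over time,
    # then apply a linear classifier with one weight row per class.
    n_frames = len(video)
    n_pixels = len(video[0])
    mean_frame = [sum(f[i] for f in video) / n_frames for i in range(n_pixels)]
    return [sum(w * p for w, p in zip(row, mean_frame)) for row in weights]


def temporal_penalty(video):
    # Sum of squared differences between consecutive frames.
    return sum(
        (b - a) ** 2
        for f0, f1 in zip(video, video[1:])
        for a, b in zip(f0, f1)
    )


def synthesize(weights, target, n_frames=4, steps=200, lr=0.5, lam=0.1, seed=0):
    rng = random.Random(seed)
    n_pixels = len(weights[0])
    # The video starts as random noise; the model weights are never updated.
    video = [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)] for _ in range(n_frames)]
    for _ in range(steps):
        probs = softmax(logits_for(video, weights))
        # Gradient of cross-entropy w.r.t. each logit: p_c - 1[c == target].
        dlogits = [p - (1.0 if c == target else 0.0) for c, p in enumerate(probs)]
        # Back-propagate through the linear classifier to the mean frame.
        dmean = [
            sum(dlogits[c] * weights[c][i] for c in range(len(weights)))
            for i in range(n_pixels)
        ]
        updated = []
        for t in range(n_frames):
            frame = []
            for i in range(n_pixels):
                # Each frame contributes 1/n_frames to the mean frame.
                grad = dmean[i] / n_frames
                # Gradient of the temporal penalty from both neighbours.
                if t > 0:
                    grad += 2.0 * lam * (video[t][i] - video[t - 1][i])
                if t < n_frames - 1:
                    grad += 2.0 * lam * (video[t][i] - video[t + 1][i])
                frame.append(video[t][i] - lr * grad)
            updated.append(frame)
        video = updated
    return video


# Hypothetical 3-class model over 4-pixel frames.
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
video = synthesize(weights, target=1)
probs = softmax(logits_for(video, weights))
```

In a realistic setting the linear classifier would be replaced by a pretrained spatiotemporal network and the hand-written gradients by automatic differentiation; the frame-averaging model is used here only so that the sketch runs with the standard library alone.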

updated: Wed Mar 29 2023 06:14:47 GMT+0000 (UTC)

published: Fri Mar 17 2023 12:55:22 GMT+0000 (UTC)

arXiv
