Diagnosing and Preventing Instabilities in Recurrent Video Processing

Thomas Tanay; Aivar Sootla; Matteo Maggioni; Puneet K. Dokania; Philip Torr; Ales Leonardis; Gregory Slabaugh

Recurrent models are a popular choice for video enhancement tasks such as video denoising or super-resolution. In this work, we focus on their stability as dynamical systems and show that they tend to fail catastrophically at inference time on long video sequences. To address this issue, we (1) introduce a diagnostic tool which produces input sequences optimized to trigger instabilities and that can be interpreted as visualizations of temporal receptive fields, and (2) propose two approaches to enforce the stability of a model during training: constraining the spectral norm or constraining the stable rank of its convolutional layers. We then introduce Stable Rank Normalization for Convolutional layers (SRN-C), a new algorithm that enforces these constraints. Our experimental results suggest that SRN-C successfully enforces stability in recurrent video processing models without a significant performance loss.
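The two quantities named in the abstract can be illustrated on a toy problem. The sketch below is not SRN-C and does not handle convolutional layers: it assumes a plain dense matrix, estimates its spectral norm (largest singular value) by power iteration, computes the stable rank using the standard definition (squared Frobenius norm divided by squared spectral norm), and then runs a hypothetical linear recurrence h <- W h + x to show how the state norm behaves when the spectral norm is above or below 1. All function names, the matrix size, and the scaling factors are illustrative choices, not taken from the paper.

```python
import numpy as np


def spectral_norm(w, n_iter=200, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = w @ v
        u /= np.linalg.norm(u)
        v = w.T @ u
        v /= np.linalg.norm(v)
    return float(np.linalg.norm(w @ v))


def stable_rank(w):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    return float(np.linalg.norm(w, "fro") ** 2 / spectral_norm(w) ** 2)


def run_linear_recurrence(w, steps, seed=1):
    """Iterate h <- w @ h + x with small random inputs x; return the final state norm."""
    rng = np.random.default_rng(seed)
    h = np.zeros(w.shape[0])
    for _ in range(steps):
        h = w @ h + 0.01 * rng.standard_normal(w.shape[0])
    return float(np.linalg.norm(h))


# Toy setup (assumption): a symmetric matrix, so that the spectral norm equals
# the spectral radius and directly governs the growth of the linear recurrence.
rng = np.random.default_rng(42)
a = rng.standard_normal((16, 16))
w = a + a.T

sigma = spectral_norm(w)
w_stable = 0.9 * w / sigma    # spectral norm scaled to about 0.9: contraction
w_unstable = 1.1 * w / sigma  # spectral norm scaled to about 1.1: state can grow

print("spectral norm:", sigma)
print("stable rank:", stable_rank(w))
print("final state norm, stable:", run_linear_recurrence(w_stable, 500))
print("final state norm, unstable:", run_linear_recurrence(w_unstable, 500))
```

In this linear toy, a spectral norm below 1 is a sufficient condition for the state to stay bounded, which is the intuition behind constraining it during training. The stable rank always lies between 1 and the rank of the matrix, so bounding it is a softer constraint than fixing the rank itself.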

updated: Sat Mar 11 2023 16:59:38 GMT+0000 (UTC)

published: Sat Oct 10 2020 21:39:28 GMT+0000 (UTC)

arXiv
