Accelerating the Training of Video Super-Resolution

Lijian Lin; Xintao Wang; Zhongang Qi; Ying Shan

Although convolutional neural networks (CNNs) have recently demonstrated high-quality reconstruction for video super-resolution (VSR), efficiently training competitive VSR models remains a challenging problem. Training usually takes an order of magnitude longer than training the counterpart image models, leading to long research cycles. Existing VSR methods typically train models with fixed spatial and temporal sizes from beginning to end. The fixed sizes are usually set to large values for good performance, resulting in slow training. However, is such a rigid training strategy necessary for VSR? In this work, we show that it is possible to gradually train video models from small to large spatial/temporal sizes, i.e., in an easy-to-hard manner. In particular, the whole training is divided into several stages, and earlier stages use smaller spatial sizes. Inside each stage, the temporal size varies from short to long while the spatial size remains unchanged. Such a multigrid training strategy accelerates training because most of the computation is performed on smaller spatial and shorter temporal shapes. For further acceleration with GPU parallelization, we also investigate large-minibatch training without loss of accuracy. Extensive experiments demonstrate that our method substantially speeds up training (up to 6.2× speedup in wall-clock training time) without a performance drop for various VSR models. The code is available at https://github.com/TencentARC/Efficient-VSR-Training.
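The staged schedule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the concrete patch sizes and clip lengths, the even split of iterations across sub-stages, and the cost model (cost per sample proportional to height × width × frames at a fixed batch size) are all assumptions made for illustration.

```python
from typing import List, Tuple

# (start_iteration, spatial_size, temporal_size) for one sub-stage
SubStage = Tuple[int, int, int]


def multigrid_schedule(
    total_iters: int,
    spatial_sizes: List[int],
    temporal_sizes: List[int],
) -> List[SubStage]:
    """Build a small-to-large schedule.

    Outer loop: stages with increasing spatial size.
    Inner loop: within a stage, temporal size goes from short to long
    while the spatial size stays fixed.
    Iterations are split evenly across sub-stages (an assumption).
    """
    n_sub = len(spatial_sizes) * len(temporal_sizes)
    iters_per_sub = total_iters // n_sub
    schedule: List[SubStage] = []
    start = 0
    for s in sorted(spatial_sizes):
        for t in sorted(temporal_sizes):
            schedule.append((start, s, t))
            start += iters_per_sub
    return schedule


def relative_cost(
    schedule: List[SubStage],
    total_iters: int,
    full_spatial: int,
    full_temporal: int,
) -> float:
    """Estimated compute relative to fixed-size training.

    Assumes cost per sample is proportional to H * W * T and that the
    batch size is the same in every sub-stage (both are assumptions).
    """
    full = full_spatial * full_spatial * full_temporal
    cost = 0
    for i, (start, s, t) in enumerate(schedule):
        # The last sub-stage runs until the end of training.
        end = schedule[i + 1][0] if i + 1 < len(schedule) else total_iters
        cost += (end - start) * s * s * t
    return cost / (total_iters * full)


# Hypothetical settings, chosen only for illustration.
schedule = multigrid_schedule(9000, [32, 48, 64], [5, 10, 15])
ratio = relative_cost(schedule, 9000, full_spatial=64, full_temporal=15)
for start, s, t in schedule:
    print(f"from iter {start}: {s}x{s} patches, {t} frames")
print(f"estimated relative compute: {ratio:.2f}")
```

With these hypothetical settings the estimate is about 0.40 of the fixed-size compute. This figure comes only from the simple cost model above and should not be read as an explanation of the reported 6.2× wall-clock speedup, which also involves large-minibatch training.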

updated: Tue May 10 2022 17:55:24 GMT+0000 (UTC)

published: Tue May 10 2022 17:55:24 GMT+0000 (UTC)

arXiv
