Comparing Correspondences: Video Prediction with Correspondence-wise Losses

Daniel Geng; Max Hamilton; Andrew Owens

Image prediction methods often struggle on tasks that require changing the positions of objects, such as video prediction, producing blurry images that average over the many positions that objects might occupy. In this paper, we propose a simple change to existing image similarity metrics that makes them more robust to positional errors: we match the images using optical flow, then measure the visual similarity of corresponding pixels. This change leads to crisper and more perceptually accurate predictions, and does not require modifications to the image prediction network. We apply our method to a variety of video prediction tasks, where it obtains strong performance with simple network architectures, and to the closely related task of video interpolation. Code and results are available at our webpage: https://dangeng.github.io/CorrWiseLosses
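The core idea in the abstract (match the images with optical flow, then compare corresponding pixels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `warp_with_flow`, `correspondence_wise_l1`, and `pixelwise_l1` are hypothetical, the flow field is assumed to be given rather than estimated by a flow network, the lookup is nearest-neighbor instead of a differentiable sampler, and L1 stands in for whichever base similarity metric is being wrapped.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Look up image at (x + u, y + v) for every pixel (x, y).

    flow[..., 0] holds horizontal offsets u, flow[..., 1] vertical offsets v.
    Nearest-neighbor lookup with border clamping (an assumption made for
    simplicity; a trainable loss would need differentiable sampling).
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return image[src_y, src_x]

def pixelwise_l1(pred, target):
    """Ordinary L1 loss: compares pixels at identical coordinates."""
    return float(np.mean(np.abs(pred - target)))

def correspondence_wise_l1(pred, target, flow):
    """L1 loss between each predicted pixel and its flow-matched target pixel."""
    aligned_target = warp_with_flow(target, flow)
    return float(np.mean(np.abs(pred - aligned_target)))

# Toy example: the prediction places a square one pixel too far to the right.
target = np.zeros((8, 8))
target[2:4, 2:4] = 1.0
pred = np.zeros((8, 8))
pred[2:4, 3:5] = 1.0

# Assumed flow from prediction to target: every pixel's match lies one pixel left.
flow = np.zeros((8, 8, 2))
flow[..., 0] = -1.0

plain = pixelwise_l1(pred, target)
matched = correspondence_wise_l1(pred, target, flow)
```

In this toy case `plain` is nonzero because the square is misplaced by one pixel, while `matched` is zero because the matched pixels have identical appearance. That is the sense in which a correspondence-wise comparison is more robust to positional errors than a pixel-wise one.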

updated: Thu Mar 31 2022 18:10:51 GMT+0000 (UTC)

published: Mon Apr 19 2021 17:59:29 GMT+0000 (UTC)

arXiv
