Comparing Correspondences: Video Prediction with Correspondence-wise Losses

Daniel Geng; Andrew Owens

Today's image prediction methods struggle to change the locations of objects in a scene, producing blurry images that average over the many positions they might occupy. In this paper, we propose a simple change to existing image similarity metrics that makes them more robust to positional errors: we match the images using optical flow, then measure the visual similarity of corresponding pixels. This change leads to crisper and more perceptually accurate predictions, and can be used with any image prediction network. We apply our method to predicting future frames of a video, where it obtains strong performance with simple, off-the-shelf architectures.
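The idea in the abstract, matching the two images with optical flow and then comparing corresponding pixels, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes grayscale images stored as lists of rows, a flow field that is already given (in practice it would come from some optical flow estimator), an L1 distance as the base similarity metric, and a particular warping direction (each predicted pixel is compared with the target pixel the flow points to). The function names `bilinear_sample`, `pixelwise_l1`, and `correspondence_wise_l1` are hypothetical.

```python
def bilinear_sample(img, x, y):
    """Sample a grayscale image at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def pixelwise_l1(pred, target):
    """Ordinary L1 loss: compare pixels at identical coordinates."""
    h, w = len(pred), len(pred[0])
    return sum(abs(pred[y][x] - target[y][x])
               for y in range(h) for x in range(w)) / (h * w)


def correspondence_wise_l1(pred, target, flow):
    """L1 loss between each predicted pixel and its flow-matched target pixel.

    flow[y][x] = (dx, dy) is assumed to map pixel (x, y) of the prediction
    to location (x + dx, y + dy) in the target (an assumption of this sketch).
    """
    h, w = len(pred), len(pred[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            total += abs(pred[y][x] - bilinear_sample(target, x + dx, y + dy))
    return total / (h * w)


# Toy example: the prediction places a bright pixel one column too far left.
pred = [[0, 0, 1, 0, 0]]
target = [[0, 0, 0, 1, 0]]
flow = [[(1, 0)] * 5]  # hypothetical flow: every pixel matches one column to the right

print(pixelwise_l1(pred, target))                  # 0.4
print(correspondence_wise_l1(pred, target, flow))  # 0.0
```

In this toy case the ordinary pixel-wise loss penalizes the one-pixel positional error at two locations, while the correspondence-wise version sees identical appearance once the pixels are matched. That is the sense in which such a loss is more robust to positional errors.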

updated: Mon Apr 19 2021 17:59:29 GMT+0000 (UTC)

published: Mon Apr 19 2021 17:59:29 GMT+0000 (UTC)

arXiv
