Robust High-Resolution Video Matting with Temporal Guidance

Shanchuan Lin; Linjie Yang; Imran Saleemi; Soumyadip Sengupta

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance. Our method is much lighter than previous approaches and can process 4K at 76 FPS and HD at 104 FPS on an NVIDIA GTX 1080Ti GPU. Unlike most existing methods that perform video matting frame-by-frame as independent images, our method uses a recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and matting quality. Furthermore, we propose a novel training strategy that trains our network on both matting and segmentation objectives. This significantly improves our model's robustness. Our method does not require any auxiliary inputs such as a trimap or a pre-captured background image, so it can be widely applied to existing human matting applications.
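The contrast between frame-by-frame processing and a recurrent design can be illustrated with a toy sketch. This is not the paper's network: the learned recurrent unit is replaced here by a simple exponential moving average, each frame is reduced to a single noisy alpha value, and all names (`toy_recurrent_step`, `matte_per_frame`, `matte_recurrent`, `flicker`, `momentum`) are hypothetical. It only shows how state carried from one frame to the next can reduce flicker between consecutive outputs.

```python
from typing import List, Optional, Tuple


def toy_recurrent_step(
    noisy_alpha: float, state: Optional[float], momentum: float = 0.8
) -> Tuple[float, float]:
    """One toy recurrent update (assumption: an exponential moving
    average stands in for a learned recurrent unit)."""
    if state is None:
        new_state = noisy_alpha  # first frame: no history to blend with
    else:
        new_state = momentum * state + (1.0 - momentum) * noisy_alpha
    # The output and the carried state are the same value in this toy model.
    return new_state, new_state


def matte_per_frame(frames: List[float]) -> List[float]:
    """Baseline: every frame is treated independently, no shared state."""
    return list(frames)


def matte_recurrent(frames: List[float], momentum: float = 0.8) -> List[float]:
    """Recurrent variant: the state from frame t-1 is passed to frame t."""
    state: Optional[float] = None
    alphas: List[float] = []
    for frame in frames:
        alpha, state = toy_recurrent_step(frame, state, momentum)
        alphas.append(alpha)
    return alphas


def flicker(alphas: List[float]) -> float:
    """Mean absolute change between consecutive outputs (lower is steadier)."""
    diffs = [abs(b - a) for a, b in zip(alphas, alphas[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical per-frame alpha estimates that oscillate from frame to frame.
frames = [0.9, 0.5, 0.9, 0.5, 0.9, 0.5]
independent = matte_per_frame(frames)
recurrent = matte_recurrent(frames)
```

On this oscillating input, `flicker(recurrent)` comes out lower than `flicker(independent)`, which is the intuition behind temporal coherence. A real recurrent matting network learns what to keep and what to forget rather than applying a fixed blend.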

updated: Wed Aug 25 2021 23:48:15 GMT+0000 (UTC)

published: Wed Aug 25 2021 23:48:15 GMT+0000 (UTC)

arXiv
