Temporally Coherent Person Matting Trained on Fake-Motion Dataset

Ivan Molodetskikh; Mikhail Erofeev; Andrey Moskalenko; Dmitry Vatolin

We propose a novel neural-network-based method for matting videos of people that requires no additional user input such as trimaps. Our architecture achieves temporal stability of the resulting alpha mattes by combining motion-estimation-based smoothing of image-segmentation algorithm outputs with convolutional-LSTM modules on U-Net skip connections. We also propose a fake-motion algorithm that generates training clips for the video-matting network from photos with ground-truth alpha mattes and from background videos. We apply random motion to the photos and their mattes to simulate the movement found in real videos, then composite the result with the background clips. This lets us train a deep neural network operating on videos in the absence of a large annotated video dataset, and it provides ground-truth foreground optical flow for the training clips, for use in loss functions.
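The fake-motion idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale images stored as nested lists, restricts the random motion to integer translations, and uses hypothetical names (`shift`, `fake_motion_clip`, `max_step`). Each frame is composited with the standard matting equation `alpha * foreground + (1 - alpha) * background`, and the per-frame displacement is returned as the (here spatially uniform) ground-truth foreground flow.

```python
import random


def shift(img, dx, dy, fill=0.0):
    """Translate a 2D grid by integer (dx, dy), padding exposed pixels with fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def fake_motion_clip(fg, alpha, bg_frames, max_step=2, seed=0):
    """Build a synthetic clip from one photo, its matte, and background frames.

    Assumption for illustration: the motion is a random walk of integer
    translations. Returns (frames, alphas, flows), where flows[t] is the
    (dx, dy) foreground displacement from the previous frame to frame t
    (for t = 0, from the original photo position).
    """
    rng = random.Random(seed)
    h, w = len(fg), len(fg[0])
    frames, alphas, flows = [], [], []
    dx = dy = 0  # cumulative offset of the foreground
    for bg in bg_frames:
        step = (rng.randint(-max_step, max_step),
                rng.randint(-max_step, max_step))
        dx += step[0]
        dy += step[1]
        # Move the photo and its matte together so they stay aligned.
        a = shift(alpha, dx, dy)
        f = shift(fg, dx, dy)
        # Standard alpha compositing over the current background frame.
        comp = [[a[y][x] * f[y][x] + (1.0 - a[y][x]) * bg[y][x]
                 for x in range(w)] for y in range(h)]
        frames.append(comp)
        alphas.append(a)
        flows.append(step)
    return frames, alphas, flows
```

A more faithful version would presumably use richer motion than pure translation, in which case the flow would vary per pixel, but the structure (move photo and matte together, composite, record the motion) would stay the same.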

updated: Fri Sep 10 2021 12:53:11 GMT+0000 (UTC)

published: Fri Sep 10 2021 12:53:11 GMT+0000 (UTC)

arXiv
