Video Object Segmentation with Episodic Graph Memory Networks

Xiankai Lu; Wenguan Wang; Martin Danelljan; Tianfei Zhou; Jianbing Shen; Luc Van Gool

How to make a segmentation model efficiently adapt to a specific video and to online variations in target appearance is a fundamental issue in video object segmentation. In this work, we develop a graph memory network to address the novel idea of "learning to update the segmentation model". Specifically, we exploit an episodic memory network, organized as a fully connected graph, to store frames as nodes and capture cross-frame correlations through edges. Further, learnable controllers are embedded to ease memory reading and writing and to maintain a fixed memory scale. The structured, external memory design enables our model to comprehensively mine and quickly store new knowledge, even with limited visual information, while the differentiable memory controllers slowly learn, via gradient descent, an abstract method for storing useful representations in the memory and for later using these representations for prediction. In addition, the proposed graph memory network yields a neat yet principled framework that generalizes well to both one-shot and zero-shot video object segmentation tasks. Extensive experiments on four challenging benchmark datasets verify that our graph memory network facilitates the adaptation of the segmentation network for case-by-case video object segmentation.
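The abstract gives no implementation details, so the following is only a toy sketch of the general idea: frames stored as nodes of a fully connected graph, edges weighted by cross-frame similarity, and read/write operations that keep the memory size fixed. Everything here is an assumption made for illustration: the names `ToyGraphMemory`, `read`, `write`, and `gate` are hypothetical, frame features are plain lists of floats, edge weights are a softmax over dot products, and a fixed scalar gate stands in for the learnable controllers that the paper trains by gradient descent.

```python
import math


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyGraphMemory:
    """Fixed-size memory whose slots are nodes of a fully connected graph.

    Hypothetical illustration only; not the authors' architecture.
    """

    def __init__(self, frame_features):
        # Each node stores one frame embedding.
        self.nodes = [list(f) for f in frame_features]

    def read(self, query):
        # Attention read: weight every node by its similarity to the query
        # and return the weighted average of the node features.
        weights = softmax([dot(query, n) for n in self.nodes])
        return [
            sum(w * n[d] for w, n in zip(weights, self.nodes))
            for d in range(len(query))
        ]

    def write(self, gate=0.5):
        # One message-passing step over all edges. The number of nodes
        # never changes, so the memory scale stays fixed.
        updated = []
        for i, node in enumerate(self.nodes):
            others = [n for j, n in enumerate(self.nodes) if j != i]
            if not others:
                updated.append(list(node))
                continue
            # Edge strengths from this node to every other node.
            weights = softmax([dot(node, o) for o in others])
            message = [
                sum(w * o[d] for w, o in zip(weights, others))
                for d in range(len(node))
            ]
            # The scalar gate is a stand-in for a learned controller.
            updated.append(
                [(1 - gate) * x + gate * m for x, m in zip(node, message)]
            )
        self.nodes = updated


memory = ToyGraphMemory([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
summary = memory.read([1.0, 0.0])  # memory summary for a query frame
memory.write()                     # refine nodes using their neighbours
```

In this sketch, `read` plays the role of retrieving memory content for the frame being segmented, and `write` plays the role of updating stored frames from their neighbours. A learned system would replace the dot-product similarity and the fixed gate with trainable modules.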

updated: Wed Dec 09 2020 09:58:23 GMT+0000 (UTC)

published: Tue Jul 14 2020 13:19:19 GMT+0000 (UTC)

arXiv
