DAVOS: Semi-Supervised Video Object Segmentation via Adversarial Domain Adaptation

Jinshuo Zhang; Zhicheng Wang; Songyan Zhang; Gang Wei

Domain shift is one of the primary issues in video object segmentation (VOS): models degrade when tested on unfamiliar datasets. Recently, many online methods have emerged that narrow the performance gap between training data (source domain) and test data (target domain) by fine-tuning on test-data annotations, which are usually scarce. In this paper, we propose a novel method that tackles domain shift by introducing adversarial domain adaptation to the VOS task for the first time, with supervised training on the source domain and unsupervised training on the target domain. By fusing appearance and motion features with a convolution layer and adding supervision to the motion branch, our model achieves state-of-the-art performance on DAVIS2016, with a mean IoU of 82.6% after supervised training. Meanwhile, our adversarial domain adaptation strategy significantly improves the performance of the trained model when applied to FBMS59 and Youtube-Object, without using extra annotations.
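The abstract reports results as mean IoU (intersection over union between predicted and ground-truth masks, averaged over frames). The following is a minimal sketch of that metric for binary masks, not the official DAVIS evaluation code. The function names `iou` and `mean_iou`, the nested-list mask format, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration.

```python
def iou(pred, gt):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, gts):
    """Average the per-frame IoU over a sequence of mask pairs."""
    scores = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


# Two toy 2x2 frames: the first overlaps on 1 of 2 pixels, the second matches exactly.
preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
gts = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
print(mean_iou(preds, gts))  # prints 0.75
```

In this toy case the per-frame scores are 0.5 and 1.0, so the mean is 0.75. A benchmark figure such as 82.6% would be the same kind of average taken over all evaluated frames and sequences.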

updated: Mon May 24 2021 16:00:12 GMT+0000 (UTC)

published: Fri May 21 2021 08:23:51 GMT+0000 (UTC)

arXiv
