ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: Semi-Supervised Video Object Segmentation

Jiahao Li; Yuanyou Xu; Zongxin Yang; Yi Yang; Yueting Zhuang

The Associating Objects with Transformers (AOT) framework has exhibited exceptional performance in a wide range of complex scenarios for video object segmentation. In this study, we introduce MSDeAOT, a variant of the AOT series that incorporates transformers at multiple feature scales. Leveraging the hierarchical Gated Propagation Module (GPM), MSDeAOT efficiently propagates object masks from previous frames to the current frame using a feature scale with a stride of 16. Additionally, we employ GPM at a finer feature scale with a stride of 8, which improves accuracy in detecting and tracking small objects. By applying test-time augmentation and model ensembling, we achieve the top-ranking position in the EPIC-KITCHEN VISOR Semi-supervised Video Object Segmentation Challenge.
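The abstract does not say which test-time augmentations or ensembling rule were used. As a minimal sketch only, the Python below assumes a horizontal-flip augmentation and simple probability averaging across models. The names `predict_with_flip_tta` and `ensemble_predict`, and the model interface (a callable mapping an (H, W, 3) frame to (K, H, W) per-object probabilities), are hypothetical and are not taken from the MSDeAOT code.

```python
import numpy as np


def predict_with_flip_tta(model, frame):
    """Average probabilities over the original and a horizontally flipped frame.

    `model` is a hypothetical callable: (H, W, 3) frame -> (K, H, W) probabilities.
    Horizontal flipping is an assumed augmentation, not one named in the abstract.
    """
    probs = model(frame)
    # Flip the width axis of the input, then flip the output back so the
    # two predictions are aligned pixel by pixel before averaging.
    probs_flipped = model(frame[:, ::-1, :])[:, :, ::-1]
    return 0.5 * (probs + probs_flipped)


def ensemble_predict(models, frame):
    """Average the TTA probabilities of several models, then take the argmax.

    Returns an (H, W) array of object indices (0 is assumed to be background).
    """
    averaged = np.mean([predict_with_flip_tta(m, frame) for m in models], axis=0)
    return averaged.argmax(axis=0)
```

Averaging probabilities before the argmax, rather than voting on hard masks, is a design choice made here for simplicity; it lets a confident model outweigh an uncertain one at each pixel.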

updated: Mon Jul 10 2023 09:20:29 GMT+0000 (UTC)

published: Wed Jul 05 2023 03:43:15 GMT+0000 (UTC)

arXiv
