Track Anything: Segment Anything Meets Videos

Jinyu Yang; Mingqi Gao; Zhe Li; Shang Gao; Fangjing Wang; Feng Zheng

Recently, the Segment Anything Model (SAM) has rapidly gained much attention due to its impressive segmentation performance on images. Despite its strong image segmentation ability and high interactivity with different prompts, we found that it performs poorly on consistent segmentation in videos. Therefore, in this report, we propose the Track Anything Model (TAM), which achieves high-performance interactive tracking and segmentation in videos. Specifically, given a video sequence, with very little human participation (i.e., a few clicks), users can track anything they are interested in and get satisfactory results in one-pass inference. Without additional training, this interactive design performs impressively on video object tracking and segmentation. All resources are available at https://github.com/gaomingqi/Track-Anything. We hope this work can facilitate related research.

updated: Fri Apr 28 2023 03:21:27 GMT+0000 (UTC)

published: Mon Apr 24 2023 10:04:06 GMT+0000 (UTC)

arXiv
