Matting Anything

Jiachen Li; Jitesh Jain; Humphrey Shi

In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance. MAM offers several significant advantages over previous specialized image matting networks: (i) MAM handles various types of image matting, including semantic, instance, and referring image matting, with only a single model; (ii) MAM leverages the feature maps from the Segment Anything Model (SAM) and adopts a lightweight Mask-to-Matte (M2M) module, which has only 2.7 million trainable parameters, to predict the alpha matte through iterative refinement; (iii) by incorporating SAM, MAM simplifies the user intervention required for interactive image matting, replacing the trimap with a box, point, or text prompt. We evaluate MAM on various image matting benchmarks, and the experimental results demonstrate that MAM achieves performance comparable to state-of-the-art specialized image matting models under different metrics on each benchmark. Overall, MAM shows superior generalization ability and can effectively handle various image matting tasks with fewer parameters, making it a practical solution for unified image matting. Our code and models are open-sourced at https://github.com/SHI-Labs/Matting-Anything.
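For readers new to the term, an alpha matte assigns each pixel an opacity in [0, 1], and an image is modeled by the standard compositing equation I = alpha * F + (1 - alpha) * B. The sketch below is only an illustration of that equation and of how a soft matte differs from a hard segmentation mask. It is not taken from the MAM code base; the helper names `composite` and `binarize` and the 0.5 threshold are assumptions made for this example.

```python
from typing import List

# A grayscale image as rows of floats in [0, 1] (illustrative type alias).
Image = List[List[float]]


def composite(alpha: Image, fg: Image, bg: Image) -> Image:
    """Blend foreground over background per pixel: I = alpha*F + (1 - alpha)*B."""
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(row_a, row_f, row_b)]
        for row_a, row_f, row_b in zip(alpha, fg, bg)
    ]


def binarize(alpha: Image, threshold: float = 0.5) -> Image:
    """Collapse a soft matte into a hard 0/1 mask (hypothetical helper)."""
    return [[1.0 if a >= threshold else 0.0 for a in row] for row in alpha]


# One row of three pixels: background, semi-transparent edge, foreground.
alpha = [[0.0, 0.25, 1.0]]
fg = [[1.0, 1.0, 1.0]]  # white foreground
bg = [[0.0, 0.0, 0.0]]  # black background

soft = composite(alpha, fg, bg)            # keeps the partial opacity at the edge
hard = composite(binarize(alpha), fg, bg)  # the edge pixel snaps to background
```

The middle pixel keeps its fractional opacity in `soft` but is forced to pure background in `hard`, which is the kind of boundary detail (hair, fur, motion blur) that matting aims to preserve and a binary mask cannot.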

updated: Thu Nov 16 2023 23:52:37 GMT+0000 (UTC)

published: Thu Jun 08 2023 17:51:58 GMT+0000 (UTC)

arXiv
