Open-vocabulary Panoptic Segmentation with Embedding Modulation

Xi Chen; Shuang Li; Ser-Nam Lim; Antonio Torralba; Hengshuang Zhao

Open-vocabulary image segmentation is attracting increasing attention due to its critical applications in the real world. Traditional closed-vocabulary segmentation methods cannot characterize novel objects, whereas several recent open-vocabulary attempts obtain unsatisfactory results, i.e., a notable performance reduction on the closed vocabulary and a massive demand for extra data. To this end, we propose OPSNet, an omnipotent and data-efficient framework for Open-vocabulary Panoptic Segmentation. Specifically, the exquisitely designed Embedding Modulation module, together with several meticulously designed components, enables adequate embedding enhancement and information exchange between the segmentation model and the visual-linguistically well-aligned CLIP encoder, resulting in superior segmentation performance under both open- and closed-vocabulary settings with much less need for additional data. Extensive experimental evaluations are conducted across multiple datasets (e.g., COCO, ADE20K, Cityscapes, and PascalContext) under various circumstances, where the proposed OPSNet achieves state-of-the-art results, demonstrating the effectiveness and generality of the proposed approach. The code and trained models will be made publicly available.

updated: Sat Jul 15 2023 11:04:26 GMT+0000 (UTC)

published: Mon Mar 20 2023 17:58:48 GMT+0000 (UTC)

arXiv
