Open-Vocabulary Image Segmentation

Golnaz Ghiasi; Xiuye Gu; Yin Cui; Tsung-Yi Lin

We design an open-vocabulary image segmentation model that organizes an image into meaningful regions indicated by arbitrary text. We observe that recent open-vocabulary models cannot localize visual concepts well, even though they recognize what is in an image. We argue that these models miss an important step, visual grouping, which organizes pixels into groups before visual-semantic alignments are learned. We propose OpenSeg to address this issue. First, it learns to propose segmentation masks for possible organizations of the image. Then it learns visual-semantic alignments by aligning each word in a caption to one or a few predicted masks. We find that mask representations are key to supporting learning from captions, which makes it possible to scale up the dataset and vocabulary sizes. Our work is the first to perform zero-shot transfer on holdout segmentation datasets. We set up two strong baselines by applying class activation maps, or fine-tuning with pixel-wise labels, on a pre-trained ALIGN model. OpenSeg outperforms these baselines by 3.4 mIoU on PASCAL-Context (459 classes) and 2.7 mIoU on ADE-20k (847 classes).
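
The idea of aligning each caption word to one or a few predicted masks can be illustrated with a small sketch. This is not the OpenSeg implementation: the abstract does not specify the loss or the architecture, so the cosine similarity, the softmax over masks, the temperature value, and every function name below are assumptions chosen for illustration only. The sketch takes one embedding per word and one embedding per mask, computes a soft assignment of each word over the masks, and averages the weighted similarities into a single caption-image score.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def align_words_to_masks(word_embs, mask_embs, temperature=0.1):
    """Return, for each word, soft assignment weights over the masks.

    A low temperature (an assumed value) concentrates each word's weight
    on one or a few masks.
    """
    return [
        softmax([cosine(w, m) for m in mask_embs], temperature)
        for w in word_embs
    ]


def caption_image_score(word_embs, mask_embs, temperature=0.1):
    """Average over words of the assignment-weighted word-mask similarity."""
    total = 0.0
    for w in word_embs:
        sims = [cosine(w, m) for m in mask_embs]
        weights = softmax(sims, temperature)
        total += sum(a * s for a, s in zip(weights, sims))
    return total / len(word_embs)


# Toy, hand-made embeddings (hypothetical): two words, three mask proposals.
word_embs = [[1.0, 0.0], [0.0, 1.0]]
mask_embs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]]

weights = align_words_to_masks(word_embs, mask_embs)
# Index of the highest-weighted mask for each word.
best_mask = [max(range(len(row)), key=row.__getitem__) for row in weights]
score = caption_image_score(word_embs, mask_embs)
```

In a training setup, a score of this kind could be contrasted across matching and non-matching image-caption pairs, so that captions alone supervise which mask each word refers to. That use is likewise an assumption and is not taken from the abstract.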

updated: Wed Dec 22 2021 18:57:54 GMT+0000 (UTC)

published: Wed Dec 22 2021 18:57:54 GMT+0000 (UTC)

arXiv
