Segmenting Transparent Object in the Wild with Transformer

Enze Xie; Wenjia Wang; Wenhai Wang; Peize Sun; Hang Xu; Ding Liang; Ping Luo

This work presents Trans10K-v2, a new fine-grained transparent object segmentation dataset that extends Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two categories, the new dataset has two appealing benefits. (1) It has 11 fine-grained categories of transparent objects that commonly occur in the human domestic environment, making it more practical for real-world applications. (2) Trans10K-v2 poses more challenges for current advanced segmentation methods than its former version. Furthermore, we propose a novel transformer-based segmentation pipeline termed Trans2Seg. First, the transformer encoder of Trans2Seg provides a global receptive field, in contrast to a CNN's local receptive field, which shows clear advantages over pure CNN architectures. Second, by formulating semantic segmentation as a dictionary look-up problem, we design a set of learnable prototypes as the queries of Trans2Seg's transformer decoder, where each prototype learns the statistics of one category over the whole dataset. We benchmark more than 20 recent semantic segmentation methods and show that Trans2Seg significantly outperforms all the CNN-based methods, indicating the proposed algorithm's potential for solving transparent object segmentation.
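
The dictionary look-up idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Trans2Seg implementation: it assumes that each category prototype is a plain vector, that pixel features have already been produced by some encoder, and that a pixel is labeled by scaled dot-product similarity followed by a per-pixel softmax over categories. The names `class_probabilities` and `segment` are hypothetical, and the prototypes are fixed by hand here rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def class_probabilities(prototypes, features):
    """Look up every pixel feature against every category prototype.

    prototypes: K vectors of length D, one per category (assumed learned elsewhere).
    features:   N vectors of length D, one per flattened pixel (assumed encoder output).
    Returns N lists of K probabilities (softmax over categories for each pixel).
    """
    scale = 1.0 / math.sqrt(len(prototypes[0]))
    return [softmax([dot(p, f) * scale for p in prototypes]) for f in features]


def segment(prototypes, features):
    """Assign each pixel the category whose prototype matches it best."""
    probs = class_probabilities(prototypes, features)
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Toy example: 2 categories, 3 pixels, feature dimension 2 (all values made up).
toy_prototypes = [[1.0, 0.0], [0.0, 1.0]]
toy_features = [[2.0, 0.0], [0.0, 3.0], [1.0, 0.5]]
labels = segment(toy_prototypes, toy_features)
print(labels)
```

In this simplification the prototypes play the role of dictionary keys that are shared across all images, which is the sense in which each one summarizes a category over the whole dataset. A full decoder would also learn the prototypes and refine the resulting attention maps before prediction.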

updated: Thu Jan 21 2021 06:41:00 GMT+0000 (UTC)

published: Thu Jan 21 2021 06:41:00 GMT+0000 (UTC)

arXiv
