Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

Wenqi Ren; Yang Tang; Qiyu Sun; Chaoqiang Zhao; Qing-Long Han

Visual semantic segmentation aims to separate a visual sample into diverse blocks with specific semantic attributes and to identify the category of each block; it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches rely heavily on large-scale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories. This limitation has spurred strong interest in studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled samples or from no labeled samples at all, which advances the extension to practical applications. Therefore, this paper focuses on recently published few/zero-shot visual semantic segmentation methods, ranging from 2D to 3D space, and explores the commonalities and discrepancies of the technical solutions under different segmentation circumstances. Specifically, the preliminaries on few/zero-shot visual semantic segmentation, including the problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations, namely image semantic segmentation, video object segmentation, and 3D segmentation, are examined to uncover the interactions between few/zero-shot learning and visual semantic segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.

updated: Sun Nov 13 2022 13:39:33 GMT+0000 (UTC)

published: Sun Nov 13 2022 13:39:33 GMT+0000 (UTC)

arXiv
