A Comprehensive Survey of Scene Graphs: Generation and Application

Xiaojun Chang; Pengzhen Ren; Pengfei Xu; Zhihui Li; Xiaojiang Chen; Alex Hauptmann

A scene graph is a structured representation of a scene that explicitly expresses the objects, their attributes, and the relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, they look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want not only to detect and recognize the objects in it, but also to know the relationships between them (visual relationship detection) and to generate a text description based on the image content (image captioning). Alternatively, we might want the machine to tell us what the little girl in the image is doing (Visual Question Answering (VQA)), or even to remove the dog from the image and find similar images (image editing and retrieval). These tasks require a higher level of understanding of, and reasoning about, images. The scene graph is a powerful tool for exactly this kind of scene understanding. Scene graphs have therefore attracted the attention of a large number of researchers, and the related research is often cross-modal, complex, and rapidly developing. However, no relatively systematic survey of scene graphs exists at present. To this end, this survey conducts a comprehensive investigation of current scene graph research. More specifically, we first summarize the general definition of the scene graph, and then conduct a comprehensive and systematic discussion of scene graph generation (SGG) methods and of SGG with the aid of prior knowledge. We then investigate the main applications of scene graphs and summarize the most commonly used datasets. Finally, we provide some insights into the future development of scene graphs. We believe this will be a very helpful foundation for future research on scene graphs.
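
As an illustration of the definition in the abstract (objects, attributes, and relationships between objects), a scene graph could be held in a small data structure like the one below. This is only a minimal sketch and is not taken from the survey: the `SceneGraph` class, its method names, the attribute "brown", and the predicate "playing with" are all hypothetical, chosen to mirror the little girl and dog mentioned in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SceneGraph:
    """Hypothetical container: nodes are objects, edges are relationship triples."""

    # Maps each object name to its set of attributes.
    objects: Dict[str, Set[str]] = field(default_factory=dict)
    # Each relationship is a (subject, predicate, object) triple.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, name: str, *attributes: str) -> None:
        # Create the object if needed, then attach any attributes.
        self.objects.setdefault(name, set()).update(attributes)

    def add_relation(self, subject: str, predicate: str, obj: str) -> None:
        # A relationship may only connect objects already in the graph.
        if subject not in self.objects or obj not in self.objects:
            raise KeyError("both endpoints must be added as objects first")
        self.relations.append((subject, predicate, obj))

    def relations_of(self, subject: str) -> List[Tuple[str, str]]:
        # Return (predicate, object) pairs whose subject matches.
        return [(p, o) for s, p, o in self.relations if s == subject]


# Illustrative scene; the attribute and predicate values are assumptions.
graph = SceneGraph()
graph.add_object("girl", "little")
graph.add_object("dog", "brown")
graph.add_relation("girl", "playing with", "dog")
print(graph.relations_of("girl"))
```

In this sketch, a question such as "what is the little girl doing?" reduces to looking up the relationships whose subject is `"girl"`, which hints at why such a structure is convenient for the downstream tasks listed above.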

updated: Fri Jan 07 2022 01:35:21 GMT+0000 (UTC)

published: Wed Mar 17 2021 04:24:20 GMT+0000 (UTC)

arXiv
