Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos

Daizong Liu; Pan Zhou

Temporal sentence localization in videos (TSLV) aims to retrieve the most relevant segment of an untrimmed video according to a given sentence query. However, almost all existing TSLV approaches suffer from the same limitations: (1) they focus on either frame-level or object-level visual representation learning and the corresponding correlation reasoning, but fail to integrate the two; (2) they neglect the rich semantic context that could further benefit query reasoning. To address these issues, we propose a novel Hierarchical Visual- and Semantic-Aware Reasoning Network (HVSARN), which enables both visual- and semantic-aware query reasoning from the object level to the frame level. Specifically, we present a new graph memory mechanism to perform visual-semantic query reasoning. For visual reasoning, we design a visual graph memory to leverage the visual information of the video. For semantic reasoning, we introduce a semantic graph memory that explicitly leverages the semantic knowledge contained in the classes and attributes of video objects and performs correlation reasoning in the semantic space. Experiments on three datasets demonstrate that HVSARN achieves new state-of-the-art performance.
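To make the object-level to frame-level idea concrete, the following is a minimal illustrative sketch, not the HVSARN implementation. The abstract does not specify the graph construction, the memory read/write rules, or the feature dimensions, so everything here is an assumption: the function names `graph_reasoning_step` and `frame_feature`, the fully connected object graph, the dot-product edge scores, and the mean pooling are all hypothetical, and the memory component is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def graph_reasoning_step(nodes, query):
    """One round of query-conditioned message passing (assumed design).

    nodes: list of object feature vectors for a single frame.
    query: sentence feature vector with the same dimension.
    Every object attends to every object in the frame (including itself).
    """
    dim = len(query)
    updated = []
    for node in nodes:
        # Edge score = object-object similarity + neighbour-query relevance.
        scores = [dot(node, other) + dot(query, other) for other in nodes]
        weights = softmax(scores)
        # Message = attention-weighted sum of neighbour features.
        message = [
            sum(w * other[d] for w, other in zip(weights, nodes))
            for d in range(dim)
        ]
        # Residual update keeps the original object feature.
        updated.append([n + m for n, m in zip(node, message)])
    return updated


def frame_feature(nodes):
    """Mean-pool object nodes into one frame-level feature (assumed pooling)."""
    count = len(nodes)
    return [sum(node[d] for node in nodes) / count for d in range(len(nodes[0]))]


# Toy example: two objects in one frame, query aligned with the first object.
objects = [[1.0, 0.0], [0.0, 1.0]]
sentence = [1.0, 0.0]
reasoned = graph_reasoning_step(objects, sentence)
frame = frame_feature(reasoned)
```

In this toy setup the query shifts attention toward the first object, so the pooled frame feature leans toward that object's direction. A semantic counterpart would run the same kind of step over embeddings of object classes and attributes rather than visual features, again under the assumptions stated above.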

updated: Wed Mar 15 2023 03:10:39 GMT+0000 (UTC)

published: Thu Mar 02 2023 08:00:22 GMT+0000 (UTC)

arXiv
