TMHOI: Translational Model for Human-Object Interaction Detection

Lijing Zhu; Qizhen Lan; Alvaro Velasquez; Houbing Song; Acharya Kamal; Qing Tian; Shuteng Niu

Detecting human-object interactions (HOIs) is an intricate challenge in the field of computer vision. Existing methods for HOI detection rely heavily on appearance-based features, but these may not capture all the characteristics needed for accurate detection. To overcome this limitation, we propose a graph-based approach called TMGHOI (Translational Model for Human-Object Interaction Detection). Our method captures the sentiment representation of HOIs by integrating both spatial and semantic knowledge. We represent HOIs as a graph in which the interaction components serve as nodes and their spatial relationships serve as edges. To extract spatial and semantic information, TMGHOI employs separate spatial and semantic encoders. These encodings are then combined to construct a knowledge graph that captures the sentiment representation of HOIs. Additionally, the ability to incorporate prior knowledge enhances the understanding of interactions, further boosting detection accuracy. We conducted extensive evaluations on the widely used HICO-DET dataset to demonstrate the effectiveness of TMGHOI. Our approach outperformed existing state-of-the-art graph-based methods by a significant margin, showing its potential as a strong solution for HOI detection. We are confident that TMGHOI can significantly improve the accuracy and efficiency of HOI detection. Its integration of spatial and semantic knowledge, along with its computational efficiency and practicality, makes it a valuable tool for researchers and practitioners in the computer vision community. As with any research, we acknowledge the importance of further exploration and evaluation on various datasets to establish the generalizability and robustness of our proposed method.
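The abstract does not spell out how the graph and the translational model are implemented, so the following is only a minimal sketch under stated assumptions. It assumes a TransE-style scoring rule (a human embedding plus an interaction embedding should land near the object embedding) and a simple box-based spatial edge feature. The function names `box_spatial_features` and `translational_score`, the feature layout, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np


def box_spatial_features(human_box, object_box):
    """Relative spatial features for a (human, object) box pair.

    Boxes are (x1, y1, x2, y2). The feature layout is an illustrative
    assumption, not the paper's spatial encoder.
    """
    hx1, hy1, hx2, hy2 = human_box
    ox1, oy1, ox2, oy2 = object_box
    hw, hh = hx2 - hx1, hy2 - hy1
    ow, oh = ox2 - ox1, oy2 - oy1
    # Offset of the object centre from the human centre, scaled by human size.
    dx = ((ox1 + ox2) / 2 - (hx1 + hx2) / 2) / hw
    dy = ((oy1 + oy2) / 2 - (hy1 + hy2) / 2) / hh
    # Log ratios of object size to human size.
    return np.array([dx, dy, np.log(ow / hw), np.log(oh / hh)])


def translational_score(head, relation, tail):
    """TransE-style plausibility: negative distance ||head + relation - tail||.

    A score closer to zero means the triple (head, relation, tail) fits
    the translation assumption better. This rule is assumed here.
    """
    return -float(np.linalg.norm(head + relation - tail))


# Toy embeddings (hypothetical values) for one human node, one object node,
# and two candidate interactions.
human = np.array([1.0, 0.0, 0.0])
bicycle = np.array([1.0, 1.0, 0.0])
ride = np.array([0.0, 1.0, 0.0])   # human + ride lands exactly on bicycle
eat = np.array([0.0, 0.0, 1.0])    # human + eat lands elsewhere

edge = box_spatial_features((0, 0, 2, 4), (1, 2, 3, 6))
score_ride = translational_score(human, ride, bicycle)
score_eat = translational_score(human, eat, bicycle)
```

In this toy setup the interaction whose embedding translates the human node closest to the object node receives the higher score, which is the general idea behind translational knowledge-graph models; how TMGHOI actually combines its spatial and semantic encodings is not specified in the abstract.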

updated: Sat Jul 01 2023 15:44:42 GMT+0000 (UTC)

published: Tue Mar 07 2023 21:52:10 GMT+0000 (UTC)

arXiv
