Discovering Human-Object Interaction Concepts via Self-Compositional Learning

Zhi Hou; Baosheng Yu; Dacheng Tao

A comprehensive understanding of human-object interaction (HOI) requires detecting not only a small set of predefined HOI concepts (or categories) but also other reasonable HOI concepts. Current approaches, however, usually fail to explore the large space of unknown HOI concepts (i.e., unknown but reasonable combinations of verbs and objects). In this paper, 1) we introduce a novel and challenging task for comprehensive HOI understanding, termed HOI Concept Discovery; and 2) we devise a self-compositional learning framework (SCL) for HOI concept discovery. Specifically, we maintain a concept confidence matrix that is updated online during training: 1) we assign pseudo-labels to all composite HOI instances according to the concept confidence matrix for self-training; and 2) we update the concept confidence matrix using the predictions for all composite HOI instances. The proposed method therefore enables learning on both known and unknown HOI concepts. We perform extensive experiments on several popular HOI datasets to demonstrate the effectiveness of the proposed method for HOI concept discovery, object affordance recognition, and HOI detection. For example, the proposed self-compositional learning framework significantly improves the performance of 1) HOI concept discovery by over 10% on HICO-DET and over 3% on V-COCO; 2) object affordance recognition by over 9% mAP on MS-COCO and HICO-DET; and 3) rare-first and non-rare-first unknown HOI detection by relative margins of over 30% and 20%, respectively. Code is publicly available at https://github.com/zhihou7/HOI-CL.
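
The two-step loop described in the abstract (pseudo-labels read from a concept confidence matrix, then the matrix updated from predictions) can be illustrated with a toy sketch. This is not the authors' implementation: the class name `ConceptConfidenceMatrix`, the running-mean update rule, and the choice to keep known concepts fixed at 1.0 are all assumptions made for illustration, and the released code at the URL above should be treated as the reference.

```python
from typing import List, Tuple


class ConceptConfidenceMatrix:
    """Toy verb-by-object confidence table, updated online (illustrative only)."""

    def __init__(self, num_verbs: int, num_objects: int,
                 known: List[Tuple[int, int]]):
        self.known = set(known)
        # Assumption: known concepts start at full confidence, unknown at zero.
        self.conf = [[1.0 if (v, o) in self.known else 0.0
                      for o in range(num_objects)]
                     for v in range(num_verbs)]
        # Number of predictions folded into each cell so far.
        self.count = [[0] * num_objects for _ in range(num_verbs)]

    def pseudo_label(self, obj: int) -> List[float]:
        """Soft verb labels for a composite instance whose object is `obj`."""
        return [row[obj] for row in self.conf]

    def update(self, obj: int, verb_scores: List[float]) -> None:
        """Fold one composite-instance prediction into a running mean."""
        for v, score in enumerate(verb_scores):
            if (v, obj) in self.known:
                continue  # assumption: known concepts stay fixed at 1.0
            self.count[v][obj] += 1
            n = self.count[v][obj]
            self.conf[v][obj] += (score - self.conf[v][obj]) / n


# Hypothetical usage: 2 verbs, 2 objects, two known (verb, object) concepts.
matrix = ConceptConfidenceMatrix(2, 2, known=[(0, 0), (1, 1)])
matrix.update(1, [0.8, 0.9])   # model's verb scores for a composite instance
matrix.update(1, [0.4, 0.1])
print(matrix.pseudo_label(1))  # soft labels for the next instance with object 1
```

In this sketch the confidence for an unknown verb-object pair is simply the mean of the verb scores the model has predicted for composite instances containing that object, so pairs the model repeatedly scores highly drift toward being treated as plausible concepts.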

updated: Sun Jul 24 2022 05:43:33 GMT+0000 (UTC)

published: Sun Mar 27 2022 10:31:55 GMT+0000 (UTC)

arXiv
