Probabilistic Debiasing of Scene Graphs

Bashirul Azam Biswas; Qiang Ji

The quality of scene graphs generated by state-of-the-art (SOTA) models is compromised by the long-tailed distribution of relationships and of their parent object pairs. Training of scene graph models is dominated by the majority relationships of the majority pairs, so the object-conditional distributions of relationships in the minority pairs are not preserved once training has converged. Consequently, the biased model performs well on relationships that are frequent in the marginal distribution of relationships, such as 'on' and 'wearing', and poorly on less frequent relationships such as 'eating' or 'hanging from'. In this work, we propose a within-triplet Bayesian network (BN) that incorporates virtual evidence, in order to preserve the object-conditional distribution of the relationship label and to eradicate the bias created by the marginal probability of the relationships. The insufficient number of relationship samples in the minority classes poses a significant problem when learning the within-triplet Bayesian network. We address this insufficiency with an embedding-based augmentation of triplets, in which we borrow samples for the minority triplet classes from their neighboring triplets in the semantic space. We perform experiments on two different datasets and achieve a significant improvement in the mean recall of the relationships. We also achieve a better balance between recall and mean recall than the SOTA de-biasing techniques for scene graph models.
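The difference between the marginal and the object-conditional relationship distributions can be illustrated with a small sketch. It is not the authors' implementation: the triplet counts, the model scores, the add-alpha smoothing, and all function names are hypothetical, and virtual evidence is applied in its simplest textbook form (multiply the prior by a likelihood vector and renormalize), with the subject and object labels assumed to be known.

```python
from collections import Counter

RELATIONS = ["on", "eating", "holding"]

# Hypothetical toy counts of (subject, relation, object) triplets.
TRIPLET_COUNTS = Counter({
    ("cup", "on", "table"): 90,
    ("man", "eating", "pizza"): 8,
    ("man", "holding", "pizza"): 2,
})


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


def marginal_relation_dist(counts):
    """P(r): relation frequencies pooled over all object pairs."""
    pooled = Counter()
    for (_, rel, _), n in counts.items():
        pooled[rel] += n
    return normalize({r: pooled[r] for r in RELATIONS})


def conditional_relation_dist(counts, subj, obj, alpha=1.0):
    """P(r | subj, obj); the add-alpha smoothing is an assumption."""
    return normalize({r: counts[(subj, r, obj)] + alpha for r in RELATIONS})


def posterior_with_virtual_evidence(prior, likelihood):
    """Multiply the prior by a likelihood vector and renormalize."""
    return normalize({r: prior[r] * likelihood[r] for r in RELATIONS})


marginal = marginal_relation_dist(TRIPLET_COUNTS)
prior = conditional_relation_dist(TRIPLET_COUNTS, "man", "pizza")

# Hypothetical relation scores from a biased model, used as virtual evidence.
model_scores = {"on": 0.6, "eating": 0.3, "holding": 0.1}
posterior = posterior_with_virtual_evidence(prior, model_scores)

print(max(marginal, key=marginal.get))    # prints "on"
print(max(posterior, key=posterior.get))  # prints "eating"
```

In this toy setting the pooled statistics and the raw model scores both favor 'on', while conditioning on the object pair (man, pizza) shifts the posterior to 'eating'.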

updated: Tue Mar 14 2023 21:07:45 GMT+0000 (UTC)

published: Fri Nov 11 2022 19:06:49 GMT+0000 (UTC)

arXiv
