Two-level Data Augmentation for Calibrated Multi-view Detection

Martin Engilberge; Haixin Shi; Zhiye Wang; Pascal Fua


Data augmentation has proven useful for improving model generalization and performance. While it is commonly applied in computer vision applications, it is rarely used in multi-view systems. Indeed, geometric data augmentation can break the alignment among views. This is problematic because multi-view data tends to be scarce and is expensive to annotate. In this work, we propose to solve this issue by introducing a new multi-view data augmentation pipeline that preserves alignment among views. In addition to traditional augmentation of the input images, we also propose a second level of augmentation applied directly at the scene level. When combined with our simple multi-view detection model, our two-level augmentation pipeline outperforms all existing baselines by a significant margin on the two main multi-view multi-person detection datasets, WILDTRACK and MultiviewX.
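
The alignment issue mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each view is calibrated by a 3x3 homography that maps ground-plane coordinates to image pixels, and that the image-level augmentation is an affine transform. Under those assumptions, composing the augmentation matrix with the view's homography keeps the augmented image consistent with the shared ground plane. The function names `affine_matrix`, `project`, and `augment_view_homography` are hypothetical and chosen for illustration only.

```python
import numpy as np


def affine_matrix(angle_deg, scale, tx, ty):
    """Return a 3x3 homogeneous matrix: rotate and scale about the origin, then translate."""
    theta = np.deg2rad(angle_deg)
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s, c, ty],
                     [0.0, 0.0, 1.0]])


def project(H, points):
    """Apply a 3x3 homography H to an (N, 2) array of 2D points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def augment_view_homography(H_ground_to_image, A):
    """Compose an image-space augmentation A with a ground-to-image homography.

    Assumption: A maps original pixel coordinates to augmented pixel
    coordinates, so ground points reach the augmented image via A @ H.
    """
    return A @ H_ground_to_image


# Illustrative values only: a made-up calibration and a random-looking augmentation.
H_view = np.array([[1.2, 0.1, 30.0],
                   [0.05, 0.9, 40.0],
                   [1e-4, 2e-4, 1.0]])
A_view = affine_matrix(angle_deg=15.0, scale=1.1, tx=-20.0, ty=8.0)
H_view_augmented = augment_view_homography(H_view, A_view)
```

If the image is warped with `A_view` but the original `H_view` is still used to project features onto the ground plane, the views no longer agree on where a person stands. Using `H_view_augmented` instead restores that agreement in this simplified setting.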

updated: Wed Oct 19 2022 17:55:13 GMT+0000 (UTC)

published: Wed Oct 19 2022 17:55:13 GMT+0000 (UTC)

arXiv

