Artificial Dummies for Urban Dataset Augmentation

Antonín Vobecký; David Hurych; Michal Uřičář; Patrick Pérez; Josef Šivic

Existing datasets for training pedestrian detectors in images suffer from limited appearance and pose variation. The most challenging scenarios are rarely included, either because they are too difficult to capture for safety reasons or because they are very unlikely to happen. The strict safety requirements of assisted and autonomous driving applications call for very high detection accuracy even in these rare situations. The ability to generate images of people in arbitrary poses, with arbitrary appearances, and embedded in different background scenes with varying illumination and weather conditions is a crucial component for the development and testing of such applications. The contributions of this paper are three-fold. First, we describe an augmentation method for the controlled synthesis of urban scenes containing people, thus producing rare or never-seen situations. This is achieved with a data generator (called DummyNet) with disentangled control of the pose, the appearance, and the target background scene. Second, the proposed generator relies on a novel network architecture and an associated loss that take into account the segmentation of the foreground person and its composition into the background scene. Finally, we demonstrate that the data generated by our DummyNet improves the performance of several existing person detectors across various datasets as well as in challenging situations, such as night-time conditions, where only a limited amount of training data is available. In the setup where only day-time data is available, we improve the night-time detector by 17% log-average miss rate over the detector trained with the day-time data only.
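
The abstract reports its headline result in log-average miss rate but does not define the metric. As a rough illustration only, the sketch below follows the convention commonly used in pedestrian detection benchmarks: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space between 0.01 and 1. The function name, the argument layout, and the choice of nine reference points are assumptions made for this example; this is not the authors' evaluation code.

```python
import math
from bisect import bisect_right


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI values.

    Assumptions (not taken from the paper): `fppi` is sorted in ascending
    order, `miss_rate[i]` is the miss rate at `fppi[i]`, and the reference
    FPPI values span [1e-2, 1e0].
    """
    if len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must have the same length")

    # Reference FPPI values, evenly spaced in log space from 0.01 to 1.
    refs = [10.0 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]

    log_sum = 0.0
    for ref in refs:
        # Index of the last curve point whose FPPI does not exceed the reference.
        k = bisect_right(fppi, ref) - 1
        # If the curve never reaches such a low FPPI, count a miss rate of 1.
        mr = miss_rate[k] if k >= 0 else 1.0
        # Clamp to avoid log(0) for a perfect detector.
        log_sum += math.log(max(mr, 1e-10))

    return math.exp(log_sum / num_points)
```

Lower values are better. Under this convention, a detector whose curve never drops below an FPPI of 1 scores 1.0, the worst possible value.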

updated: Tue Dec 15 2020 13:17:25 GMT+0000 (UTC)

published: Tue Dec 15 2020 13:17:25 GMT+0000 (UTC)

arXiv
