BlindSpotNet: Seeing Where We Cannot See

Taichi Fukuda; Kotaro Hasegawa; Shinya Ishizaki; Shohei Nobuhara; Ko Nishino

BlindSpotNet：見えないところを見る

道路シーンを理解するための重要な視覚的タスクとして、2D死角推定を紹介します。車両の見晴らしの良い場所から遮られている道路領域を自動的に検出することで、手動運転者や自動運転システムに事故の潜在的な原因を事前に警告できます（たとえば、子供が飛び出す可能性のある道路領域に注意を向けることができます）。車にLiDARが搭載されている場合でも、その場で3D推論を行うと、非常に費用がかかり、エラーが発生しやすくなるため、完全な3Dで死角を検出することは困難です。代わりに、単眼カメラから2Dで死角を推定する方法を学ぶことを提案します。これは2つのステップで実現します。最初に、単眼深度推定、セマンティックセグメンテーション、およびSLAMを活用して、任意の運転ビデオの「グラウンドトゥルース」死角トレーニングデータを生成する自動方法を紹介します。重要なアイデアは、現在は見えないが近い将来に見えるようになる道路領域として死角を定義することにより、3Dではなく2D画像から推論することです。この自動オフライン死角推定を使用して大規模なデータセットを構築します。これをRoadBlindSpot（RBS）データセットと呼びます。次に、このデータセットを完全に活用して、任意の運転ビデオのフレームごとの死角確率マップを完全に自動推定するシンプルなネットワークであるBlindSpotNet（BSN）を紹介します。広範な実験結果は、RBSデータセットの有効性とBSNの有効性を示しています。

We introduce 2D blind spot estimation as a critical visual task for road scene understanding. By automatically detecting road regions that are occluded from the vehicle's vantage point, we can proactively alert a manual driver or a self-driving system to potential causes of accidents (e.g., draw attention to a road region from which a child may spring out). Detecting blind spots in full 3D would be challenging, as 3D reasoning on the fly even if the car is equipped with LiDAR would be prohibitively expensive and error prone. We instead propose to learn to estimate blind spots in 2D, just from a monocular camera. We achieve this in two steps. We first introduce an automatic method for generating ``ground-truth'' blind spot training data for arbitrary driving videos by leveraging monocular depth estimation, semantic segmentation, and SLAM. The key idea is to reason in 3D but from 2D images by defining blind spots as those road regions that are currently invisible but become visible in the near future. We construct a large-scale dataset with this automatic offline blind spot estimation, which we refer to as Road Blind Spot (RBS) dataset. Next, we introduce BlindSpotNet (BSN), a simple network that fully leverages this dataset for fully automatic estimation of frame-wise blind spot probability maps for arbitrary driving videos. Extensive experimental results demonstrate the validity of our RBS Dataset and the effectiveness of our BSN.

updated: Fri Jul 08 2022 12:54:18 GMT+0000 (UTC)

published: Fri Jul 08 2022 12:54:18 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト