Neural Camera Models

Igor Vasiljevic

Modern computer vision has moved beyond the domain of internet photo collections and into the physical world, guiding camera-equipped robots and autonomous cars through unstructured environments. To enable these embodied agents to interact with real-world objects, cameras are increasingly being used as depth sensors, reconstructing the environment for a variety of downstream reasoning tasks. Machine-learning-aided depth perception, or depth estimation, predicts for each pixel in an image the distance to the imaged scene point. While impressive strides have been made in depth estimation, significant challenges remain: (1) ground-truth depth labels are difficult and expensive to collect at scale, (2) camera information is typically assumed to be known but is often unreliable, and (3) restrictive camera assumptions are common, even though a great variety of camera types and lenses are used in practice. In this thesis, we focus on relaxing these assumptions and describe contributions toward the ultimate goal of turning cameras into truly generic depth sensors.

updated: Sat Aug 27 2022 01:28:46 GMT+0000 (UTC)

published: Sat Aug 27 2022 01:28:46 GMT+0000 (UTC)

arXiv
