Self-Supervised Depth Estimation Based on Camera Models

Jinchang Zhang; Praveen Kumar Reddy; Xue-Iuan Wong; Guoyu Lu

Depth estimation is a critical topic for robotics and vision-related tasks. In monocular depth estimation, supervised learning requires expensive ground-truth labeling, whereas self-supervised methods have great potential because they incur no labeling cost. However, self-supervised learning still lags well behind supervised learning in depth estimation performance. Scale is also a major issue for monocular unsupervised depth estimation, which commonly still needs ground-truth scale from GPS, LiDAR, or existing maps for correction. In the deep learning era, existing methods mainly rely on exploring image relationships to train unsupervised neural networks, while the fundamental information provided by the camera itself has generally been ignored, even though it can provide extensive supervision for free, without any extra equipment. Using the camera's own intrinsics and extrinsics, depth can be calculated for ground regions and for regions connected to the ground based on physical principles, providing free supervision without any other sensors. The method is easy to implement and can serve as a component to improve the results of other unsupervised methods.
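The idea of computing ground depth from the camera model alone can be illustrated with standard pinhole geometry. The sketch below is not the authors' implementation; it assumes a flat ground plane, a camera with no roll, a known mounting height, and a known downward pitch. The function name `ground_depth` and its parameters (`fy`, `cy`, `cam_height`, `pitch_rad`) are hypothetical names chosen for this example.

```python
import math


def ground_depth(v, fy, cy, cam_height, pitch_rad=0.0):
    """Depth (distance along the optical axis) of a flat-ground pixel.

    Assumptions (illustrative, not taken from the paper):
      - pinhole camera, image rows grow downward
      - flat ground, camera has no roll
      - cam_height: camera height above the ground, in meters
      - pitch_rad: downward tilt of the optical axis, in radians
    Returns None for pixels at or above the horizon line.
    """
    # Ray through the pixel in camera coordinates: (x, y, 1), with y down.
    y = (v - cy) / fy
    # World "down" direction seen from the camera: (0, cos(pitch), sin(pitch)).
    # The ray meets the ground where its downward travel equals cam_height.
    denom = math.cos(pitch_rad) * y + math.sin(pitch_rad)
    if denom <= 1e-9:
        return None  # ray never reaches the ground plane
    return cam_height / denom


if __name__ == "__main__":
    # Hypothetical intrinsics and mounting: fy = 500 px, cy = 200 px, 1.5 m high.
    for row in (200, 250, 300, 400):
        print(row, ground_depth(row, fy=500.0, cy=200.0, cam_height=1.5))
```

Under these assumptions the depth depends only on the image row, so a ground mask plus the camera height and pitch yields metric depth for every ground pixel, which is the kind of sensor-free supervision signal the abstract describes.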

updated: Fri Aug 02 2024 20:40:19 GMT+0000 (UTC)

published: Fri Aug 02 2024 20:40:19 GMT+0000 (UTC)

arXiv

