Surround-View Cameras based Holistic Visual Perception for Automated Driving

Varun Ravi Kumar

The formation of eyes led to the big bang of evolution. The dynamics changed from primitive organisms waiting for food to come into contact with them to organisms seeking out food using visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Over millions of years, humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships. Automating these capabilities in computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. In the context of self-driving cars, near-field visual perception covers the environment within a range of 0-10 meters, with 360° coverage around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions. Until now, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. To address these issues, we concentrate on the following: 1) developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks, both geometric and semantic, using convolutional neural networks; and 2) using multi-task learning to overcome computational bottlenecks by sharing the initial convolutional layers between tasks and developing optimization strategies that balance the tasks.

updated: Sat Jun 11 2022 14:51:30 GMT+0000 (UTC)

published: Sat Jun 11 2022 14:51:30 GMT+0000 (UTC)

arXiv
