Robust Lane Detection through Self Pre-training with Masked Sequential Autoencoders and Fine-tuning with Customized PolyLoss

Ruohan Li; Yongqi Dong

Lane detection is crucial for vehicle localization, which makes it a foundation for automated driving and many intelligent and advanced driving assistance systems. Available vision-based lane detection methods do not make full use of valuable features and aggregated contextual information, especially the interrelationships between lane lines and other regions of the images in continuous frames. To fill this research gap and improve lane detection performance, this paper proposes a pipeline consisting of self pre-training with masked sequential autoencoders and fine-tuning with customized PolyLoss for end-to-end neural network models that use multiple continuous image frames. The masked sequential autoencoders are adopted to pre-train the neural network models, with the objective of reconstructing the missing pixels of a randomly masked image. Then, in the fine-tuning phase, where lane detection segmentation is performed, the continuous image frames serve as the inputs, and the pre-trained model weights are transferred and further updated by backpropagation, with the customized PolyLoss calculating the weighted errors between the output lane detection results and the labeled ground truth. Extensive experimental results demonstrate that, with the proposed pipeline, lane detection model performance on both normal and challenging scenes can be advanced beyond the state of the art, delivering the best testing accuracy (98.38%), precision (0.937), and F1-measure (0.924) on the normal scene testing set, together with the best overall accuracy (98.36%) and precision (0.844) on the challenging scene testing set, while the training time can be substantially shortened.
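The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the patch size, the mask ratio, the choice to score reconstruction only on masked pixels, and the class-weighting scheme are all assumptions made for illustration. The loss uses the generic Poly-1 form (cross-entropy plus `epsilon * (1 - p_t)`); the paper's customized PolyLoss may differ.

```python
import numpy as np

def random_patch_mask(frames, patch=16, mask_ratio=0.5, rng=None):
    """Zero out a random subset of patch x patch blocks in every frame.

    frames: array of shape (T, H, W, C); H and W must be divisible by `patch`.
    Returns the masked frames and a boolean mask (T, H, W), True where masked.
    All parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    t, h, w, _ = frames.shape
    gh, gw = h // patch, w // patch
    n_patches = gh * gw
    n_masked = int(round(mask_ratio * n_patches))
    mask = np.zeros((t, n_patches), dtype=bool)
    for i in range(t):
        # Pick a different random set of patches for each frame.
        idx = rng.choice(n_patches, size=n_masked, replace=False)
        mask[i, idx] = True
    # Expand the patch-level mask to pixel resolution.
    mask = mask.reshape(t, gh, gw)
    mask = np.repeat(np.repeat(mask, patch, axis=1), patch, axis=2)
    masked = frames * (~mask)[..., None]
    return masked, mask

def masked_mse(recon, target, mask):
    """Pre-training objective (assumed form): MSE over masked pixels only."""
    diff = (recon - target) ** 2
    return float(diff[mask].mean())

def weighted_poly1_loss(probs, labels, class_weights=(1.0, 1.0), epsilon=1.0):
    """Class-weighted Poly-1 loss for binary lane segmentation (assumed form).

    probs: predicted lane probability per pixel, shape (H, W).
    labels: ground truth in {0, 1}, shape (H, W).
    class_weights: (background weight, lane weight), hypothetical values.
    """
    probs = np.clip(probs, 1e-7, 1.0 - 1e-7)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(labels == 1, probs, 1.0 - probs)
    weights = np.where(labels == 1, class_weights[1], class_weights[0])
    cross_entropy = -np.log(p_t)
    return float(np.mean(weights * (cross_entropy + epsilon * (1.0 - p_t))))
```

In such a setup the lane class would typically get a larger weight than the background, since lane pixels are a small fraction of each frame; setting `epsilon=0` reduces the loss to weighted cross-entropy.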

updated: Fri Aug 11 2023 08:35:06 GMT+0000 (UTC)

published: Fri May 26 2023 21:36:08 GMT+0000 (UTC)

arXiv
