Ego-Vehicle Action Recognition based on Semi-Supervised Contrastive Learning

Chihiro Noguchi; Toshihiro Tanizawa

In recent years, many automobiles have been equipped with cameras, which have accumulated an enormous amount of video footage of driving scenes. Autonomous driving demands the highest level of safety, so even extremely rare driving scenes must be collected as training data to improve the recognition accuracy for specific scenes. However, it is prohibitively costly to find the very few relevant scenes in an enormous number of videos. In this article, we show that proper video-to-video distances can be defined by focusing on ego-vehicle actions. It is well known that existing methods based on supervised learning cannot handle videos that do not fall into predefined classes, although they work well for defining video-to-video distances between labeled videos in the embedding space. To tackle this problem, we propose a method based on semi-supervised contrastive learning. We consider two related but distinct contrastive learning approaches: standard graph contrastive learning and our proposed SOIA-based contrastive learning. We observe that the latter approach provides more sensible video-to-video distances between unlabeled videos. We then quantify the effectiveness of our method by evaluating its classification performance on ego-vehicle action recognition using the HDD dataset. The results show that our method, which includes unlabeled data in training, significantly outperforms existing methods that use only labeled data in training.
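As a rough illustration of what a video-to-video distance in an embedding space and a contrastive objective look like, the sketch below uses cosine distance and a generic InfoNCE-style loss. This is not the SOIA-based or graph contrastive method from the paper, whose details are not given in the abstract. The function names, the toy two-dimensional embeddings, and the temperature value are all assumptions chosen for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper, not the paper's loss).

    The loss is small when the anchor is closer to the positive
    than to every negative in the embedding space.
    """
    def scaled_similarity(u, v):
        return (1.0 - cosine_distance(u, v)) / temperature

    pos_logit = scaled_similarity(anchor, positive)
    logits = [pos_logit] + [scaled_similarity(anchor, n) for n in negatives]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_denominator = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )
    return log_denominator - pos_logit


# Toy embeddings (assumed values): two similar clips and one dissimilar clip.
clip_a = [1.0, 0.0]
clip_b = [0.9, 0.1]
clip_c = [0.0, 1.0]

print(cosine_distance(clip_a, clip_b))
print(cosine_distance(clip_a, clip_c))
print(info_nce_loss(clip_a, clip_b, [clip_c]))
```

In a semi-supervised setting, the positive and negative pairs for labeled videos could come from class labels, while pairs for unlabeled videos need some other similarity signal. How the paper constructs those pairs is not specified here.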

updated: Thu Mar 02 2023 05:19:31 GMT+0000 (UTC)

published: Thu Mar 02 2023 05:19:31 GMT+0000 (UTC)

arXiv
