IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture

J. Lorenzo; I. Parra; M. A. Sotelo

Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to improvements in their security and in traffic flow. In this paper, we develop a method called IntFormer. It is based on the Transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure of a recent benchmark, we show that our model reaches state-of-the-art results with good inference performance (≈40 sequences per second) and a small size (8× smaller than the best-performing model), making it suitable for real-time usage. We also examine the contribution of each input feature, finding that ego-vehicle speed is the most important variable, possibly due to the similarity of the crossing cases in the PIE dataset.

updated: Tue May 18 2021 16:23:15 GMT+0000 (UTC)

published: Tue May 18 2021 16:23:15 GMT+0000 (UTC)

arXiv
