Pedestrian Intention Prediction: A Multi-task Perspective

Smail Ait Bouhsain; Saeed Saadatnejad; Alexandre Alahi

To be deployed globally, autonomous cars must guarantee the safety of pedestrians. Forecasting pedestrians' intentions sufficiently in advance is therefore one of the most critical and challenging tasks for autonomous vehicles. This work addresses the problem by jointly predicting the intention and the visual states of pedestrians. Whereas previous work on visual states focused on x-y coordinates, we also predict the size of the pedestrian, and in fact the whole bounding box. The method is a recurrent neural network trained with a multi-task learning approach. It has one head that predicts the intention of the pedestrian for each of its future positions and another head that predicts the visual states of the pedestrian. Experiments on the JAAD dataset show that our method outperforms previous work on intention prediction. Moreover, despite its simple architecture (more than 2 times faster), its bounding box prediction performance is comparable to that of much more complex architectures. Our code is available online.
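The two-head design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrent cell, the hidden size, the choice to predict box offsets, the loss weighting, and every name below (`MultiTaskRNN`, `multitask_loss`, `weight`) are assumptions made for illustration only. Weights are random and untrained.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one row of weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init(rows, cols, rng):
    """Small random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiTaskRNN:
    """Shared recurrent state feeding two output heads.

    Assumption: a plain tanh cell stands in for whatever recurrent
    cell the paper actually uses.
    """

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.w_in = init(hidden, 4, rng)        # box (x, y, w, h) -> hidden
        self.w_rec = init(hidden, hidden, rng)  # hidden -> hidden
        self.b_rec = [0.0] * hidden
        self.w_box = init(4, hidden, rng)       # head 1: bounding box offset
        self.b_box = [0.0] * 4
        self.w_int = init(1, hidden, rng)       # head 2: intention logit
        self.b_int = [0.0]

    def step(self, h, box):
        """One recurrent update from the previous state and a box."""
        from_input = linear(self.w_in, [0.0] * self.hidden, box)
        from_state = linear(self.w_rec, self.b_rec, h)
        return [math.tanh(a + b) for a, b in zip(from_input, from_state)]

    def predict(self, observed_boxes, horizon):
        """Encode observed boxes, then roll out future boxes and intentions."""
        h = [0.0] * self.hidden
        for box in observed_boxes:
            h = self.step(h, box)
        box = list(observed_boxes[-1])
        boxes, intentions = [], []
        for _ in range(horizon):
            delta = linear(self.w_box, self.b_box, h)
            box = [b + d for b, d in zip(box, delta)]
            logit = linear(self.w_int, self.b_int, h)[0]
            boxes.append(box)
            # One intention probability per predicted future position.
            intentions.append(1.0 / (1.0 + math.exp(-logit)))
            h = self.step(h, box)
        return boxes, intentions


def multitask_loss(pred_boxes, pred_int, true_boxes, true_int, weight=1.0):
    """Box regression error plus weighted intention classification error."""
    n = len(pred_boxes)
    mse = sum((p - t) ** 2
              for pb, tb in zip(pred_boxes, true_boxes)
              for p, t in zip(pb, tb)) / (4 * n)
    eps = 1e-7  # guards the logarithms against probabilities of 0 or 1
    bce = -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
               for p, t in zip(pred_int, true_int)) / n
    return mse + weight * bce
```

Calling `MultiTaskRNN().predict(observed_boxes, horizon)` on a list of `[x, y, w, h]` boxes returns `horizon` future boxes and one crossing probability per future box. Summing the two loss terms is what makes this a multi-task setup: both heads share the same recurrent state, so a gradient from either task would shape it during training.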

updated: Thu May 20 2021 11:14:35 GMT+0000 (UTC)

published: Tue Oct 20 2020 13:42:31 GMT+0000 (UTC)

arXiv
