Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting

Dooseop Choi; KyoungWook Min

The variational autoencoder (VAE) has been widely used for modeling data distributions because it is theoretically elegant, easy to train, and yields good manifold representations. However, when applied to image reconstruction and synthesis tasks, the VAE tends to generate blurry samples. We observe that a similar problem, in which the generated trajectory is located between adjacent lanes, often arises in VAE-based trajectory forecasting models. To mitigate this problem, we introduce a hierarchical latent structure into the VAE-based forecasting model. Based on the assumption that the trajectory distribution can be approximated as a mixture of simple distributions (or modes), the low-level latent variable is employed to model each mode of the mixture and the high-level latent variable is employed to represent the weights of the modes. To model each mode accurately, we condition the low-level latent variable on two lane-level context vectors computed in novel ways: one corresponds to vehicle-lane interaction and the other to vehicle-vehicle interaction. The context vectors are also used to model the mode weights via the proposed mode selection network. We evaluate our forecasting model on two large-scale real-world datasets. Experimental results show that our model not only generates clear multi-modal trajectory distributions but also outperforms state-of-the-art (SOTA) models in terms of prediction accuracy. Our code is available at https://github.com/d1024choi/HLSTrajForecast.
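
The mixture assumption in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' model: the lane centers, lane scores, noise scale, and every function name below are hypothetical, and a softmax over fixed scores merely stands in for the mode selection network. A high-level discrete variable picks a lane (mode), and a low-level Gaussian variable models the spread within that lane. For contrast, a single Gaussian matched to the mean and variance of the same mixture is used as a rough stand-in for an over-smoothed unimodal prediction, which places much of its mass between the lanes.

```python
import math
import random


def softmax(scores):
    """Turn unnormalized lane scores into mode weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hierarchical(lane_centers, lane_scores, sigma, n, rng):
    """Two-level sampling: pick a mode first, then sample within that mode."""
    weights = softmax(lane_scores)
    indices = range(len(lane_centers))
    samples = []
    for _ in range(n):
        mode = rng.choices(indices, weights=weights, k=1)[0]  # high-level variable
        z = rng.gauss(0.0, 1.0)                               # low-level variable
        samples.append(lane_centers[mode] + sigma * z)        # toy "decoder"
    return samples


def sample_single_gaussian(lane_centers, lane_scores, sigma, n, rng):
    """Unimodal baseline: one Gaussian with the mixture's mean and variance."""
    weights = softmax(lane_scores)
    mean = sum(w * c for w, c in zip(weights, lane_centers))
    var = sum(w * ((c - mean) ** 2 + sigma ** 2)
              for w, c in zip(weights, lane_centers))
    return [rng.gauss(mean, math.sqrt(var)) for _ in range(n)]


def fraction_between_lanes(samples, lane_centers, margin):
    """Share of samples farther than `margin` from every lane center."""
    off = sum(1 for s in samples
              if all(abs(s - c) > margin for c in lane_centers))
    return off / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    centers = [0.0, 3.5]   # assumed lateral lane centers in meters
    scores = [0.0, 0.0]    # assumed equal preference for both lanes
    hier = sample_hierarchical(centers, scores, 0.3, 5000, rng)
    flat = sample_single_gaussian(centers, scores, 0.3, 5000, rng)
    print("hierarchical, off-lane fraction:",
          fraction_between_lanes(hier, centers, 1.0))
    print("single Gaussian, off-lane fraction:",
          fraction_between_lanes(flat, centers, 1.0))
```

In this toy setting the hierarchical sampler keeps nearly all samples close to a lane center, while the moment-matched single Gaussian does not. The paper's model conditions both levels on learned lane-level context vectors, which this sketch does not attempt to reproduce.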

updated: Mon Jul 11 2022 04:52:28 GMT+0000 (UTC)

published: Mon Jul 11 2022 04:52:28 GMT+0000 (UTC)

arXiv
