TrackMPNN: A Message Passing Graph Neural Architecture for Multi-Object Tracking

Akshay Rangesh; Pranav Maheshwari; Mez Gebre; Siddhesh Mhatre; Vahid Ramezani; Mohan M. Trivedi

This study follows many previous approaches to multi-object tracking (MOT) that model the problem using graph-based data structures, and adapts this formulation to make it amenable to modern neural networks. Our main contributions are a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps, and a message passing graph neural network (GNN) that operates on these graphs to produce the desired likelihood for every association therein. We also provide solutions and propositions for the computational problems that must be addressed to create a memory-efficient, real-time, online algorithm that can reason over multiple timesteps, correct previous mistakes, update beliefs, possess long-term memory, and handle missed and false detections. In addition, our framework offers flexibility in the choice of temporal window size to operate on and of the losses used for training. In essence, this study provides a framework in which any kind of graph-based neural network can be trained using conventional supervised learning techniques, and the trained models can then be used for inference on new sequences in an online, real-time, computationally tractable manner. To demonstrate the efficacy and robustness of our approach, we use only the 2D box location and object category to construct the descriptor for each object instance. Despite this, our model performs on par with state-of-the-art approaches that make use of multiple hand-crafted and/or learned features. Experiments, qualitative examples, and competitive results on popular MOT benchmarks for autonomous driving demonstrate the promise and uniqueness of the proposed approach.
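
To make the graph formulation in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes detections are described only by a 2D box and a category, links detections in nearby timesteps with undirected edges, runs one round of neighbour averaging as a stand-in for a learned message passing layer, and scores each candidate association. All names (`descriptor`, `build_edges`, `message_passing_step`, `association_likelihoods`), the toy data, the fixed averaging rule, and the exponential scoring function are assumptions made for illustration; the actual model would use trained weights.

```python
import math

# Hypothetical toy detections: (timestep, x1, y1, x2, y2, category_id).
DETECTIONS = [
    (0, 10.0, 10.0, 50.0, 60.0, 0),
    (0, 200.0, 20.0, 260.0, 90.0, 1),
    (1, 12.0, 11.0, 52.0, 61.0, 0),
    (1, 205.0, 22.0, 265.0, 92.0, 1),
]


def descriptor(det, scale=100.0):
    """Node feature built from the 2D box location and object category only."""
    _, x1, y1, x2, y2, category = det
    return [x1 / scale, y1 / scale, x2 / scale, y2 / scale, float(category)]


def build_edges(dets, window=1):
    """Undirected candidate associations between detections whose
    timesteps differ by at least 1 and at most `window`."""
    edges = []
    for i in range(len(dets)):
        for j in range(i + 1, len(dets)):
            if 0 < abs(dets[j][0] - dets[i][0]) <= window:
                edges.append((i, j))
    return edges


def message_passing_step(features, edges):
    """One round of message passing: each node mixes its own feature with
    the mean of its neighbours' features (fixed weights, nothing learned)."""
    neighbours = [[] for _ in features]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    updated = []
    for i, feat in enumerate(features):
        if not neighbours[i]:
            updated.append(list(feat))
            continue
        mean = [
            sum(features[k][d] for k in neighbours[i]) / len(neighbours[i])
            for d in range(len(feat))
        ]
        updated.append([0.5 * own + 0.5 * msg for own, msg in zip(feat, mean)])
    return updated


def association_likelihoods(features, edges):
    """Score every edge in (0, 1]; closer node features score higher."""
    scores = {}
    for i, j in edges:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features[i], features[j])))
        scores[(i, j)] = math.exp(-dist)
    return scores


features = [descriptor(det) for det in DETECTIONS]
edges = build_edges(DETECTIONS, window=1)
scores = association_likelihoods(message_passing_step(features, edges), edges)
for edge, score in sorted(scores.items()):
    print(edge, round(score, 3))
```

In this toy setup, the edges that link the same object across the two timesteps receive higher scores than the cross-object edges. Widening `window` adds edges that span more timesteps, which loosely mirrors the flexibility in temporal window size mentioned above.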

updated: Thu Jan 28 2021 19:59:10 GMT+0000 (UTC)

published: Mon Jan 11 2021 21:52:25 GMT+0000 (UTC)

arXiv
