Fully-Convolutional Siamese Networks for Object Tracking

Luca Bertinetto; Jack Valmadre; João F. Henriques; Andrea Vedaldi; Philip H. S. Torr

The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object's appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves state-of-the-art performance in multiple benchmarks.

updated: Wed Dec 01 2021 19:21:43 GMT+0000 (UTC)

published: Thu Jun 30 2016 16:00:43 GMT+0000 (UTC)

arXiv
