SlowFast Networks for Video Recognition

Christoph Feichtenhofer; Haoqi Fan; Jitendra Malik; Kaiming He

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast
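The two-rate sampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the values `tau=16` (Slow pathway temporal stride), `alpha=8` (Fast-to-Slow frame rate ratio), and `beta=1/8` (Fast-to-Slow channel ratio) are assumptions chosen for the example and are not stated in the abstract.

```python
import numpy as np


def slowfast_frame_indices(num_frames, tau=16, alpha=8):
    """Pick frame indices for the two pathways from one raw clip.

    tau   -- hypothetical temporal stride of the Slow pathway
    alpha -- hypothetical ratio of Fast to Slow frame rates
    """
    if tau % alpha != 0:
        raise ValueError("tau must be divisible by alpha")
    slow_idx = np.arange(0, num_frames, tau)           # sparse sampling
    fast_idx = np.arange(0, num_frames, tau // alpha)  # dense sampling
    return slow_idx, fast_idx


def fast_pathway_channels(slow_channels, beta=1 / 8):
    """Channel width of the Fast pathway, reduced by a hypothetical ratio beta."""
    return max(1, int(slow_channels * beta))


# A dummy clip laid out as (time, height, width, channels).
clip = np.zeros((64, 8, 8, 3), dtype=np.uint8)

slow_idx, fast_idx = slowfast_frame_indices(clip.shape[0])
slow_clip = clip[slow_idx]  # few frames, intended for spatial semantics
fast_clip = clip[fast_idx]  # many frames, intended for fine-grained motion

print(slow_clip.shape, fast_clip.shape, fast_pathway_channels(64))
```

Under these assumed values, the Fast pathway sees `alpha` times as many frames as the Slow pathway but uses only a fraction `beta` of its channels, which is one way to read the abstract's claim that the Fast pathway can stay lightweight.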

updated: Tue Oct 29 2019 06:26:37 GMT+0000 (UTC)

published: Mon Dec 10 2018 18:59:07 GMT+0000 (UTC)

arXiv

