Evaluating Transformers for Lightweight Action Recognition

Raivo Koot; Markus Hennerbichler; Haiping Lu

In video action recognition, transformers consistently reach state-of-the-art accuracy. However, many models are too heavyweight for the average researcher with limited hardware resources. In this work, we explore the limitations of video transformers for lightweight action recognition. We benchmark 13 video transformers and baselines across 3 large-scale datasets and 10 hardware devices. Our study is the first to evaluate the efficiency of action recognition models in depth across multiple devices and train a wide range of video transformers under the same conditions. We categorize current methods into three classes and show that composite transformers that augment convolutional backbones are best at lightweight action recognition, despite lacking accuracy. Meanwhile, attention-only models need more motion modeling capabilities and stand-alone attention block models currently incur too much latency overhead. Our experiments conclude that current video transformers are not yet capable of lightweight action recognition on par with traditional convolutional baselines, and that the previously mentioned shortcomings need to be addressed to bridge this gap. Code to reproduce our experiments will be made publicly available.

updated: Tue Dec 07 2021 21:12:24 GMT+0000 (UTC)

published: Thu Nov 18 2021 11:45:42 GMT+0000 (UTC)

arXiv
