FlowFormer: A Transformer Architecture for Optical Flow

Zhaoyang Huang; Xiaoyu Shi; Chao Zhang; Qiang Wang; Ka Chun Cheung; Hongwei Qin; Jifeng Dai; Hongsheng Li

We introduce the optical Flow transFormer, dubbed FlowFormer, a transformer-based neural network architecture for learning optical flow. FlowFormer tokenizes the 4D cost volume built from an image pair, encodes the cost tokens into a cost memory with alternate-group transformer (AGT) layers in a novel latent space, and decodes the cost memory via a recurrent transformer decoder with dynamic positional cost queries. On the Sintel benchmark, FlowFormer achieves 1.159 and 2.088 average end-point error (AEPE) on the clean and final passes, respectively, a 16.5% and 15.5% error reduction from the best published results (1.388 and 2.47). FlowFormer also generalizes well: without being trained on Sintel, it achieves 0.64 and 1.50 AEPE on the clean and final passes of the Sintel training set, reducing the error of the best published results (1.29 and 2.74) by 50.4% and 45.3%.
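To make the metric and the quoted percentages concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes AEPE is the mean Euclidean distance between predicted and ground-truth flow vectors, and that "error reduction" means the relative decrease with respect to the previous best result. The function names `aepe` and `relative_reduction` and the toy flow vectors are illustrative assumptions; only the entries in `reported` come from the abstract.

```python
import math


def aepe(pred, gt):
    """Average end-point error over paired 2D flow vectors (u, v)."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = sum(
        math.hypot(pu - gu, pv - gv)
        for (pu, pv), (gu, gv) in zip(pred, gt)
    )
    return total / len(pred)


def relative_reduction(baseline, new):
    """Relative error reduction of `new` versus `baseline`, in percent."""
    return 100.0 * (baseline - new) / baseline


# Toy example (hypothetical values): one exact prediction and one that
# is off by (3, 4), i.e. an end-point error of 5 pixels.
toy_pred = [(1.0, 0.0), (0.0, 2.0)]
toy_gt = [(1.0, 0.0), (3.0, 6.0)]
toy_aepe = aepe(toy_pred, toy_gt)

# (previous best AEPE, FlowFormer AEPE) as quoted in the abstract.
reported = {
    "benchmark clean": (1.388, 1.159),
    "benchmark final": (2.47, 2.088),
    "generalization clean": (1.29, 0.64),
    "generalization final": (2.74, 1.50),
}
reductions = {
    name: round(relative_reduction(baseline, new), 1)
    for name, (baseline, new) in reported.items()
}

if __name__ == "__main__":
    print("toy AEPE:", toy_aepe)
    for name, pct in reductions.items():
        print(f"{name}: {pct}% lower error")
```

Under these assumptions, the computed reductions match the 16.5%, 15.5%, 50.4%, and 45.3% figures stated in the abstract.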

updated: Mon Aug 29 2022 11:23:51 GMT+0000 (UTC)

published: Wed Mar 30 2022 10:33:09 GMT+0000 (UTC)

arXiv
