Normalized Convolution Upsampling for Refined Optical Flow Estimation

Abdelrahman Eldesokey; Michael Felsberg

Optical flow is a regression task where convolutional neural networks (CNNs) have led to major breakthroughs. However, this comes at the cost of major computational demands due to the use of cost volumes and pyramidal representations. This has been mitigated by producing flow predictions at a quarter of the resolution, which are then upsampled using bilinear interpolation at test time. Consequently, fine details are usually lost and post-processing is needed to restore them. We propose the Normalized Convolution UPsampler (NCUP), an efficient joint upsampling approach that produces the full-resolution flow during the training of optical flow CNNs. Our approach formulates the upsampling task as a sparse problem and employs normalized convolutional neural networks to solve it. We evaluate our upsampler against existing joint upsampling approaches when trained end-to-end with a coarse-to-fine optical flow CNN (PWCNet), and we show that it outperforms all other approaches on the FlyingChairs dataset while having at least one order of magnitude fewer parameters. Moreover, we test our upsampler with a recurrent optical flow CNN (RAFT) and achieve state-of-the-art results on the Sintel benchmark with ~6% error reduction, and on-par results on the KITTI dataset, while having 7.5% fewer parameters (see Figure 1). Finally, our upsampler shows better generalization capabilities than RAFT when trained and evaluated on different datasets.
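The idea of treating upsampling as a sparse problem can be illustrated with a minimal sketch. It is not the NCUP architecture itself: the abstract does not specify the network, so the code below assumes a fixed, hand-picked 3x3 applicability kernel and binary confidences, whereas a normalized CNN would learn these. The function names (`scatter_to_sparse_grid`, `normalized_convolution`, `upsample_channel`) are hypothetical and chosen for illustration only. Each coarse flow value is placed on the fine grid with confidence 1, all other pixels get confidence 0, and a normalized convolution fills in the missing pixels as a confidence-weighted average.

```python
# Minimal sketch of upsampling as a sparse problem solved by normalized
# convolution. Assumptions (not from the paper): fixed kernel, binary
# confidences, one flow channel processed at a time.

def scatter_to_sparse_grid(coarse, scale):
    """Place coarse samples on a zero-filled fine grid.

    Returns (values, conf), where conf is 1.0 at known pixels and 0.0 elsewhere.
    """
    h, w = len(coarse), len(coarse[0])
    H, W = h * scale, w * scale
    values = [[0.0] * W for _ in range(H)]
    conf = [[0.0] * W for _ in range(H)]
    for i in range(h):
        for j in range(w):
            values[i * scale][j * scale] = float(coarse[i][j])
            conf[i * scale][j * scale] = 1.0
    return values, conf


def normalized_convolution(values, conf, kernel, eps=1e-8):
    """Compute sum(w * c * x) / (sum(w * c) + eps) around every pixel.

    kernel must be square, odd-sized, and non-negative.
    """
    H, W = len(values), len(values[0])
    r = len(kernel) // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            num = 0.0
            den = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        wgt = kernel[dy + r][dx + r] * conf[yy][xx]
                        num += wgt * values[yy][xx]
                        den += wgt
            out[y][x] = num / (den + eps)
    return out


def upsample_channel(coarse, scale=2):
    """Upsample one flow channel (u or v) by an integer factor."""
    # Hand-picked kernel for scale=2; a learned kernel would replace this.
    kernel = [[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [1.0, 2.0, 1.0]]
    values, conf = scatter_to_sparse_grid(coarse, scale)
    return normalized_convolution(values, conf, kernel)


if __name__ == "__main__":
    coarse_u = [[1.0, 3.0],
                [5.0, 7.0]]
    for row in upsample_channel(coarse_u, scale=2):
        print([round(v, 3) for v in row])
```

Known samples are reproduced (up to the small `eps`), while pixels between samples receive a weighted average of their confident neighbours only. With larger scale factors the fixed 3x3 kernel would no longer reach a known sample from every pixel, so a wider kernel or several stacked layers would be needed.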

updated: Sat Feb 13 2021 18:34:03 GMT+0000 (UTC)

published: Sat Feb 13 2021 18:34:03 GMT+0000 (UTC)

arXiv
