MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos

Mathias Parger; Chengcheng Tang; Thomas Neff; Christopher D. Twigg; Cem Keskin; Robert Wang; Markus Steinberger

Convolutional neural network inference on video input is computationally expensive and requires high memory bandwidth. Recently, DeltaCNN managed to reduce the cost by processing only pixels with significant updates over the previous frame. However, DeltaCNN relies on static camera input. Moving cameras add a new challenge: newly unveiled image regions must be fused efficiently with already processed regions to minimize the update rate, without increasing memory overhead and without knowing the camera extrinsics of future frames. In this work, we propose MotionDeltaCNN, a sparse CNN inference framework that supports moving cameras. We introduce spherical buffers and padded convolutions to enable seamless fusion of newly unveiled regions and previously processed regions without increasing the memory footprint. Our evaluation shows that we outperform DeltaCNN by up to 90% for moving camera videos.
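As a rough illustration of the two ideas named in the abstract, the sketch below shows (1) keeping only pixels whose change since the previous frame exceeds a threshold and (2) a fixed-size buffer addressed with wrapped coordinates, so that a camera shift reuses the memory of regions that left the view for newly unveiled ones. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `sparse_delta` and `WrapAroundBuffer`, the scalar threshold, and the reading of "spherical buffer" as modulo addressing are all hypothetical.

```python
# Illustrative sketch only. All names and the thresholding rule are
# assumptions made for this example, not the MotionDeltaCNN API.


def sparse_delta(prev, curr, threshold):
    """Return {(row, col): delta} for pixels whose change exceeds threshold."""
    updates = {}
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            d = q - p
            if abs(d) > threshold:
                updates[(r, c)] = d
    return updates


class WrapAroundBuffer:
    """Fixed-size 2D buffer addressed modulo its size.

    Assumed reading of a "spherical buffer": world coordinates wrap around,
    so moving the view never copies data or grows memory.
    """

    def __init__(self, height, width, fill=0.0):
        self.height = height
        self.width = width
        self.data = [[fill] * width for _ in range(height)]
        self.origin = (0, 0)  # world coordinate of the view's top-left cell

    def _slot(self, world_r, world_c):
        # Python's % is non-negative for a positive modulus, so negative
        # world coordinates also map to a valid slot.
        return world_r % self.height, world_c % self.width

    def read(self, world_r, world_c):
        r, c = self._slot(world_r, world_c)
        return self.data[r][c]

    def write(self, world_r, world_c, value):
        r, c = self._slot(world_r, world_c)
        self.data[r][c] = value

    def move(self, d_r, d_c, fill=0.0):
        """Shift the view by (d_r, d_c).

        Cells that were not visible before the shift are reset to `fill`
        and returned as a list of world coordinates.
        """
        old_r0, old_c0 = self.origin
        new_r0, new_c0 = old_r0 + d_r, old_c0 + d_c
        self.origin = (new_r0, new_c0)
        unveiled = []
        for wr in range(new_r0, new_r0 + self.height):
            for wc in range(new_c0, new_c0 + self.width):
                was_visible = (old_r0 <= wr < old_r0 + self.height
                               and old_c0 <= wc < old_c0 + self.width)
                if not was_visible:
                    self.write(wr, wc, fill)
                    unveiled.append((wr, wc))
        return unveiled
```

In this sketch, only the cells returned by `move` would need dense processing after a camera shift; the remaining cells keep their stored values and would receive sparse delta updates only.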

updated: Mon Aug 14 2023 20:24:24 GMT+0000 (UTC)

published: Tue Oct 18 2022 14:23:05 GMT+0000 (UTC)

arXiv
