MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive Learning

Zhifeng Ma; Hao Zhang; Jie Liu

Spatiotemporal predictive learning, which uses deep learning to predict future frames from historical data, is widely used in many fields. Previous work has improved model performance mainly by widening or deepening the network, but this sharply increases memory overhead, which seriously hinders the development and application of the technology. To improve performance without increasing memory consumption, we focus on scale, another dimension along which model performance can be improved while keeping memory requirements low. Its effectiveness has been widely demonstrated in many CNN-based tasks such as image classification and semantic segmentation, but it has not been fully explored in recent RNN models. In this paper, drawing on the benefits of multi-scale design, we propose a general framework named Multi-Scale RNN (MS-RNN) to boost recent RNN models for spatiotemporal predictive learning. By integrating different scales, we enhance existing models with both improved performance and greatly reduced overhead. We verify the MS-RNN framework through exhaustive experiments with eight popular RNN models (ConvLSTM, TrajGRU, PredRNN, PredRNN++, MIM, MotionRNN, PredRNN-V2, and PrecipLSTM) on four different datasets (Moving MNIST, TaxiBJ, KTH, and Germany). The results show that RNN models incorporating our framework have much lower memory cost and better performance than before. Our code is released at https://github.com/mazhf/MS-RNN.
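To make the memory argument concrete, the following minimal sketch counts hidden-state elements for a stack of recurrent layers, once with every layer at full resolution and once with some layers running on downsampled feature maps. The function name `hidden_state_elements`, the six-layer stack, the scale pattern `[1, 2, 4, 4, 2, 1]`, and the constant channel count are all illustrative assumptions, not details taken from the abstract; the sketch only shows why operating at coarser scales can lower memory use.

```python
def hidden_state_elements(height, width, channels, scales):
    """Count hidden-state elements for a stack of recurrent layers.

    scales[i] is the spatial downsampling factor of layer i
    (1 = full resolution, 2 = half height and width, and so on).
    The channel count is assumed constant across layers.
    """
    total = 0
    for s in scales:
        total += channels * (height // s) * (width // s)
    return total


# Hypothetical configuration: 64x64 feature maps with 64 channels.
H, W, C = 64, 64, 64

# Baseline: six layers, all at full resolution.
single_scale = hidden_state_elements(H, W, C, [1, 1, 1, 1, 1, 1])

# Multi-scale: the middle layers work on downsampled maps (assumed layout).
multi_scale = hidden_state_elements(H, W, C, [1, 2, 4, 4, 2, 1])

ratio = multi_scale / single_scale
print(single_scale, multi_scale, ratio)
```

Under these assumptions, the multi-scale stack holds fewer than half as many hidden-state elements as the single-scale stack, because halving the height and width of a feature map cuts its element count by a factor of four. Actual savings depend on the model and on how the scales are connected.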

updated: Fri Apr 28 2023 02:44:47 GMT+0000 (UTC)

published: Tue Jun 07 2022 04:57:58 GMT+0000 (UTC)

arXiv
