OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

Cheng Tan; Siyuan Li; Zhangyang Gao; Wenfei Guan; Zedong Wang; Zicheng Liu; Lirong Wu; Stan Z. Li

Spatio-temporal predictive learning is a learning paradigm that enables models to learn spatial and temporal patterns by predicting future frames from given past frames in an unsupervised manner. Despite remarkable progress in recent years, a systematic understanding is still lacking because of diverse settings, complex implementations, and poor reproducibility. Without standardization, comparisons can be unfair and insights inconclusive. To address this dilemma, we propose OpenSTL, a comprehensive benchmark for spatio-temporal predictive learning that categorizes prevalent approaches into recurrent-based and recurrent-free models. OpenSTL provides a modular and extensible framework implementing various state-of-the-art methods. We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectories, human motion, driving scenes, traffic flow, and weather forecasting. Based on our observations, we provide a detailed analysis of how model architecture and dataset properties affect spatio-temporal predictive learning performance. Surprisingly, we find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models. Thus, we further extend the common MetaFormers to boost recurrent-free spatio-temporal predictive learning. We open-source the code and models at https://github.com/chengtan9907/OpenSTL.
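To make the task setup concrete, the following is a minimal sketch of the past-frames-to-future-frames formulation and of the two model families the abstract distinguishes. It is not the OpenSTL API: the function names (`split_past_future`, `recurrent_rollout`, `recurrent_free_predict`), the toy clip, and the hand-written shift "models" are all assumptions made for illustration, and nothing here is learned.

```python
import numpy as np


def split_past_future(video, n_past):
    """Split a (T, H, W) clip into observed frames and prediction targets."""
    return video[:n_past], video[n_past:]


def recurrent_rollout(past, n_future, step):
    """Recurrent-style prediction: generate one frame at a time and feed
    each prediction back in as the input for the next step."""
    last = past[-1]
    preds = []
    for _ in range(n_future):
        last = step(last)
        preds.append(last)
    return np.stack(preds)


def recurrent_free_predict(past, n_future, predictor):
    """Recurrent-free prediction: map the observed frames to all future
    frames in a single call, with no frame-by-frame feedback loop."""
    return predictor(past, n_future)


def mse(pred, target):
    """Mean squared error between predicted and ground-truth frames."""
    return float(np.mean((pred - target) ** 2))


# Toy clip (hypothetical data): one bright pixel moving one column right per frame.
T, H, W = 6, 4, 8
video = np.zeros((T, H, W))
for t in range(T):
    video[t, 1, t] = 1.0

past, future = split_past_future(video, n_past=3)

# Hand-written stand-ins for trained models: both simply shift the last frame.
shift_step = lambda frame: np.roll(frame, 1, axis=1)
shift_all = lambda frames, n: np.stack(
    [np.roll(frames[-1], k, axis=1) for k in range(1, n + 1)]
)

pred_recurrent = recurrent_rollout(past, len(future), shift_step)
pred_recurrent_free = recurrent_free_predict(past, len(future), shift_all)

print(mse(pred_recurrent, future), mse(pred_recurrent_free, future))
```

The targets are simply the later frames of the same clip, so no labels are needed, which is what "unsupervised" refers to in the abstract. In a real model the shift functions would be replaced by a trained recurrent cell or a trained one-shot network, respectively.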

updated: Wed Oct 18 2023 00:02:52 GMT+0000 (UTC)

published: Tue Jun 20 2023 03:02:14 GMT+0000 (UTC)

arXiv
