A Neurally-Inspired Hierarchical Prediction Network for Spatiotemporal Sequence Learning and Prediction

Jielin Qiu; Ge Huang; Tai Sing Lee

In this paper, we developed a hierarchical network model, called the Hierarchical Prediction Network (HPNet), to understand how spatiotemporal memories might be learned and encoded in the recurrent circuits of the visual cortical hierarchy for predicting future video frames. This neurally inspired model operates in the analysis-by-synthesis framework. It contains a feed-forward path that computes and encodes spatiotemporal features of successive complexity, and a feedback path through which successive levels project their interpretations to the level below. Within each level, the feed-forward and feedback paths intersect in a recurrent gated circuit, instantiated as an LSTM module, to generate a prediction or explanation of the incoming signals. The network learns its internal model of the world by minimizing the errors of its predictions of the incoming signals at each level of the hierarchy. We found that hierarchical interaction in the network increases semantic clustering of global movement patterns in the population codes of the units along the hierarchy, even in the earliest module. This facilitates the learning of relationships among movement patterns, yielding state-of-the-art performance in long-range video sequence prediction on benchmark datasets. The network model automatically reproduces a variety of prediction suppression and familiarity suppression neurophysiological phenomena observed in the visual cortex, suggesting that hierarchical prediction might indeed be an important principle for representational learning in the visual cortex.
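The data flow described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a two-level hierarchy, dense (not convolutional) layers, flattened frames, a `tanh` feature map for the feed-forward path, and the previous hidden state of the upper level as the feedback signal. The names `LSTMCell`, `Level`, `run_sequence`, and all sizes are hypothetical. The sketch only computes the per-level prediction errors; it does not train the weights.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class LSTMCell:
    """Minimal dense LSTM cell (hypothetical helper, not the paper's module)."""

    def __init__(self, input_size, hidden_size, rng):
        # One weight matrix for the four gates, applied to [input, hidden].
        self.W = rng.normal(0.0, 0.1, (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        return h_new, c_new


class Level:
    """One level: bottom-up and top-down signals meet in a recurrent gated unit."""

    def __init__(self, in_size, hidden_size, top_size, rng):
        self.cell = LSTMCell(in_size + top_size, hidden_size, rng)
        self.readout = rng.normal(0.0, 0.1, (in_size, hidden_size))
        self.h = np.zeros(hidden_size)
        self.c = np.zeros(hidden_size)
        self.prediction = np.zeros(in_size)  # prediction of the next incoming signal

    def step(self, bottom_up, top_down):
        # Error of the prediction made at the previous time step.
        error = bottom_up - self.prediction
        self.h, self.c = self.cell.step(
            np.concatenate([bottom_up, top_down]), self.h, self.c
        )
        # New prediction for the signal expected at the next time step.
        self.prediction = self.readout @ self.h
        return error


def run_sequence(frames, feature_size=8, hidden_size=12, seed=0):
    """Return an array of shape (T, 2): mean squared prediction error per level."""
    rng = np.random.default_rng(seed)
    frame_size = frames.shape[1]
    low = Level(frame_size, hidden_size, top_size=hidden_size, rng=rng)
    high = Level(feature_size, hidden_size, top_size=0, rng=rng)
    W_ff = rng.normal(0.0, 0.1, (feature_size, frame_size))  # assumed feature map
    losses = []
    for frame in frames:
        feedback = high.h.copy()  # feedback path: upper level's previous state
        err_low = low.step(frame, feedback)
        features = np.tanh(W_ff @ frame)  # feed-forward path to the upper level
        err_high = high.step(features, np.zeros(0))  # top level gets no feedback
        losses.append([np.mean(err_low ** 2), np.mean(err_high ** 2)])
    return np.array(losses)


if __name__ == "__main__":
    frames = np.random.default_rng(1).normal(size=(5, 16))
    print(run_sequence(frames))
```

In a trainable version, the sum of these per-level errors would serve as the loss to be minimized with an automatic differentiation framework, which is the learning principle the abstract describes; the choice of optimizer and loss weighting is left open here.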

updated: Fri Oct 01 2021 12:59:31 GMT+0000 (UTC)

published: Fri Jan 25 2019 18:03:17 GMT+0000 (UTC)

arXiv
