Dynamic Graph Message Passing Networks

Li Zhang; Mohan Chen; Anurag Arnab; Xiangyang Xue; Philip H. S. Torr

Modelling long-range dependencies is critical for scene understanding tasks in computer vision. Although convolutional neural networks (CNNs) have excelled in many vision tasks, they are still limited in capturing long-range structured relationships, as they typically consist of layers of local kernels. A fully-connected graph, such as the self-attention operation in Transformers, is beneficial for such modelling; however, its computational overhead is prohibitive. In this paper, we propose a dynamic graph message passing network that significantly reduces the computational complexity compared to related works modelling a fully-connected graph. This is achieved by adaptively sampling nodes in the graph, conditioned on the input, for message passing. Based on the sampled nodes, we dynamically predict node-dependent filter weights and the affinity matrix for propagating information between them. This formulation allows us to design a self-attention module and, more importantly, a new Transformer-based backbone network, which we use both for image classification pretraining and for addressing various downstream tasks (e.g. object detection, instance segmentation, and semantic segmentation). Using this model, we show significant improvements with respect to strong, state-of-the-art baselines on four different tasks. Our approach also outperforms fully-connected graphs while using substantially fewer floating-point operations and parameters. Code and models will be made publicly available at https://github.com/fudan-zvg/DGMN2.
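To make the complexity argument concrete, the following is a minimal sketch of message passing over a sampled subset of nodes rather than a fully-connected graph. It is not the authors' implementation: the function names, the uniform random sampling, the dot-product affinity, and the residual update are all illustrative assumptions. In particular, the abstract describes sampling that is conditioned on the input and filter weights that are predicted per node, neither of which is modelled here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def sampled_message_passing(features, num_samples, rng):
    """One round of message passing in which each node receives messages
    from only `num_samples` other nodes instead of from every node.

    Assumption: neighbours are drawn uniformly at random as a stand-in
    for the input-conditioned sampling described in the abstract.
    Requires num_samples <= len(features) - 1.
    """
    n = len(features)
    updated = []
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        neighbours = rng.sample(candidates, num_samples)
        # Affinity between node i and each sampled neighbour (hypothetical
        # choice: softmax-normalised dot product).
        weights = softmax([dot(features[i], features[j]) for j in neighbours])
        dim = len(features[i])
        # Message = affinity-weighted sum of the sampled neighbours' features.
        message = [
            sum(w * features[j][d] for w, j in zip(weights, neighbours))
            for d in range(dim)
        ]
        # Residual update: add the aggregated message to the node's own feature.
        updated.append([f + m for f, m in zip(features[i], message)])
    return updated


def pairwise_cost(num_nodes, num_samples=None):
    """Number of node pairs scored: N * N for a full graph, N * K when
    each node attends to K sampled nodes."""
    k = num_nodes if num_samples is None else num_samples
    return num_nodes * k
```

Under these assumptions, a 64 x 64 feature map has N = 4096 nodes, so `pairwise_cost(4096)` counts N * N pairs for the fully-connected case, while `pairwise_cost(4096, 9)` counts N * 9 pairs when each node samples nine others. The sampled variant scales linearly in N for a fixed sample count, which is the kind of saving the abstract refers to.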

updated: Sun May 01 2022 05:12:09 GMT+0000 (UTC)

published: Mon Aug 19 2019 17:46:34 GMT+0000 (UTC)

arXiv
