On the Importance of Sampling in Training GCNs: Tighter Analysis and Variance Reduction

Weilin Cong; Morteza Ramezani; Mehrdad Mahdavi

Graph Convolutional Networks (GCNs) have achieved impressive empirical advances across a wide variety of semi-supervised node classification tasks. Despite their great success, training GCNs on large graphs suffers from computational and memory issues. A potential path to circumvent these obstacles is sampling-based methods, where a subset of nodes is sampled at each layer. Although recent studies have empirically demonstrated the effectiveness of sampling-based methods, these works lack theoretical convergence guarantees under realistic settings and cannot fully leverage the information of evolving parameters during optimization. In this paper, we describe and analyze a general doubly variance reduction schema that can accelerate any sampling method under a memory budget. The motivating impetus for the proposed schema is a careful analysis of the variance of sampling methods, which shows that the induced variance can be decomposed into node embedding approximation variance (zeroth-order variance) during forward propagation and layerwise-gradient variance (first-order variance) during backward propagation. We theoretically analyze the convergence of the proposed schema and show that it enjoys an O(1/T) convergence rate. We complement our theoretical results by integrating the proposed schema into different sampling methods and applying them to different large real-world graphs.
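The zeroth-order variance named in the abstract can be made concrete with a small experiment. The sketch below is not the paper's algorithm and does not implement the doubly variance reduction schema; it only illustrates, under simplifying assumptions, how sampling neighbors during forward propagation makes node embeddings noisy. The assumptions are a single mean-aggregation layer, one scalar feature per node, and uniform neighbor sampling with replacement. All function names (`make_toy_graph`, `full_aggregate`, `sampled_aggregate`, `embedding_variance`) are hypothetical and chosen for illustration.

```python
import random

def make_toy_graph(n=30, p=0.3, seed=1):
    """Random undirected graph with self-loops and one scalar feature per node."""
    rng = random.Random(seed)
    neighbors = [{v} for v in range(n)]  # self-loop keeps every neighborhood non-empty
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].add(v)
                neighbors[v].add(u)
    feats = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sorted(s) for s in neighbors], feats

def full_aggregate(neighbors, feats):
    """Exact mean aggregation over each node's full neighborhood."""
    return [sum(feats[u] for u in nbrs) / len(nbrs) for nbrs in neighbors]

def sampled_aggregate(neighbors, feats, num_samples, rng):
    """Unbiased estimate: average over neighbors drawn uniformly with replacement."""
    return [
        sum(feats[rng.choice(nbrs)] for _ in range(num_samples)) / num_samples
        for nbrs in neighbors
    ]

def embedding_variance(neighbors, feats, num_samples, trials=300, seed=0):
    """Squared error of sampled vs. exact embeddings, averaged over trials."""
    rng = random.Random(seed)
    exact = full_aggregate(neighbors, feats)
    total = 0.0
    for _ in range(trials):
        approx = sampled_aggregate(neighbors, feats, num_samples, rng)
        total += sum((a - e) ** 2 for a, e in zip(approx, exact))
    return total / trials

neighbors, feats = make_toy_graph()
var_small = embedding_variance(neighbors, feats, num_samples=2)
var_large = embedding_variance(neighbors, feats, num_samples=8)
print("2 samples per node:", var_small)
print("8 samples per node:", var_large)
```

In this toy setting, drawing more neighbors per node shrinks the embedding approximation error, at the price of more computation and memory per layer. That trade-off is the one the abstract refers to when it speaks of accelerating sampling methods under a memory budget.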

updated: Mon Nov 01 2021 17:26:18 GMT+0000 (UTC)

published: Wed Mar 03 2021 21:31:23 GMT+0000 (UTC)

arXiv
