Analysis and algorithms for ℓ_p-based semi-supervised learning on graphs

Mauricio Flores; Jeff Calder; Gilad Lerman

This paper addresses the theory and applications of ℓ_p-based Laplacian regularization in semi-supervised learning. The graph p-Laplacian for p > 2 has recently been proposed as a replacement for the standard (p = 2) graph Laplacian in semi-supervised learning problems with very few labels, where Laplacian learning is degenerate. In the first part of the paper, we prove new discrete-to-continuum convergence results for p-Laplace problems on k-nearest neighbor (k-NN) graphs, which are more commonly used in practice than random geometric graphs. Our analysis shows that, on k-NN graphs, the p-Laplacian retains information about the data distribution as p → ∞, and Lipschitz learning (p = ∞) is sensitive to the data distribution. This contrasts with random geometric graphs, where the p-Laplacian forgets the data distribution as p → ∞. We also present a general framework for proving discrete-to-continuum convergence results in graph-based learning that requires only pointwise consistency and monotonicity. In the second part of the paper, we develop fast algorithms for solving the variational and game-theoretic p-Laplace equations on weighted graphs for p > 2. We present several efficient and scalable algorithms for both formulations, together with numerical results on synthetic data indicating their convergence properties. Finally, we conduct extensive numerical experiments on the MNIST, FashionMNIST and EMNIST datasets that illustrate the effectiveness of the p-Laplacian formulation for semi-supervised learning with few labels. In particular, we find that Lipschitz learning (p = ∞) performs well with very few labels on k-NN graphs, which experimentally validates our theoretical finding that Lipschitz learning retains information about the data distribution (the unlabeled data) on k-NN graphs.
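To make the two endpoint cases named in the abstract concrete, here is a minimal Python sketch of Laplacian learning (p = 2) and Lipschitz learning (p = ∞) on a small k-NN graph. It is an illustration under stated assumptions, not the algorithms developed in the paper: the graph is symmetrized and unweighted, the p = ∞ problem is approximated by the classical fixed-point sweep u_i = (max + min)/2 over neighbors, and the function names `knn_adjacency`, `laplace_learning`, and `lipschitz_learning` are hypothetical.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric, unweighted k-NN adjacency matrix (dense; small n only)."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k]     # k closest points for each row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)              # symmetrize (assumption)

def laplace_learning(W, labeled, values):
    """p = 2: solve the graph Laplace equation with labels as boundary data."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    unl = np.setdiff1d(np.arange(n), labeled)
    u = np.zeros(n)
    u[labeled] = values
    rhs = -L[np.ix_(unl, labeled)] @ values
    u[unl] = np.linalg.solve(L[np.ix_(unl, unl)], rhs)
    return u

def lipschitz_learning(W, labeled, values, iters=500):
    """p = infinity on an unweighted graph: sweep u_i = (max + min) / 2."""
    n = W.shape[0]
    u = np.full(n, values.mean())          # arbitrary initial guess
    u[labeled] = values
    unl = np.setdiff1d(np.arange(n), labeled)
    nbrs = [np.flatnonzero(W[i]) for i in range(n)]
    for _ in range(iters):
        for i in unl:                      # in-place (Gauss-Seidel style) update
            u[i] = 0.5 * (u[nbrs[i]].max() + u[nbrs[i]].min())
    return u

# Five points on a line; only the two endpoints carry labels.
X = np.array([[0.0], [1.0], [2.1], [3.3], [4.6]])
W = knn_adjacency(X, k=1)
labeled = np.array([0, 4])
values = np.array([0.0, 1.0])
u2 = laplace_learning(W, labeled, values)
uinf = lipschitz_learning(W, labeled, values)
```

On this toy example the 1-NN graph is a path, so both methods interpolate the two labels linearly along it. The differences the abstract describes appear only on larger graphs with very few labels, which this sketch does not attempt to reproduce.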

updated: Thu Jan 27 2022 15:32:32 GMT+0000 (UTC)

published: Tue Jan 15 2019 20:03:12 GMT+0000 (UTC)

arXiv
