Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph Clustering

Yiming Wang; Dongxia Chang; Zhiqian Fu; Yao Zhao

Graph clustering, which partitions data using graph information, has recently attracted considerable research interest. However, a limitation of most graph-based methods is that they assume the graph structure they operate on is fixed and reliable. Moreover, a graph inevitably contains some edges that are not conducive to clustering, which we call spurious edges. This paper is the first attempt to employ a graph pooling technique for node clustering. We propose a novel dual graph embedding network (DGEN), designed as a two-step graph encoder connected by a graph pooling layer, to learn the graph embedding. Our model assumes that if a node and its nearest neighboring node are close to the same cluster center, the node is informative and the edge between them can be considered cluster-friendly. Based on this assumption, neighbor cluster pooling (NCPool) is devised to select the most informative subset of nodes and the corresponding edges based on the distances of nodes and their nearest neighbors to the cluster centers. This can effectively alleviate the impact of spurious edges on clustering. Finally, to obtain the clustering assignment of all nodes, a classifier is trained using the clustering results of the selected nodes. Experiments on five benchmark graph datasets demonstrate the superiority of the proposed method over state-of-the-art algorithms.
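The selection assumption stated in the abstract can be illustrated with a small sketch. This is not the authors' NCPool implementation: the abstract gives no formula, so the function name `select_informative_nodes`, the Euclidean distance, the additive score, and the `ratio` parameter are all assumptions made here for illustration. The sketch keeps a node only when it and its nearest graph neighbor share the same nearest cluster center, then ranks the surviving nodes by how close both lie to that center.

```python
import numpy as np

def select_informative_nodes(Z, A, centers, ratio=0.5):
    """Hypothetical selection rule inspired by the abstract (not the paper's code).

    Z: (n, d) node embeddings, A: (n, n) adjacency matrix,
    centers: (k, d) cluster centers, ratio: fraction of nodes to keep at most.
    Returns the kept node indices and the induced sub-adjacency matrix.
    """
    n = Z.shape[0]
    idx = np.arange(n)

    # Distance from every node to every cluster center, shape (n, k).
    d_center = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
    nearest_center = d_center.argmin(axis=1)

    # Pairwise embedding distances, restricted to graph neighbors (no self-loops).
    d_pair = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    mask = (A > 0) & ~np.eye(n, dtype=bool)
    d_pair = np.where(mask, d_pair, np.inf)
    has_neighbor = mask.any(axis=1)
    nn = d_pair.argmin(axis=1)  # nearest neighbor; ignored for isolated nodes

    # Assumption: a node is informative if it and its nearest neighbor
    # have the same nearest cluster center.
    agree = has_neighbor & (nearest_center == nearest_center[nn])

    # Assumed score: distance of the node plus distance of its nearest
    # neighbor to the shared center (lower is better).
    score = d_center[idx, nearest_center] + d_center[nn, nearest_center]
    score = np.where(agree, score, np.inf)

    k = max(1, int(ratio * n))
    keep = np.argsort(score, kind="stable")[:k]
    keep = np.sort(keep[np.isfinite(score[keep])])  # drop non-informative nodes
    return keep, A[np.ix_(keep, keep)]
```

The returned indices could then serve as the labeled set for the classifier mentioned in the abstract; how the paper actually scores nodes and prunes edges may differ from this sketch.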

updated: Tue Jun 08 2021 02:51:53 GMT+0000 (UTC)

published: Fri Apr 30 2021 06:51:51 GMT+0000 (UTC)

arXiv
