Deep clustering: On the link between discriminative models and K-means

Mohammed Jabi; Marco Pedersoli; Amar Mitiche; Ismail Ben Ayed

In recent deep clustering studies, discriminative models dominate the literature and report the most competitive performance. These models learn a deep discriminative neural network classifier in which the labels are latent. Typically, they use multinomial logistic regression posteriors and parameter regularization, as is common in supervised learning. It is generally acknowledged that discriminative objective functions (e.g., those based on the mutual information or the KL divergence) are more flexible than generative approaches (e.g., K-means), in the sense that they make fewer assumptions about the data distributions and typically yield much better unsupervised deep learning results. On the surface, several recent discriminative models may seem unrelated to K-means. This study shows that these models are, in fact, equivalent to K-means under mild conditions and for common posterior models and parameter regularization. We prove that, for the commonly used logistic regression posteriors, maximizing the L_2-regularized mutual information via an approximate alternating direction method (ADM) is equivalent to minimizing a soft and regularized K-means loss. Our theoretical analysis not only directly connects several recent state-of-the-art discriminative models to K-means, but also leads to a new soft and regularized deep K-means algorithm, which yields competitive performance on several image clustering benchmarks.
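To give some intuition for why a logistic regression posterior and K-means can be related at all, the sketch below implements a generic textbook soft K-means in NumPy. It is not the paper's algorithm: there is no deep network, no mutual information term, and no ADM step. The function names (`distance_posterior`, `linear_posterior`, `soft_kmeans`) and the `temperature` parameter are assumptions made for illustration. The one fact it demonstrates is a standard identity: a softmax over negative squared distances to the centers equals a softmax over logits that are linear in the input, because the `||x||^2` term is shared by all clusters and cancels.

```python
import numpy as np


def _softmax(logits):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def distance_posterior(X, centers, temperature=1.0):
    # Soft assignments from negative squared Euclidean distances, shape (n, k).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return _softmax(-d2 / temperature)


def linear_posterior(X, centers, temperature=1.0):
    # Same posterior written as multinomial logistic regression:
    # weights 2 * mu_k and biases -||mu_k||^2 (illustrative parameterization).
    logits = 2.0 * X @ centers.T - (centers ** 2).sum(axis=1)
    return _softmax(logits / temperature)


def soft_kmeans(X, k, temperature=1.0, n_iters=50, seed=0):
    # Generic soft K-means: alternate soft assignments and weighted means.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        resp = distance_posterior(X, centers, temperature)
        mass = resp.sum(axis=0)  # total soft weight per cluster
        centers = (resp.T @ X) / np.maximum(mass[:, None], 1e-12)
    return centers, distance_posterior(X, centers, temperature)


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
    centers, resp = soft_kmeans(X, k=2)
    print(resp.argmax(axis=1))
    print(np.allclose(resp, linear_posterior(X, centers)))
```

In this toy setting, a lower `temperature` pushes the soft assignments toward the hard assignments of ordinary K-means, while a higher one smooths them; how the paper's regularization enters its own loss is not reproduced here.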

updated: Sun Dec 15 2019 23:28:05 GMT+0000 (UTC)

published: Tue Oct 09 2018 21:17:09 GMT+0000 (UTC)

arXiv
