Collision Cross-entropy and EM Algorithm for Self-labeled Classification

Zhongwen Zhang; Yuri Boykov

We propose "collision cross-entropy" as a robust alternative to Shannon's cross-entropy in the context of self-labeled classification with posterior models. Assuming unlabeled data, self-labeling works by estimating latent pseudo-labels, categorical distributions y, that optimize some discriminative clustering criteria, e.g. "decisiveness" and "fairness". All existing self-labeled losses incorporate a Shannon's cross-entropy term that pulls the model prediction, softmax, toward the estimated distribution y. In effect, softmax is trained to mimic the uncertainty in y exactly. Instead, we propose the negative log-likelihood of "collision", which maximizes the probability of equality between two random variables represented by the distributions softmax and y. We show that our loss satisfies some properties of a generalized cross-entropy. Interestingly, it agrees with Shannon's cross-entropy for one-hot pseudo-labels y, but training from softer labels is weakened. For example, if y is a uniform distribution at some data point, that point contributes nothing to the training. Our self-labeling loss, combining collision cross-entropy with basic clustering criteria, is convex w.r.t. pseudo-labels but non-trivial to optimize over the probability simplex. We derive a practical EM algorithm that optimizes pseudo-labels y significantly faster than generic methods, e.g. projected gradient descent. The collision cross-entropy consistently improves the results on multiple self-labeled clustering examples using different DNNs.
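The two properties stated above (agreement with Shannon's cross-entropy for one-hot y, and no training signal from a uniform y) can be checked numerically with a minimal sketch. It assumes the two random variables are independent, so that the collision probability is the dot product of the two distributions and the loss is -ln Σ_k σ_k y_k; the abstract does not spell out this formula, so treat it as an assumption. The function names, the `eps` stabilizer, and the example vectors are illustrative and not taken from the paper.

```python
import numpy as np

def shannon_cross_entropy(y, sigma, eps=1e-12):
    """Shannon's cross-entropy: -sum_k y_k * ln(sigma_k)."""
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(-np.sum(y * np.log(sigma + eps)))

def collision_cross_entropy(y, sigma, eps=1e-12):
    """Negative log-probability that two draws coincide.

    Assumption: the draws from y and sigma are independent, so
    P(collision) = sum_k y_k * sigma_k.
    """
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(-np.log(np.dot(y, sigma) + eps))

# Two different (hypothetical) softmax predictions over 3 classes.
sigma_a = np.array([0.7, 0.2, 0.1])
sigma_b = np.array([0.1, 0.3, 0.6])

one_hot = np.array([1.0, 0.0, 0.0])   # fully confident pseudo-label
uniform = np.full(3, 1.0 / 3.0)       # maximally uncertain pseudo-label

# One-hot pseudo-label: both losses reduce to -ln(sigma_k) of the labeled class.
print(collision_cross_entropy(one_hot, sigma_a), shannon_cross_entropy(one_hot, sigma_a))

# Uniform pseudo-label: the collision loss is ln(K) whatever the prediction is,
# so it carries no gradient with respect to sigma.
print(collision_cross_entropy(uniform, sigma_a), collision_cross_entropy(uniform, sigma_b))

# Shannon's cross-entropy, by contrast, still depends on the prediction.
print(shannon_cross_entropy(uniform, sigma_a), shannon_cross_entropy(uniform, sigma_b))
```

Because the collision loss for a uniform y is the same for every prediction, such a point exerts no pull on the model, whereas Shannon's cross-entropy would push softmax toward the uniform distribution.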

updated: Mon Mar 13 2023 17:42:11 GMT+0000 (UTC)

published: Mon Mar 13 2023 17:42:11 GMT+0000 (UTC)

arXiv
