Evaluation of Confidence-based Ensembling in Deep Learning Image Classification

Rafael Rosales; Peter Popov; Michael Paulitsch

Ensembling is a successful technique for improving the performance of machine learning (ML) models. Conf-Ensemble is an adaptation of Boosting that builds ensembles based on model confidence instead of model errors, with the goal of better classifying difficult edge cases. The key idea is to create successive model experts for samples that were difficult (not necessarily incorrectly classified) for the preceding model. This technique has been shown to provide better results than boosting in binary classification with a small feature space (~80 features). In this paper, we evaluate the Conf-Ensemble approach on the much more complex task of image classification with the ImageNet dataset (224x224x3 features with 1000 classes). Image classification is an important benchmark for AI-based perception, so it helps to assess whether this method can be used in safety-critical applications that use ML ensembles. Our experiments indicate that in a complex multi-class classification task, the expected benefit of specialization on complex input samples cannot be achieved with a small sample set. That is, a good classifier seems to rely on very complex feature analysis that cannot be trained well on just a limited subset of "difficult samples". We propose an improvement to Conf-Ensemble that increases the number of samples fed to successive ensemble members. A three-member Conf-Ensemble using this improvement was able to surpass a single model in accuracy, although the improvement is not significant. Our findings shed light on the limits of the approach and the non-triviality of harnessing big data.
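The confidence-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the top-class probability as the confidence score, the 0.5 threshold, and the cascade-style routing at inference time are all assumptions made for illustration, since the abstract does not specify them.

```python
def difficult_indices(probs, threshold=0.5):
    """Return indices of samples whose top-class probability is below `threshold`.

    `probs` is a list of per-sample class-probability lists. Using the maximum
    probability as the confidence score is an assumption of this sketch.
    """
    return [i for i, p in enumerate(probs) if max(p) < threshold]


def cascade_predict(member_probs, threshold=0.5):
    """Combine ensemble members in order (hypothetical routing rule).

    `member_probs` holds one probability table per member, all over the same
    samples. Each sample takes the prediction of the first member that is
    confident on it; if none is, the last member decides.
    """
    n_samples = len(member_probs[0])
    predictions = []
    for i in range(n_samples):
        chosen = member_probs[-1][i]  # fallback: the last member
        for probs in member_probs[:-1]:
            if max(probs[i]) >= threshold:
                chosen = probs[i]
                break
        predictions.append(chosen.index(max(chosen)))
    return predictions


# Toy example: two members, three samples, two classes.
member_1 = [[0.9, 0.1], [0.55, 0.45], [0.4, 0.6]]
member_2 = [[0.2, 0.8], [0.1, 0.9], [0.8, 0.2]]

# Samples that member 1 finds difficult would form the training subset
# for member 2, whether or not member 1 classified them correctly.
hard = difficult_indices(member_1, threshold=0.7)
labels = cascade_predict([member_1, member_2], threshold=0.7)
```

In this toy example, the second and third samples fall below the threshold for the first member, so they would be passed on to the next member. The abstract's finding is that on ImageNet such a subset is too small to train a good successor, which motivates enlarging the sample set given to later members.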

updated: Fri Mar 03 2023 16:29:22 GMT+0000 (UTC)

published: Fri Mar 03 2023 16:29:22 GMT+0000 (UTC)

arXiv
