Efficient Adaptive Ensembling for Image Classification

Antonio Bruno; Davide Moroni; Massimo Martinelli

In recent times, with sporadic exceptions, the trend in Computer Vision has been to achieve minor improvements at the cost of considerable increases in complexity. To reverse this trend, we propose a novel method to boost image classification performance without an increase in complexity. To this end, we revisit ensembling, a powerful approach that is often underused because of the added complexity and training time it brings, and make it viable through specific design choices. First, we trained two EfficientNet-b0 models (known to be the architecture with the best overall accuracy/complexity trade-off in image classification) end-to-end on disjoint subsets of data (i.e. bagging). Then, we built an efficient adaptive ensemble by fine-tuning a trainable combination layer. In this way, we outperformed the state of the art by an average of 0.5% in accuracy on several major benchmark datasets, with restrained complexity in terms of both the number of parameters (5-60 times fewer) and floating-point operations per second (FLOPS, 10-100 times fewer), fully embracing green AI.
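The two-stage recipe in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say what the combination layer takes as input, so the sketch assumes it is a single linear layer over the concatenated outputs of the frozen base models. The names `split_disjoint`, `CombinationLayer`, `ensemble_features`, and `fine_tune` are hypothetical, and the base models are plain Python callables standing in for the trained EfficientNet-b0 networks.

```python
import math
import random


def split_disjoint(samples, seed=0):
    """Shuffle the samples and split them into two non-overlapping halves."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_features(base_models, sample):
    """Concatenate the outputs of the frozen base models for one sample."""
    features = []
    for model in base_models:
        features.extend(model(sample))
    return features


class CombinationLayer:
    """Assumed form of the trainable layer: one linear map plus softmax."""

    def __init__(self, in_dim, n_classes, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + bias
                for row, bias in zip(self.w, self.b)]

    def predict(self, x):
        logits = self.forward(x)
        return logits.index(max(logits))

    def sgd_step(self, x, label, lr=0.1):
        """One cross-entropy SGD update; returns the loss before the update."""
        probs = softmax(self.forward(x))
        for c, p in enumerate(probs):
            grad = p - (1.0 if c == label else 0.0)
            for j, xj in enumerate(x):
                self.w[c][j] -= lr * grad * xj
            self.b[c] -= lr * grad
        return -math.log(probs[label])


def fine_tune(layer, base_models, data, epochs=30, lr=0.1):
    """Train only the combination layer; the base models are never updated."""
    losses = []
    for _ in range(epochs):
        total = 0.0
        for sample, label in data:
            x = ensemble_features(base_models, sample)
            total += layer.sgd_step(x, label, lr)
        losses.append(total / len(data))
    return losses
```

In a real setting each base model would first be trained on its own half from `split_disjoint`; only the small combination layer is trained afterwards, so the fine-tuning step adds few parameters on top of the two base networks.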

updated: Sat Aug 27 2022 09:06:43 GMT+0000 (UTC)

published: Wed Jun 15 2022 08:55:47 GMT+0000 (UTC)

arXiv
