FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity

Shiwei Liu; Tianlong Chen; Zahra Atashgahi; Xiaohan Chen; Ghada Sokar; Elena Mocanu; Mykola Pechenizkiy; Zhangyang Wang; Decebal Constantin Mocanu

Recent work on sparse neural networks has demonstrated that a sparse network can be trained in isolation to match the performance of the corresponding dense network with a fraction of the parameters. However, identifying these performant sparse neural networks (winning tickets) involves either a costly iterative train-prune-retrain process (e.g., the Lottery Ticket Hypothesis) or an over-extended sparse training time (e.g., Training with Dynamic Sparsity), both of which raise financial and environmental concerns. In this work, we attempt to address this cost problem by introducing the FreeTickets concept, the first solution that can boost the performance of sparse convolutional neural networks over their dense equivalents by a large margin while using only a fraction of the computational resources the latter require for complete training. Concretely, we instantiate the FreeTickets concept by proposing two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process. Combining these free tickets into an ensemble yields a significant improvement in accuracy, uncertainty estimation, robustness, and efficiency over the corresponding dense (ensemble) networks. Our results provide new insights into the strength of sparse neural networks and suggest that the benefits of sparsity go well beyond the usual expected training/inference efficiency. We will release all code at https://github.com/Shiweiliuiiiiiii/FreeTickets.
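
As a rough illustration of the ensembling idea in the abstract, the sketch below averages the softmax outputs of several sparse "tickets". It assumes, hypothetically, that each ticket is a linear classifier with a binary mask over its weights and that the ensemble prediction is the mean of the members' class probabilities. The abstract does not state the combination rule or the model form, and the names `ticket_logits` and `ensemble_predict` are illustrative rather than taken from the FreeTickets code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ticket_logits(weights, mask, x):
    """Logits of one sparse 'ticket': weights are zeroed where mask is 0."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]

def ensemble_predict(tickets, x):
    """Average the members' softmax probabilities (assumed combination rule)."""
    probs = [softmax(ticket_logits(w, m, x)) for w, m in tickets]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

# Two toy tickets (2 classes, 3 inputs); each keeps one weight per row.
tickets = [
    ([[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]], [[1, 0, 0], [0, 1, 0]]),
    ([[0.0, 1.0, -2.0], [2.0, 0.0, 1.0]], [[0, 1, 0], [0, 0, 1]]),
]
avg = ensemble_predict(tickets, [1.0, 1.0, 1.0])
```

Here the two tickets use different masks, so they attend to different inputs; averaging their probabilities is one simple way such diverse members could be combined into a single prediction.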

updated: Mon Jun 28 2021 10:48:20 GMT+0000 (UTC)

published: Mon Jun 28 2021 10:48:20 GMT+0000 (UTC)

arXiv
