Similarity of Neural Architectures Based on Input Gradient Transferability

Jaehui Hwang; Dongyoon Han; Byeongho Heo; Song Park; Sanghyuk Chun; Jong-Seok Lee

In recent years, a large number of deep neural architectures have been developed for image classification. It remains unclear whether these models are similar or different, and what factors contribute to their similarities or differences. To address this question, we aim to design a quantitative and scalable similarity function between neural architectures. We utilize adversarial attack transferability, which carries information about input gradients and decision boundaries, both widely used to understand model behavior. Using our proposed similarity function, we conduct a large-scale analysis of 69 state-of-the-art ImageNet classifiers to answer this question. Moreover, using model similarity, we observe architecture-related phenomena: model diversity can lead to better performance in model ensembles and knowledge distillation under specific conditions. Our results provide insights into why developing diverse neural architectures with distinct components is necessary.
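
The idea of scoring similarity through attack transferability can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below is an assumption rather than the authors' definition: it measures how often adversarial examples crafted against one model also fool another, and averages the two directions to obtain a symmetric score. The names `transfer_rate` and `symmetric_transfer_similarity` are hypothetical, and the predictions are toy values, not results from the paper.

```python
from typing import Sequence


def transfer_rate(
    labels: Sequence[int],
    clean_preds: Sequence[int],
    adv_preds: Sequence[int],
) -> float:
    """Fraction of samples the target model classifies correctly on clean
    inputs but misclassifies on adversarial inputs crafted against another
    (source) model. Samples the target already gets wrong are skipped."""
    eligible = 0
    fooled = 0
    for y, clean, adv in zip(labels, clean_preds, adv_preds):
        if clean == y:
            eligible += 1
            if adv != y:
                fooled += 1
    return fooled / eligible if eligible else 0.0


def symmetric_transfer_similarity(rate_a_to_b: float, rate_b_to_a: float) -> float:
    """Average the two transfer directions (an assumed aggregation, chosen
    only so that the score does not depend on model order)."""
    return 0.5 * (rate_a_to_b + rate_b_to_a)


# Toy predictions (hypothetical values for illustration only).
labels = [0, 1, 2, 3]

# Model B on clean inputs, and on adversarial inputs crafted against model A.
b_clean = [0, 1, 2, 0]
b_adv_from_a = [1, 1, 0, 0]

# Model A on clean inputs, and on adversarial inputs crafted against model B.
a_clean = [0, 1, 2, 3]
a_adv_from_b = [0, 1, 2, 0]

rate_a_to_b = transfer_rate(labels, b_clean, b_adv_from_a)
rate_b_to_a = transfer_rate(labels, a_clean, a_adv_from_b)
similarity = symmetric_transfer_similarity(rate_a_to_b, rate_b_to_a)
```

Under this sketch, a higher score means that attacks transfer more readily between the two models, which is read as the models having more similar input gradients and decision boundaries. Generating the adversarial examples themselves (for instance with a gradient-based attack) is left out here.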

updated: Wed Mar 15 2023 08:06:48 GMT+0000 (UTC)

published: Thu Oct 20 2022 16:56:47 GMT+0000 (UTC)

arXiv
