Constraining Representations Yields Models That Know What They Don't Know

Joao Monteiro; Pau Rodriguez; Pierre-Andre Noel; Issam Laradji; David Vazquez

A well-known failure mode of neural networks is that they may confidently return erroneous predictions. Such unsafe behaviour is particularly frequent when the use case differs slightly from the training context, and/or in the presence of an adversary. This work presents a novel direction to address these issues in a broad, general manner: imposing class-aware constraints on a model's internal activation patterns. Specifically, we assign to each class a unique, fixed, randomly generated binary vector (hereafter called its class code) and train the model so that its cross-depth activation patterns predict the appropriate class code according to the input sample's class. The resulting predictors are dubbed Total Activation Classifiers (TAC). TACs may either be trained from scratch or used at negligible cost as a thin add-on on top of a frozen, pre-trained neural network. The distance between a TAC's activation pattern and the closest valid code acts as an additional confidence score, alongside that of the default, non-TAC prediction head. In the add-on case, the original neural network's inference head is completely unaffected (so its accuracy remains the same), but we now have the option to use TAC's own confidence and prediction when determining which course of action to take in a hypothetical production workflow. In particular, we show that TAC strictly improves the value derived from models allowed to reject/defer. We provide further empirical evidence that TAC works well on multiple types of architectures and data modalities and that it is at least as good as state-of-the-art alternative confidence scores derived from existing models.
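
The scoring idea in the abstract (random binary class codes, with the distance to the nearest code used as a confidence score for rejecting or deferring) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of L1 distance, the assumption that activation patterns are already pooled into a vector of values in [0, 1], and the rejection threshold are all assumptions made for illustration. The abstract does not specify how cross-depth activations are aggregated, so that step is left out.

```python
import numpy as np


def make_class_codes(num_classes, code_length, seed=0):
    """Draw one fixed random binary code per class (hypothetical helper).

    Codes are re-drawn until every class has a distinct code.
    """
    rng = np.random.default_rng(seed)
    while True:
        codes = rng.integers(0, 2, size=(num_classes, code_length))
        if len(np.unique(codes, axis=0)) == num_classes:
            return codes.astype(np.float64)


def tac_predict(activations, codes):
    """Return (predicted class, confidence) for each activation pattern.

    activations: array of shape (batch, code_length), assumed in [0, 1].
    codes:       array of shape (num_classes, code_length) with 0/1 entries.
    Confidence is the negative L1 distance to the closest code (an assumed
    metric), so a larger value means a closer match.
    """
    # Pairwise L1 distances, shape (batch, num_classes).
    dists = np.abs(activations[:, None, :] - codes[None, :, :]).sum(axis=-1)
    pred = dists.argmin(axis=1)
    confidence = -dists.min(axis=1)
    return pred, confidence


def predict_or_reject(activations, codes, threshold):
    """Return the predicted class, or -1 when confidence is below threshold."""
    pred, confidence = tac_predict(activations, codes)
    return np.where(confidence >= threshold, pred, -1)


if __name__ == "__main__":
    codes = make_class_codes(num_classes=10, code_length=64)
    # A pattern that matches class 3 exactly, and an uninformative pattern.
    patterns = np.vstack([codes[3], np.full(64, 0.5)])
    print(predict_or_reject(patterns, codes, threshold=-5.0))
```

In this toy setup, a pattern that coincides with a class code is accepted, while a pattern halfway between all codes is equally far from every code and is rejected. In practice the threshold would have to be tuned on held-out data.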

updated: Wed Apr 19 2023 10:56:42 GMT+0000 (UTC)

published: Tue Aug 30 2022 18:28:00 GMT+0000 (UTC)

arXiv
