Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods

Ashkan Khakzar; Soroosh Baselizadeh; Nassir Navab

Saliency methods interpret a neural network's prediction by showing how important each input element is for that prediction. A popular family of saliency methods utilizes gradient information. In this work, we empirically show that two approaches to handling gradient information, namely positive aggregation and positive propagation, break these methods. Although the resulting methods reflect visually salient information in the input, they no longer explain the model's prediction: the generated saliency maps are insensitive both to the predicted output and to randomization of the model parameters. Specifically, for methods that aggregate the gradients of a chosen layer, such as GradCAM++ and FullGrad, exclusively aggregating positive gradients is detrimental. We further support this by proposing several variants of aggregation methods with positive handling of gradient information. For methods that backpropagate gradient information, such as LRP, RectGrad, and Guided Backpropagation, we show the destructive effect of exclusively propagating positive gradient information.
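
To make the term "positive aggregation" concrete, the sketch below contrasts a Grad-CAM-style aggregation that keeps the sign of the gradients with a variant that clips them to their positive part before averaging. It is a simplified illustration only, not the exact GradCAM++ or FullGrad formulation and not the authors' code; the function name `grad_cam`, the `positive_only` flag, and the random arrays standing in for a layer's activations and gradients are all assumptions made for this example.

```python
import numpy as np

def grad_cam(activations, gradients, positive_only=False):
    """Aggregate per-channel gradients into one saliency map.

    activations, gradients: arrays of shape (K, H, W), where gradients holds
    the derivative of the class score w.r.t. each activation.
    positive_only is a hypothetical flag: if True, negative gradients are
    discarded before aggregation (the "positive aggregation" case).
    """
    g = np.maximum(gradients, 0.0) if positive_only else gradients
    weights = g.mean(axis=(1, 2))                     # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum, shape (H, W)
    return np.maximum(cam, 0.0), weights

# Random stand-ins for a real layer (assumed data, not from the paper).
rng = np.random.default_rng(0)
activations = rng.random((4, 3, 3))         # non-negative, as after a ReLU
gradients = rng.standard_normal((4, 3, 3))  # mixed signs

cam_signed, w_signed = grad_cam(activations, gradients)
cam_pos, w_pos = grad_cam(activations, gradients, positive_only=True)
```

With signed aggregation a channel weight can be negative, so a channel can count against the class. With positive-only aggregation every weight is non-negative, so with non-negative activations no channel is ever suppressed. This is one plausible intuition for why such maps can follow the activations rather than the predicted output, in line with the insensitivity the abstract reports.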

updated: Tue Dec 01 2020 09:38:54 GMT+0000 (UTC)

published: Tue Dec 01 2020 09:38:54 GMT+0000 (UTC)

arXiv
