How Much Can We Really Trust You? Towards Simple, Interpretable Trust Quantification Metrics for Deep Neural Networks

Alexander Wong; Xiao Yu Wang; Andrew Hryniowski

A critical step in building trustworthy deep neural networks is trust quantification, where we ask the question: How much can we trust a deep neural network? In this study, we take a step towards simple, interpretable metrics for trust quantification by introducing a suite of metrics for assessing the overall trustworthiness of deep neural networks based on their behaviour when answering a set of questions. We conduct a thought experiment and explore two key questions about trust in relation to confidence: 1) How much trust do we have in actors who give wrong answers with great confidence? and 2) How much trust do we have in actors who give right answers hesitantly? Based on the insights gained, we introduce the concept of question-answer trust to quantify the trustworthiness of an individual answer based on confident behaviour under correct and incorrect answer scenarios, and the concept of trust density to characterize the distribution of overall trust for an individual answer scenario. We further introduce the concept of trust spectrum for representing overall trust with respect to the spectrum of possible answer scenarios across correctly and incorrectly answered questions. Finally, we introduce NetTrustScore, a scalar metric summarizing overall trustworthiness. The suite of metrics aligns with past social psychology research on the relationship between trust and confidence. Leveraging these metrics, we quantify the trustworthiness of several well-known deep neural network architectures for image recognition to better understand where trust breaks down. The proposed metrics are by no means perfect, but the hope is to push the conversation towards better metrics to help guide practitioners and regulators in producing, deploying, and certifying deep learning solutions that can be trusted to operate in real-world, mission-critical scenarios.
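
The abstract names the four metrics without giving their formulas, so the following Python sketch is only an illustration of how they could fit together. It assumes that question-answer trust rewards confidence on correct answers (confidence raised to a power alpha) and penalizes confidence on wrong answers (one minus confidence, raised to a power beta), that trust density is the distribution of those values within one answer scenario, that the trust spectrum is the mean trust per scenario, and that NetTrustScore is the scenario-frequency-weighted average of the spectrum. All function names, the alpha and beta parameters, and the sample data are hypothetical and are not taken from this listing.

```python
from collections import defaultdict


def question_answer_trust(confidence, correct, alpha=1.0, beta=1.0):
    """Trust in a single answer (assumed form, not the authors' definition).

    Correct answers earn more trust the more confident they are;
    wrong answers earn less trust the more confident they are.
    """
    if correct:
        return confidence ** alpha
    return (1.0 - confidence) ** beta


def trust_by_scenario(samples, alpha=1.0, beta=1.0):
    """Group per-answer trust values by answer scenario (the true label).

    `samples` is an iterable of (true_label, predicted_label, confidence).
    The list of values for one scenario is the raw material for that
    scenario's trust density (for example, via a histogram).
    """
    grouped = defaultdict(list)
    for true_label, predicted_label, confidence in samples:
        correct = predicted_label == true_label
        grouped[true_label].append(
            question_answer_trust(confidence, correct, alpha, beta)
        )
    return dict(grouped)


def trust_spectrum(samples, alpha=1.0, beta=1.0):
    """Mean trust for each answer scenario."""
    grouped = trust_by_scenario(samples, alpha, beta)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


def net_trust_score(samples, alpha=1.0, beta=1.0):
    """Scalar summary: spectrum values weighted by scenario frequency."""
    grouped = trust_by_scenario(samples, alpha, beta)
    total = sum(len(vals) for vals in grouped.values())
    return sum(
        (len(vals) / total) * (sum(vals) / len(vals))
        for vals in grouped.values()
    )


if __name__ == "__main__":
    # Hypothetical predictions: (true label, predicted label, confidence).
    demo = [
        ("cat", "cat", 0.9),   # confident and right: high trust
        ("cat", "dog", 0.8),   # confident and wrong: low trust
        ("dog", "dog", 0.6),   # hesitant and right: moderate trust
    ]
    print(trust_spectrum(demo))
    print(net_trust_score(demo))
```

Under these assumptions, the two questions from the thought experiment map directly onto the two branches of `question_answer_trust`: a confidently wrong answer is penalized heavily, and a hesitantly correct answer is rewarded only modestly.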

updated: Sat Apr 03 2021 15:08:50 GMT+0000 (UTC)

published: Sat Sep 12 2020 17:37:36 GMT+0000 (UTC)

arXiv
