Excitement Surfeited Turns to Errors: Deep Learning Testing Framework Based on Excitable Neurons

Haibo Jin; Ruoxi Chen; Haibin Zheng; Jinyin Chen; Yao Cheng; Yue Yu; Xianglong Liu

Despite their impressive capabilities and outstanding performance, deep neural networks (DNNs) have drawn increasing public concern about their security because of their frequent erroneous behaviors. It is therefore necessary to test DNNs systematically before they are deployed in real-world applications. Existing testing methods provide fine-grained metrics based on neuron coverage and propose various approaches to improve such metrics. However, it has gradually been realized that higher neuron coverage does not necessarily mean a better ability to identify the defects that lead to errors. Moreover, coverage-guided methods cannot uncover errors caused by faulty training procedures, so the robustness gained by retraining DNNs on their testing examples is unsatisfactory. To address this challenge, we introduce the concept of excitable neurons, based on the Shapley value, and design a novel white-box testing framework for DNNs, namely DeepSensor. It is motivated by our observation that neurons bearing larger responsibility for changes in model loss under small perturbations are more likely to be related to incorrect corner cases caused by potential defects. By maximizing the number of excitable neurons concerning various wrong behaviors of models, DeepSensor can generate testing examples that effectively trigger more errors due to adversarial inputs, polluted data, and incomplete training. Extensive experiments on both image classification models and speaker recognition models demonstrate the superiority of DeepSensor.
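The abstract does not spell out how a neuron's responsibility for a loss change is computed, so the following is only a minimal sketch of one way a Shapley-style attribution over hidden neurons could look, not DeepSensor's actual procedure. It assumes a toy one-hidden-layer network, a coalition value defined as the loss change when only the chosen neurons take their perturbed activations, permutation sampling as the estimator, and a hypothetical `threshold` for calling a neuron "excitable". All function names (`relu_layer`, `loss`, `neuron_shapley`, `count_excitable`) are illustrative and do not come from the paper.

```python
import math
import random

def relu_layer(x, W1):
    """Hidden activations h_j = max(0, sum_i x_i * W1[i][j])."""
    return [max(0.0, sum(xi * row[j] for xi, row in zip(x, W1)))
            for j in range(len(W1[0]))]

def loss(h, W2, y):
    """Cross-entropy of a linear softmax readout on hidden activations h."""
    logits = [sum(hj * row[k] for hj, row in zip(h, W2))
              for k in range(len(W2[0]))]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[y]

def neuron_shapley(h_clean, h_pert, W2, y, n_perm=200, seed=0):
    """Permutation-sampling estimate of each neuron's Shapley value.

    Assumed coalition value: v(S) = loss when neurons in S take their
    perturbed activations and the rest keep clean ones, minus the clean loss.
    """
    rng = random.Random(seed)
    n = len(h_clean)
    phi = [0.0] * n
    base = loss(h_clean, W2, y)
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        h = list(h_clean)
        prev = base
        for j in order:
            h[j] = h_pert[j]           # neuron j joins the coalition
            cur = loss(h, W2, y)
            phi[j] += (cur - prev) / n_perm  # its marginal contribution
            prev = cur
    return phi

def count_excitable(phi, threshold):
    """Count neurons whose attribution exceeds a (hypothetical) threshold."""
    return sum(1 for p in phi if p > threshold)

# Toy demo with random weights and a small input perturbation.
rng = random.Random(1)
W1 = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(4)]
W2 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(6)]
x = [rng.gauss(0, 1) for _ in range(4)]
x_pert = [xi + 0.05 * rng.gauss(0, 1) for xi in x]
h_clean, h_pert = relu_layer(x, W1), relu_layer(x_pert, W1)
phi = neuron_shapley(h_clean, h_pert, W2, y=0)
print(phi, count_excitable(phi, threshold=0.0))
```

Under this definition the attributions sum to the total loss change between the clean and perturbed inputs (the Shapley efficiency property), which gives a quick sanity check on the estimator. A test generator in the spirit of the abstract would then search for perturbations that increase the excitable-neuron count; that search is not sketched here.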

updated: Sun Nov 20 2022 14:50:45 GMT+0000 (UTC)

published: Sat Feb 12 2022 16:44:15 GMT+0000 (UTC)

arXiv
