Evaluating Systemic Error Detection Methods using Synthetic Images

Gregory Plumb; Nari Johnson; Ángel Alexander Cabrera; Marco Tulio Ribeiro; Ameet Talwalkar

We introduce SpotCheck, a framework for generating synthetic datasets to use for evaluating methods for discovering blindspots (i.e., systemic errors) in image classifiers. We use SpotCheck to run controlled studies of how various factors influence the performance of blindspot discovery methods. Our experiments reveal several shortcomings of existing methods, such as relatively poor performance in settings with multiple blindspots and sensitivity to hyperparameters. Further, we find that a method based on dimensionality reduction, PlaneSpot, is competitive with existing methods, which has promising implications for the development of interactive tools.

updated: Fri Jul 08 2022 19:02:50 GMT+0000 (UTC)

published: Fri Jul 08 2022 19:02:50 GMT+0000 (UTC)

arXiv
