Distribution-Free, Risk-Controlling Prediction Sets

Stephen Bates; Anastasios Angelopoulos; Lihua Lei; Jitendra Malik; Michael I. Jordan

To communicate instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions for black-box predictors that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it in five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Lastly, we discuss extensions to uncertainty quantification for ranking, metric learning and distributionally robust learning.
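
As an illustration of the holdout calibration step described above, here is a minimal sketch in Python. It is not taken from the abstract, and everything specific in it is an assumption made for the example: the function names `miscoverage_losses` and `calibrate_threshold` are hypothetical, the prediction sets are taken to be all classes whose predicted probability is at least 1 - lambda, the loss is 0/1 miscoverage, and a Hoeffding upper confidence bound stands in for whichever bound one prefers. The sketch also assumes the guarantee takes the form "risk at most alpha with probability at least 1 - delta over the holdout draw", and that the loss is bounded in [0, 1] and does not increase as the sets grow.

```python
import numpy as np


def miscoverage_losses(probs, labels, lambdas):
    """Hypothetical helper: 0/1 loss of each holdout example at each lambda.

    The set for input x is assumed to be {k : probs[x, k] >= 1 - lambda},
    so a larger lambda gives a larger set. The loss is 1 when the true
    label is left out of the set, and 0 otherwise.
    """
    true_p = probs[np.arange(len(labels)), labels]
    return (true_p[:, None] < 1.0 - lambdas[None, :]).astype(float)


def calibrate_threshold(losses, lambdas, alpha, delta):
    """Pick the smallest lambda whose risk bound stays below alpha.

    losses  : (n, m) array, losses[i, j] in [0, 1], nonincreasing in j
    lambdas : (m,) array sorted in ascending order
    alpha   : target risk level
    delta   : allowed failure probability of the bound

    Returns None when even the largest lambda fails the check.
    """
    n = losses.shape[0]
    # Hoeffding upper confidence bound on the risk at each lambda
    # (an assumed choice; tighter bounds could be substituted).
    ucb = losses.mean(axis=0) + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

    chosen = None
    # Scan from the largest sets downward and stop at the first violation,
    # so that every lambda at or above the chosen one passes the check.
    for j in range(len(lambdas) - 1, -1, -1):
        if ucb[j] >= alpha:
            break
        chosen = lambdas[j]
    return chosen
```

Under these assumptions, a caller would compute `losses` on the holdout set from the black-box predictor's outputs, call `calibrate_threshold`, and then use the returned value to form the prediction set for each new test point. The predictor itself is never retrained, which is what makes the procedure applicable to black-box models.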

updated: Thu Jan 07 2021 18:59:33 GMT+0000 (UTC)

published: Thu Jan 07 2021 18:59:33 GMT+0000 (UTC)

arXiv
