Likelihood Annealing: Fast Calibrated Uncertainty for Regression

Uddeshya Upadhyay; Jae Myung Kim; Cordelia Schmidt; Bernhard Schölkopf; Zeynep Akata

Recent advances in deep learning have shown that uncertainty estimation is becoming increasingly important in applications such as medical imaging, natural language processing, and autonomous systems. However, accurately quantifying uncertainty remains a challenging problem, especially in regression tasks where the output space is continuous. Deep learning approaches that allow uncertainty estimation for regression problems often converge slowly and yield poorly calibrated uncertainty estimates that cannot be used effectively for quantification. Recently proposed post hoc calibration techniques are seldom applicable to regression problems and often add overhead to an already slow model training phase. This work presents a fast calibrated uncertainty estimation method for regression tasks, called Likelihood Annealing, which consistently improves the convergence of deep regression models and yields calibrated uncertainty without any post hoc calibration phase. Unlike previous methods for calibrated uncertainty in regression, which focus only on low-dimensional regression problems, our method works well on a broad spectrum of regression problems, including high-dimensional regression. Our empirical analysis shows that our approach generalizes to various network architectures, including multilayer perceptrons, 1D/2D convolutional networks, and graph neural networks, on five vastly diverse tasks, among them chaotic particle trajectory denoising, physical property prediction of molecules using 3D atomistic representation, natural image super-resolution, and medical image translation using MRI.

updated: Sun Jul 02 2023 13:01:05 GMT+0000 (UTC)

published: Tue Feb 21 2023 21:24:35 GMT+0000 (UTC)

arXiv
