Distilling BlackBox to Interpretable models for Efficient Transfer Learning

Shantanu Ghosh; Ke Yu; Kayhan Batmanghelich



Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Network (NN) models suffer even with a slight shift in input distribution (e.g., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of the NN to be approximately domain-invariant. However, interpretable models typically underperform their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a mixture of shallow interpretable models using human-understandable concepts. As each interpretable model covers a subset of the data, the mixture of interpretable models achieves performance comparable to that of the BB. Further, we use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, followed by fine-tuning the interpretable models in the target domain. We evaluate our model using a real-life large-scale chest X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.
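The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it assumes a single binary concept, a plain logistic-regression concept classifier, and a fixed confidence threshold. All names (`fit_logistic`, `pseudo_label`, `threshold`) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the last weight acts as the bias term.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_logistic(X, y, lr=0.5, steps=500):
    """Fit a binary logistic-regression concept classifier by gradient descent."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    return sigmoid(add_bias(X) @ w)

def pseudo_label(w, X_target, threshold=0.9):
    """Keep only target samples on which the classifier is confident,
    and assign them the predicted concept as a pseudo-label."""
    p = predict_proba(w, X_target)
    confident = (p >= threshold) | (p <= 1.0 - threshold)
    return X_target[confident], (p[confident] >= 0.5).astype(float)

# Toy data (assumption): the concept is separable along feature 0,
# and the target domain is shifted slightly, mimicking a scanner change.
rng = np.random.default_rng(0)
y_src = rng.integers(0, 2, size=200).astype(float)
X_src = rng.normal(size=(200, 2))
X_src[:, 0] += 3.0 * y_src - 1.5

y_tgt_hidden = rng.integers(0, 2, size=200).astype(float)  # never used for training
X_tgt = rng.normal(size=(200, 2))
X_tgt[:, 0] += 3.0 * y_tgt_hidden - 1.0

# 1) Train the concept classifier on labeled source data.
w_src = fit_logistic(X_src, y_src)
# 2) Pseudo-label the confident unlabeled target samples.
X_pl, y_pl = pseudo_label(w_src, X_tgt)
# 3) Retrain on source data plus pseudo-labeled target data.
w_tgt = fit_logistic(np.vstack([X_src, X_pl]), np.concatenate([y_src, y_pl]))
```

In this sketch, only the target samples whose predicted probability lies above `threshold` or below `1 - threshold` contribute pseudo-labels, so uncertain predictions do not feed back into training. The threshold value and the single retraining round are simplifying assumptions, not details taken from the paper.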

updated: Fri May 26 2023 23:23:48 GMT+0000 (UTC)

published: Fri May 26 2023 23:23:48 GMT+0000 (UTC)

arXiv
