LLM2Loss: Leveraging Language Models for Explainable Model Diagnostics

Shervin Ardeshir

Trained on vast amounts of data, large language models (LLMs) have achieved unprecedented success and generalization in modeling fairly complex textual inputs in an abstract space, making them powerful tools for zero-shot learning. This capability extends to other modalities, such as the visual domain, through cross-modal foundation models such as CLIP; as a result, semantically meaningful representations can be extracted from visual inputs. In this work, we leverage this capability and propose an approach that provides semantic insights into a model's patterns of failure and bias. Given a black-box model, its training data, and its task definition, we first calculate the task-related loss for each data point. We then extract a semantically meaningful representation of each training data point (such as CLIP embeddings from its visual encoder) and train a lightweight diagnosis model that maps this representation to the data point's task loss. We show that an ensemble of such lightweight models can generate insights into the performance of the black-box model by identifying its patterns of failure and bias.
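The pipeline described in the abstract (per-sample loss, semantic embedding, lightweight diagnosis model, ensemble) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the diagnosis model's form, so ridge regression and a bootstrap ensemble are assumptions made here, the function names are hypothetical, and random vectors stand in for real CLIP embeddings and real per-sample losses.

```python
import numpy as np


def fit_ridge(X, y, l2=1e-2):
    """Closed-form ridge regression with a bias column; returns the weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict_ridge(w, X):
    """Predict losses for embeddings X using ridge weights w."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def fit_diagnosis_ensemble(embeddings, losses, n_models=5, seed=0):
    """Fit several lightweight embedding-to-loss models on bootstrap resamples.

    Assumption: bootstrapping is one simple way to build an ensemble; the
    abstract only says that an ensemble of lightweight models is used.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample with replacement
        models.append(fit_ridge(embeddings[idx], losses[idx]))
    return models


def predict_loss(models, embeddings):
    """Return the ensemble mean prediction and the spread across members."""
    preds = np.stack([predict_ridge(w, embeddings) for w in models])
    return preds.mean(axis=0), preds.std(axis=0)


# Synthetic stand-ins (assumptions): in practice `embeddings` would come from
# a CLIP visual encoder and `losses` from the black-box model's task loss.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(500, 16))
# Pretend the black-box model fails more along embedding dimension 0.
raw = 1.0 + 0.3 * embeddings[:, 0] + 0.05 * rng.normal(size=500)
losses = np.clip(raw, 0.0, None)  # keep the synthetic losses non-negative

models = fit_diagnosis_ensemble(embeddings, losses)
mean_pred, spread = predict_loss(models, embeddings)

# How well the diagnosis ensemble tracks the per-sample loss.
corr = np.corrcoef(mean_pred, losses)[0, 1]
# Which embedding dimension the ensemble associates most with high loss.
mean_weights = np.mean(models, axis=0)[:-1]  # drop the bias term
top_dim = int(np.argmax(np.abs(mean_weights)))
print(f"correlation={corr:.3f}, most loss-predictive dimension={top_dim}")
```

In a real setting, the directions in embedding space that the diagnosis models associate with high loss could then be compared against text embeddings of candidate descriptions to name a failure pattern; that step is omitted here because the abstract does not detail it.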

updated: Thu May 04 2023 23:54:37 GMT+0000 (UTC)

published: Thu May 04 2023 23:54:37 GMT+0000 (UTC)

arXiv
