Robust Classification with Context-Sensitive Features

Peter D. Turney

This paper addresses the problem of classifying observations when features are context-sensitive, especially when the testing set involves a context that differs from that of the training set. It begins with a precise definition of the problem and then presents general strategies for enhancing the performance of classification algorithms on this type of problem. These strategies are tested on three domains. The first domain is the diagnosis of gas turbine engines. The problem is to diagnose a faulty engine in one context, such as warm weather, when the fault has previously been seen only in another context, such as cold weather. The second domain is speech recognition. The context is given by the identity of the speaker. The problem is to recognize words spoken by a new speaker who is not represented in the training set. The third domain is medical prognosis. The problem is to predict whether a patient with hepatitis will live or die. The context is the age of the patient. For all three domains, exploiting context results in substantially more accurate classification.

updated: Thu Dec 12 2002 19:26:52 GMT+0000 (UTC)

published: Thu Dec 12 2002 19:26:52 GMT+0000 (UTC)

arXiv

