Distance Metric-Based Learning with Interpolated Latent Features for Location Classification in Endoscopy Image and Video

Mohammad Reza Mohebbian; Khan A. Wahid; Anh Dinh; Paul Babyn

Conventional Endoscopy (CE) and Wireless Capsule Endoscopy (WCE) are established tools for diagnosing gastrointestinal (GI) tract disorders. Detecting the anatomical location within the GI tract can help clinicians determine a more appropriate treatment plan, can reduce repeated endoscopy, and is important in drug delivery. Few studies address detecting the anatomical location of WCE and CE images using classification, mainly because collecting and annotating the data is difficult. In this study, we present a few-shot learning method based on distance metric learning that combines transfer learning with a manifold mixup scheme for localizing endoscopy frames and can be trained on few samples. The manifold mixup process improves few-shot learning by increasing the number of training epochs while reducing overfitting, as well as providing more accurate decision boundaries. A dataset is collected from 10 different anatomical positions of the human GI tract. Two models were trained using only 78 CE and 27 WCE annotated frames to predict the location of 25700 and 1825 video frames from CE and WCE, respectively. In addition, we performed a subjective evaluation with nine gastroenterologists to show the necessity of having an AI system for localization. Various ablation studies and interpretations are performed to show the importance of each step, such as the effect of the transfer-learning approach and the impact of manifold mixup on performance. The proposed method is also compared with various methods trained on categorical cross-entropy loss and produced better results, which shows that the proposed method has potential to be used for endoscopy image classification.
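The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: interpolating latent features (manifold mixup) and classifying by distance in the latent space. Several parts are assumptions made for illustration and are not taken from the paper: the function names, the Beta(alpha, alpha) sampling of the mixing weight, the class labels, and the nearest-class-mean rule, which stands in here for whatever learned distance metric the authors actually use.

```python
import math
import random


def sample_lambda(alpha, rng):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha) (assumed choice)."""
    return rng.betavariate(alpha, alpha)


def mixup_latent(z_a, z_b, y_a, y_b, lam):
    """Interpolate two latent vectors and their one-hot labels with weight lam."""
    z = [lam * a + (1.0 - lam) * b for a, b in zip(z_a, z_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return z, y


def class_prototypes(features, labels):
    """Compute the mean latent vector of each class (illustrative stand-in
    for a learned metric; not the paper's stated procedure)."""
    sums, counts = {}, {}
    for z, c in zip(features, labels):
        acc = sums.setdefault(c, [0.0] * len(z))
        for i, v in enumerate(z):
            acc[i] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict(z, prototypes):
    """Return the class whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda c: math.dist(z, prototypes[c]))


# Hypothetical 2-D latent features for two anatomical classes.
rng = random.Random(0)
lam = sample_lambda(2.0, rng)
mixed_z, mixed_y = mixup_latent([0.0, 0.0], [2.0, 4.0], [1.0, 0.0], [0.0, 1.0], lam)
protos = class_prototypes(
    [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0]],
    ["stomach", "stomach", "colon"],
)
print(predict([1.0, 1.0], protos))
```

In this sketch the mixed label stays a valid probability vector because both inputs are one-hot and the weights sum to one, which is what lets interpolated samples act as extra training points between classes.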

updated: Thu Aug 19 2021 19:08:55 GMT+0000 (UTC)

published: Mon Mar 15 2021 16:24:30 GMT+0000 (UTC)

arXiv
