EGFR Mutation Prediction of Lung Biopsy Images using Deep Learning

Ravi Kant Gupta; Shivani Nandgaonkar; Nikhil Cherian Kurian; Swapnil Rane; Amit Sethi

The standard diagnostic procedures for targeted therapies in lung cancer treatment involve histological subtyping and subsequent detection of key driver mutations, such as EGFR. Even though molecular profiling can uncover the driver mutation, the process is often expensive and time-consuming. Deep learning-based image analysis offers a more economical alternative for discovering driver mutations directly from whole slide images (WSIs). In this work, we used customized deep learning pipelines with weak supervision to identify the morphological correlates of EGFR mutation from hematoxylin and eosin-stained WSIs, in addition to detecting tumor and histologically subtyping it. We demonstrate the effectiveness of our pipeline by conducting rigorous experiments and ablation studies on two lung cancer datasets: TCGA and a private dataset from India. With our pipeline, we achieved an average area under the curve (AUC) of 0.964 for tumor detection, and 0.942 for histological subtyping between adenocarcinoma and squamous cell carcinoma on the TCGA dataset. For EGFR detection, we achieved an average AUC of 0.864 on the TCGA dataset and 0.783 on the dataset from India. Our key learning points include the following. Firstly, there is no particular advantage to using feature extractor layers trained on histology if the feature extractor is going to be fine-tuned on the target dataset. Secondly, selecting patches with high cellularity, presumably capturing tumor regions, is not always helpful, as signs of a disease class may be present in the tumor-adjacent stroma.
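
For readers unfamiliar with the metric, the following is a minimal sketch of how an average AUC over slide-level predictions could be computed. It assumes the "average" is taken over cross-validation folds or repeated runs, which the abstract does not specify. The function names `roc_auc` and `mean_auc` and the toy scores are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive slide
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(folds):
    """Average the per-fold AUC; each fold is a (labels, scores) pair."""
    return mean(roc_auc(labels, scores) for labels, scores in folds)


# Toy slide-level predictions (made up for illustration only).
# Label 1 = mutation present, 0 = absent; scores are model outputs.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([0, 1, 1, 0], [0.1, 0.8, 0.4, 0.3]),
]

if __name__ == "__main__":
    print(mean_auc(toy_folds))
```

The pairwise formulation is quadratic in the number of slides, which is acceptable for slide-level evaluation; a rank-based implementation would be preferable for very large cohorts.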

updated: Thu Sep 08 2022 12:02:47 GMT+0000 (UTC)

published: Fri Aug 26 2022 08:56:33 GMT+0000 (UTC)

arXiv
