ESFPNet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video

Qi Chang; Danish Ahmad; Jennifer Toth; Rebecca Bascom; William E. Higgins

Lung cancer tends to be detected at an advanced stage, resulting in a high patient mortality rate. Thus, recent research has focused on early disease detection. Lung cancer generally first appears as lesions developing within the bronchial epithelium of the airway walls. Bronchoscopy is the procedure of choice for effective noninvasive bronchial lesion detection. In particular, autofluorescence bronchoscopy (AFB) discriminates between the autofluorescence properties of normal and diseased tissue: lesions appear reddish brown in AFB video frames, while normal tissue appears green. Because recent studies show that AFB offers high lesion sensitivity, it has become a potentially pivotal method during the standard bronchoscopic airway exam for early-stage lung cancer detection. Unfortunately, manual inspection of AFB video is extremely tedious and error-prone, and limited effort has been expended toward potentially more robust automatic AFB lesion detection and segmentation. We propose a real-time deep learning architecture, ESFPNet, for robust detection and segmentation of bronchial lesions from an AFB video stream. The architecture features an encoder structure that exploits pretrained Mix Transformer (MiT) encoders and a stage-wise feature pyramid (ESFP) decoder structure. Results from AFB videos derived from lung cancer patient airway exams indicate that our approach gives mean Dice index and IoU values of 0.782 and 0.658, respectively, at a processing throughput of 27 frames/sec. These values are superior to results achieved by other competing architectures that use Mix Transformer or CNN-based encoders. Moreover, the architecture's superior performance on the ETIS-LaribPolypDB dataset demonstrates its potential applicability to other domains.
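
For readers unfamiliar with the two segmentation metrics quoted above, the following is a minimal sketch of how the Dice index and IoU are conventionally computed for one pair of binary masks. It is not taken from the ESFPNet code; the function name `dice_and_iou`, the flat 0/1 list representation of a mask, and the convention of scoring two empty masks as perfect agreement are all assumptions made for illustration.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks.

    Both masks are flat, equal-length sequences of 0/1 pixel labels
    (an illustrative representation, not the paper's data format).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labeled as lesion in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection

    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy example: an 8-pixel frame with a 3-pixel prediction and a
# 3-pixel ground-truth lesion that overlap in 2 pixels.
pred = [0, 1, 1, 1, 0, 0, 0, 0]
truth = [0, 0, 1, 1, 1, 0, 0, 0]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

The reported figures are means over test frames, so they need not satisfy the single-image identity IoU = Dice / (2 - Dice) exactly. For context on the throughput figure, 27 frames/sec corresponds to a budget of roughly 37 ms per frame.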

updated: Thu Aug 25 2022 20:28:21 GMT+0000 (UTC)

published: Fri Jul 15 2022 21:21:26 GMT+0000 (UTC)

arXiv
