ESFPNet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video

Qi Chang; Danish Ahmad; Jennifer Toth; Rebecca Bascom; William E. Higgins


Lung cancer tends to be detected at an advanced stage, resulting in a high patient mortality rate. Thus, much recent research has focused on early disease detection. Bronchoscopy is the procedure of choice for effective, noninvasive detection of early manifestations (bronchial lesions) of lung cancer. In particular, autofluorescence bronchoscopy (AFB) distinguishes normal tissue from diseased tissue by their autofluorescence properties, which appear as different colors (green for normal, reddish brown for diseased). Because recent studies show that AFB is highly sensitive in finding lesions, it has become a potentially pivotal method in bronchoscopic airway exams. Unfortunately, manual inspection of AFB video is extremely tedious and error prone, and only limited effort has gone toward potentially more robust automatic AFB lesion analysis. We propose a real-time (processing throughput of 27 frames/sec) deep-learning architecture dubbed ESFPNet for accurate segmentation and robust detection of bronchial lesions in AFB video streams. The architecture combines an encoder structure built on pretrained Mix Transformer (MiT) encoders with an efficient stage-wise feature pyramid (ESFP) decoder structure. Segmentation results from the AFB airway-exam videos of 20 lung cancer patients indicate that our approach gives a mean Dice index = 0.756 and an average Intersection over Union (IoU) = 0.624, results that are superior to those generated by other recent architectures. Thus, ESFPNet gives the physician a potential tool for confident real-time lesion segmentation and detection during a live bronchoscopic airway exam. Moreover, our model shows promising potential applicability to other domains, as evidenced by its state-of-the-art (SOTA) performance on the CVC-ClinicDB and ETIS-LaribPolypDB datasets and its superior performance on the Kvasir and CVC-ColonDB datasets.
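
For readers unfamiliar with the two segmentation scores quoted above, the following is a minimal sketch of how a Dice index and an IoU are commonly computed for one pair of binary masks. It is not taken from the ESFPNet code: the function name `dice_and_iou`, the nested-list mask format, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equally sized binary masks.

    Masks are nested lists of 0/1 values (a hypothetical format chosen
    to keep this sketch dependency-free).
    """
    # Flatten both masks into lists of booleans.
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a and b for a, b in zip(p, t))
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - intersection

    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0

    dice = 2 * intersection / (p_sum + t_sum)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")  # Dice = 0.667, IoU = 0.500
```

For a single mask pair the two scores are linked by Dice = 2·IoU / (1 + IoU). Dataset-level figures such as those in the abstract are averages over many frames, so they need not satisfy that identity exactly.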

updated: Fri Dec 09 2022 04:21:41 GMT+0000 (UTC)

published: Fri Jul 15 2022 21:21:26 GMT+0000 (UTC)

arXiv

