FCB-SwinV2 Transformer for Polyp Segmentation

Kerr Fitzgerald; Bogdan Matuszewski

Polyp segmentation within colonoscopy video frames using deep learning models has the potential to automate the workflow of clinicians. This could help improve the early detection rate and characterization of polyps that could progress to colorectal cancer. Recent state-of-the-art deep learning polyp segmentation models combine the outputs of a Fully Convolutional Network architecture and a Transformer Network architecture that work in parallel. In this paper we propose modifications to the current state-of-the-art polyp segmentation model, FCBFormer. The transformer architecture of FCBFormer is replaced with a SwinV2 Transformer-UNET, and minor changes are made to the Fully Convolutional Network architecture, to create the FCB-SwinV2 Transformer. The performance of the FCB-SwinV2 Transformer is evaluated on the popular colonoscopy segmentation benchmarking datasets Kvasir-SEG and CVC-ClinicDB. Generalizability tests are also conducted. The FCB-SwinV2 Transformer consistently achieves higher mDice scores across all tests conducted and therefore represents new state-of-the-art performance. Issues found with how colonoscopy segmentation model performance is evaluated in the literature are also reported and discussed. One of the most important issues identified is that, when evaluating performance on the CVC-ClinicDB dataset, it would be preferable to ensure that no data leakage from video sequences occurs during the training/validation/test data partition.
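The two evaluation points raised in the abstract, the mDice score and leakage-free partitioning by video sequence, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 80/10/10 split fractions, and the `frame_to_sequence` mapping (frame identifier to video sequence identifier) are assumptions made for illustration; the mapping would have to come from the dataset's own metadata. mDice is taken here to be the Dice coefficient computed per image and averaged over the test set.

```python
import random
from collections import defaultdict


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two flattened binary masks (sequences of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    # eps avoids division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def mean_dice(preds, targets):
    """mDice: the per-image Dice score averaged over all images."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


def split_by_sequence(frame_to_sequence, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole video sequences to train/val/test.

    frame_to_sequence: hypothetical dict {frame_id: sequence_id}.
    fractions: assumed train/val fractions, applied to the number of
    sequences (not frames); the remaining sequences go to the test set.
    """
    by_sequence = defaultdict(list)
    for frame, seq in frame_to_sequence.items():
        by_sequence[seq].append(frame)

    # Shuffle sequence ids, not frames, so frames from one video stay together.
    sequences = sorted(by_sequence)
    random.Random(seed).shuffle(sequences)

    n_train = int(fractions[0] * len(sequences))
    n_val = int(fractions[1] * len(sequences))
    groups = {
        "train": sequences[:n_train],
        "val": sequences[n_train:n_train + n_val],
        "test": sequences[n_train + n_val:],
    }
    return {
        name: [frame for seq in seqs for frame in by_sequence[seq]]
        for name, seqs in groups.items()
    }
```

Because the shuffle operates on sequence identifiers, near-duplicate frames from one video cannot land on both sides of the train/test boundary, which is the leakage the abstract warns about. The trade-off is that the resulting frame counts per partition only approximate the requested fractions when sequences differ in length.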

updated: Thu Feb 02 2023 11:42:26 GMT+0000 (UTC)

published: Thu Feb 02 2023 11:42:26 GMT+0000 (UTC)

arXiv
