DeepPyram: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

Negin Ghamsarian; Mario Taschwer; Klaus Schoeffmann

Semantic segmentation in cataract surgery has a wide range of applications that contribute to improved surgical outcomes and reduced clinical risk. However, the different relevant instances pose different segmentation challenges, which makes designing a single network quite difficult. This paper proposes a semantic segmentation network termed DeepPyram that achieves superior performance in segmenting relevant objects in cataract surgery videos despite these varying challenges. This superiority mainly originates from three modules: (i) Pyramid View Fusion, which provides a varying-angle global view of the region surrounding each pixel position in the input convolutional feature map; (ii) Deformable Pyramid Reception, which enables a wide deformable receptive field that can adapt to geometric transformations of the object of interest; and (iii) Pyramid Loss, which adaptively supervises multi-scale semantic feature maps. These modules effectively boost semantic segmentation performance, especially for objects that are transparent, deformable, variable in scale, or blunt-edged. The proposed approach is evaluated on four cataract surgery datasets covering objects with different contextual features and is compared with thirteen state-of-the-art segmentation networks. The experimental results confirm that DeepPyram outperforms the rival approaches without imposing additional trainable parameters. A comprehensive ablation study further demonstrates the effectiveness of the proposed modules.
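
The abstract does not spell out how Pyramid Loss is computed, so the following is only a minimal sketch of the general idea of supervising predictions at several scales. It is not the authors' formulation. The function names (`avg_pool2`, `bce`, `multi_scale_loss`), the use of binary cross-entropy, the 2x2 average pooling of the ground-truth mask, and the fixed per-scale weights are all assumptions made for illustration; in particular, the fixed weights do not capture whatever "adaptive" mechanism the paper uses.

```python
import math


def avg_pool2(grid):
    """Halve a 2D grid (even height and width) by 2x2 average pooling."""
    h, w = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2D grids."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def multi_scale_loss(preds, mask, weights):
    """Weighted sum of per-scale losses (hypothetical stand-in, fixed weights).

    preds[0] is at full resolution; each later entry is half the previous size.
    The mask is average-pooled to match each prediction, giving soft targets.
    """
    loss, target = 0.0, mask
    for i, (pred, weight) in enumerate(zip(preds, weights)):
        if i > 0:
            target = avg_pool2(target)
        loss += weight * bce(pred, target)
    return loss


# Toy 4x4 binary mask with one object in the top-left corner.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
uncertain = [[[0.5] * 4 for _ in range(4)], [[0.5] * 2 for _ in range(2)]]
print(multi_scale_loss(uncertain, mask, weights=[1.0, 0.5]))
```

In this sketch, a prediction that matches the mask at both scales yields a loss close to zero, while the uniformly uncertain prediction above is penalized at every scale. A real implementation would operate on tensors in a deep learning framework and would follow the loss definition given in the paper.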

updated: Sat Sep 11 2021 19:31:52 GMT+0000 (UTC)

published: Sat Sep 11 2021 19:31:52 GMT+0000 (UTC)

arXiv
