UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer

Haonan Wang; Peng Cao; Jiaqi Wang; Osmar R. Zaiane

Most recent semantic segmentation methods adopt a U-Net framework with an encoder-decoder architecture. However, it remains challenging for U-Net, with its simple skip connection scheme, to model global multi-scale context: 1) not every skip connection setting is effective, owing to incompatible feature sets in the encoder and decoder stages, and some skip connections even degrade segmentation performance; 2) on some datasets, the original U-Net performs worse than a variant without any skip connections. Based on these findings, we propose a new segmentation framework, named UCTransNet (U-Net with a proposed CTrans module), designed from a channel-wise perspective with an attention mechanism. Specifically, the CTrans module replaces the U-Net skip connections and consists of two sub-modules: one that performs multi-scale Channel Cross fusion with Transformer (named CCT), and a Channel-wise Cross-Attention sub-module (named CCA) that guides the fused multi-scale channel-wise information to connect effectively to the decoder features, eliminating ambiguity. Hence, the proposed connection, consisting of the CCT and CCA, can replace the original skip connection and bridge the semantic gap for accurate automatic medical image segmentation. The experimental results suggest that our UCTransNet produces more precise segmentations and achieves consistent improvements over the state of the art for semantic segmentation across different datasets and conventional architectures involving transformer or U-shaped frameworks. Code: https://github.com/McGregorWwww/UCTransNet.
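The abstract does not spell out how the CCA sub-module computes its attention, so the following is only a minimal sketch of the general idea of channel-wise gating: skip features are rescaled per channel before they reach the decoder. It assumes a squeeze-and-excitation-style gate built from globally pooled skip and decoder descriptors. The function names, the weight matrices `w_skip` and `w_dec`, and the flat-list feature layout are all hypothetical and are not taken from the paper or its repository.

```python
import math


def global_average_pool(feature_map):
    """Reduce each channel (a flat list of spatial values) to its mean."""
    return [sum(channel) / len(channel) for channel in feature_map]


def linear(weights, vector):
    """Multiply a (C_out x C_in) weight matrix by a length-C_in vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(skip, decoder, w_skip, w_dec):
    """Rescale each skip channel by a gate in (0, 1).

    Assumption: the gate mixes pooled skip and decoder descriptors through
    two hypothetical weight matrices; this is an illustrative stand-in,
    not the paper's CCA definition.
    """
    pooled_skip = global_average_pool(skip)      # one value per skip channel
    pooled_dec = global_average_pool(decoder)    # one value per decoder channel
    logits = [
        a + b
        for a, b in zip(linear(w_skip, pooled_skip), linear(w_dec, pooled_dec))
    ]
    gates = [sigmoid(z) for z in logits]
    # Broadcast each channel's gate over that channel's spatial positions.
    gated = [[g * value for value in channel] for g, channel in zip(gates, skip)]
    return gated, gates


# Toy input: 2 skip channels and 1 decoder channel, 3 spatial positions each.
skip_features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
decoder_features = [[0.5, 0.5, 0.5]]
w_skip = [[0.1, 0.0], [0.0, -0.1]]   # hypothetical 2x2 weights
w_dec = [[0.2], [0.2]]               # hypothetical 2x1 weights
gated_features, gate_values = channel_gate(
    skip_features, decoder_features, w_skip, w_dec
)
```

In a full network the gated skip features would then be combined with the decoder features at the same scale; the weights here are fixed toy values, whereas a trained model would learn them.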

updated: Fri Dec 03 2021 13:50:09 GMT+0000 (UTC)

published: Thu Sep 09 2021 15:18:20 GMT+0000 (UTC)

arXiv
