ADS_UNet: A Nested UNet for Histopathology Image Segmentation

Yilong Yang; Srinandan Dasmahapatra; Sasan Mahmoodi

The UNet model consists of fully convolutional network (FCN) layers arranged as contracting encoder and upsampling decoder maps. Nested arrangements of these encoder and decoder maps give rise to extensions of the UNet model, such as UNet^e and UNet++. Other refinements include constraining the outputs of the convolutional layers to discriminate between segment labels when trained end to end, a property called deep supervision. This reduces feature diversity in these nested UNet models despite their large parameter space. Furthermore, for texture segmentation, pixel correlations at multiple scales contribute to the classification task; hence, explicit deep supervision of shallower layers is likely to enhance performance. In this paper, we propose ADS_UNet, a stage-wise additive training algorithm that incorporates resource-efficient deep supervision in shallower layers and takes performance-weighted combinations of the sub-UNets to create the segmentation model. We provide empirical evidence on three histopathology datasets to support the claim that the proposed ADS_UNet reduces correlations between constituent features and improves performance while being more resource efficient. We demonstrate that ADS_UNet outperforms state-of-the-art Transformer-based models by 1.08 and 0.6 points on the CRAG and BCSS datasets, respectively, while requiring only 37% of the GPU consumption and 34% of the training time of those Transformer-based models.
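The idea of a performance-weighted combination of sub-UNets can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the code below assumes the simplest choice: each sub-model's class probabilities are averaged with weights proportional to a per-model score, and each pixel takes the class with the highest combined probability. The function names, the toy probabilities, and the scores are all hypothetical and are not taken from the paper.

```python
def performance_weights(scores):
    """Normalize non-negative per-model scores so that they sum to 1.

    Assumption: weights are simply proportional to the scores; the
    paper may use a different rule.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


def combine_predictions(prob_maps, scores):
    """Weighted average of per-model class probabilities, then per-pixel argmax.

    prob_maps[m][p][c] is the probability that model m assigns to class c
    at pixel p (pixels are flattened into a list for simplicity).
    """
    weights = performance_weights(scores)
    n_pixels = len(prob_maps[0])
    n_classes = len(prob_maps[0][0])
    labels = []
    for p in range(n_pixels):
        combined = [
            sum(w * pm[p][c] for w, pm in zip(weights, prob_maps))
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=combined.__getitem__))
    return labels


# Toy example: three hypothetical sub-models, two pixels, two classes.
prob_maps = [
    [[0.9, 0.1], [0.6, 0.4]],  # sub-model 1
    [[0.2, 0.8], [0.3, 0.7]],  # sub-model 2
    [[0.4, 0.6], [0.2, 0.8]],  # sub-model 3
]
scores = [0.6, 0.2, 0.2]  # made-up validation scores, not from the paper
labels = combine_predictions(prob_maps, scores)
print(labels)
```

In this toy case an unweighted average would leave the first pixel tied between the two classes, whereas the weighting lets the higher-scoring sub-model decide it.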

updated: Mon Apr 10 2023 13:08:48 GMT+0000 (UTC)

published: Mon Apr 10 2023 13:08:48 GMT+0000 (UTC)

arXiv
