MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets

Siyi Du; Nourhan Bayasi; Ghassan Hamarneh; Rafeef Garbi

Despite its clinical utility, medical image segmentation (MIS) remains a daunting task due to images' inherent complexity and variability. Vision transformers (ViTs) have recently emerged as a promising solution to improve MIS; however, they require larger training datasets than convolutional neural networks. To overcome this obstacle, data-efficient ViTs were proposed, but they are typically trained using a single source of data, which overlooks the valuable knowledge that could be leveraged from other available datasets. Naively combining datasets from different domains can result in negative knowledge transfer (NKT), i.e., a decrease in model performance on some domains with non-negligible inter-domain heterogeneity. In this paper, we propose MDViT, the first multi-domain ViT that includes domain adapters to mitigate data-hunger and combat NKT by adaptively exploiting knowledge in multiple small data resources (domains). Further, to enhance representation learning across domains, we integrate a mutual knowledge distillation paradigm that transfers knowledge between a universal network (spanning all the domains) and auxiliary domain-specific branches. Experiments on 4 skin lesion segmentation datasets show that MDViT outperforms state-of-the-art algorithms, achieving superior segmentation performance while keeping a fixed model size at inference time, even as more domains are added. Our code is available at https://github.com/siyi-wind/MDViT.
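
The mutual knowledge distillation idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the loss used, so everything below is an assumption made for illustration: a per-pixel Bernoulli KL divergence applied in both directions between the universal network's output and one domain-specific branch's output, with the hypothetical names `binary_kl` and `mutual_distillation_losses`. It is not taken from the MDViT repository and should not be read as the authors' implementation.

```python
import math


def sigmoid(x):
    """Map a logit to a foreground probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_kl(p, q, eps=1e-7):
    """KL(Bernoulli(p) || Bernoulli(q)) for a single pixel."""
    # Clamp to avoid log(0) when a prediction saturates.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def mutual_distillation_losses(universal_logits, branch_logits):
    """Return (loss_for_universal, loss_for_branch), averaged over pixels.

    Assumed form: each network is pulled toward the other's prediction,
    which is treated as a fixed target for that term.
    """
    pu = [sigmoid(z) for z in universal_logits]
    pb = [sigmoid(z) for z in branch_logits]
    n = len(pu)
    # Universal network learns from the domain-specific branch.
    loss_universal = sum(binary_kl(b, u) for u, b in zip(pu, pb)) / n
    # Domain-specific branch learns from the universal network.
    loss_branch = sum(binary_kl(u, b) for u, b in zip(pu, pb)) / n
    return loss_universal, loss_branch


if __name__ == "__main__":
    # Flattened logits for a toy 4-pixel mask (made-up numbers).
    universal = [2.0, -1.0, 0.5, -3.0]
    branch = [1.5, -0.5, 1.0, -2.0]
    print(mutual_distillation_losses(universal, branch))
```

In a real training loop these two terms would be added to each network's supervised segmentation loss. Because only the universal network would be kept for inference under this reading, the auxiliary branches add no parameters at test time, which is consistent with the fixed model size the abstract claims.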

updated: Fri Jun 07 2024 08:44:54 GMT+0000 (UTC)

published: Wed Jul 05 2023 08:19:29 GMT+0000 (UTC)

arXiv
