Advancing 3D Medical Image Analysis with Variable Dimension Transform based Supervised 3D Pre-training

Shu Zhang; Zihao Li; Hong-Yu Zhou; Jiechao Ma; Yizhou Yu

The difficulty of both data acquisition and annotation substantially restricts the sample sizes of training datasets for 3D medical imaging applications. As a result, constructing high-performance 3D convolutional neural networks from scratch remains difficult in the absence of sufficiently pre-trained parameters. Previous efforts on 3D pre-training have frequently relied on self-supervised approaches, which use either predictive or contrastive learning on unlabeled data to build invariant 3D representations. However, because large-scale supervision information is unavailable, obtaining semantically invariant and discriminative representations from these learning frameworks remains problematic. In this paper, we revisit an innovative yet simple fully-supervised 3D network pre-training framework that takes advantage of semantic supervision from large-scale 2D natural image datasets. With a redesigned 3D network architecture, reformulated natural images are used to address the problem of data scarcity and develop powerful 3D representations. Comprehensive experiments on four benchmark datasets demonstrate that the proposed pre-trained models effectively accelerate convergence while also improving accuracy for a variety of 3D medical imaging tasks such as classification, segmentation, and detection. In addition, compared to training from scratch, they can save up to 60% of annotation effort. On the NIH DeepLesion dataset, they also achieve state-of-the-art detection performance, outperforming earlier self-supervised and fully-supervised pre-training approaches as well as methods trained from scratch. To facilitate further development of 3D medical models, our code and pre-trained model weights are publicly available at https://github.com/urmagicsmine/CSPR.
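The abstract does not spell out how natural images are "reformulated" for a 3D network. One simple reading, assumed here purely for illustration and not confirmed by the abstract, is that the color channels of a 2D image are moved onto a depth axis so that the image becomes a thin single-channel volume. The sketch below shows that idea in plain Python; the function name `rgb_to_pseudo_volume` and the `1 x D x H x W` layout are hypothetical choices, not taken from the authors' code.

```python
from typing import List

# Assumed layouts (illustrative only, not taken from the paper):
#   Image:  H x W x C  (channels last)
#   Volume: 1 x D x H x W  (single input channel, depth D = C)
Image = List[List[List[float]]]
Volume = List[List[List[List[float]]]]


def rgb_to_pseudo_volume(image: Image) -> Volume:
    """Move the color channels of an H x W x C image onto a depth axis."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    # Build one H x W slice per color channel.
    depth_slices = [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
    # Wrap the slices in a single input channel.
    return [depth_slices]


if __name__ == "__main__":
    # A toy 1 x 2 image with three color channels per pixel.
    toy_image = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
    volume = rgb_to_pseudo_volume(toy_image)
    print(len(volume), len(volume[0]), len(volume[0][0]), len(volume[0][0][0]))
```

Under this assumption, a network built from 3D convolutions could consume labeled natural images directly, which is one way large-scale 2D supervision might be carried over to 3D models. The released repository should be consulted for the transform the authors actually use.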

updated: Wed Jan 05 2022 03:11:21 GMT+0000 (UTC)

published: Wed Jan 05 2022 03:11:21 GMT+0000 (UTC)

arXiv
