2.75D: Boosting Learning Efficiency and Capability by Representing 3D Features in 2D

Ruisheng Su; Weiyi Xie; Tao Tan

In medical imaging, 3D convolutional neural networks (CNNs) have outperformed 2D CNNs in numerous deep learning tasks with high-dimensional input, demonstrating the added value of 3D spatial information in feature representation. However, 3D CNNs require more training samples to converge, and more computational resources and execution time for both training and inference. Moreover, applying transfer learning to 3D CNNs is challenging due to a lack of publicly available pre-trained 3D networks. To tackle these issues, we propose a novel strategic 2D representation of volumetric data, namely the 2.75D approach. In our method, the spatial information of a 3D image is captured in a single 2D view by a spiral-spinning technique. Our CNN is therefore intrinsically a 2D network, which can fully leverage pre-trained 2D CNNs for downstream vision problems. We evaluated the proposed 2.75D method on the LUNA16 nodule detection challenge by comparing it with its 2D, 2.5D, and 3D counterparts on the nodule false positive reduction task. Results show that the proposed method outperforms its counterparts when all methods are trained from scratch. The performance gain is more pronounced when transfer learning is introduced or when training data is limited. In addition, our method achieves a substantial reduction in training and inference time compared with the 3D method. Our code will be publicly available.
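The abstract does not spell out how the spiral-spinning step works. As a rough illustration only, the sketch below assumes the 2D view is built by sampling voxels along radial lines that run from the volume centre to points on a spherical spiral winding from one pole to the other. The function name `spiral_unwrap`, its parameters, the `[z][y][x]` indexing, and the nearest-neighbour sampling are all hypothetical choices made for this example and are not taken from the authors' code.

```python
import math


def spiral_unwrap(volume, n_turns=4, n_angles=32, n_radial=8):
    """Flatten a cubic 3D volume into a 2D image (illustrative sketch only).

    `volume` is a nested list indexed as volume[z][y][x] with equal side
    lengths. Each output row is one radial line from the volume centre to a
    point on a spherical spiral; each column is a sample along that line.
    Requires n_angles >= 2 and n_radial >= 2.
    """
    size = len(volume)
    centre = (size - 1) / 2.0
    radius = centre  # largest sphere that fits inside the cube

    image = []
    for i in range(n_angles):
        # Assumed spiral: the polar angle sweeps pole to pole while the
        # azimuth winds around the sphere n_turns times.
        polar = math.pi * i / (n_angles - 1)
        azimuth = 2.0 * math.pi * n_turns * i / (n_angles - 1)
        direction = (
            math.cos(polar),                      # z component
            math.sin(polar) * math.sin(azimuth),  # y component
            math.sin(polar) * math.cos(azimuth),  # x component
        )

        row = []
        for j in range(n_radial):
            r = radius * j / (n_radial - 1)
            # Nearest-neighbour lookup; an interpolating sampler would be
            # smoother but is omitted to keep the sketch short.
            z, y, x = (int(round(centre + r * d)) for d in direction)
            row.append(volume[z][y][x])
        image.append(row)
    return image
```

Under these assumptions, a cubic patch is turned into an `n_angles` by `n_radial` image whose first column always holds the centre voxel, and the result could be passed to any ordinary 2D CNN. The spiral density and the number of radial samples are free parameters in this sketch; the paper's actual settings are not given in the abstract.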

updated: Wed Nov 25 2020 19:44:20 GMT+0000 (UTC)

published: Tue Feb 11 2020 08:24:19 GMT+0000 (UTC)

arXiv
