One-Shot Object Localization in Medical Images based on Relative Position Regression

Wenhui Lei; Wei Xu; Ran Gu; Hao Fu; Shaoting Zhang; Guotai Wang

Deep learning networks have shown promising performance for accurate object localization in medical images, but they require a large amount of annotated data for supervised training, which is expensive and demands specialist expertise. To address this problem, we present a one-shot framework for organ and landmark localization in volumetric medical images, which does not need any annotation during the training stage and can be employed to locate any landmark or organ in test images given a support (reference) image during the inference stage. Our main idea comes from the observation that tissues and organs from different human bodies have similar relative positions and contexts. Therefore, we can predict the relative positions of their non-local patches and thus locate the target organ. Our framework is composed of three parts: (1) A projection network trained to predict the 3D offset between any two patches from the same volume, where human annotations are not required. In the inference stage, it takes one given landmark in a reference image as a support patch and predicts the offset from a random patch to the corresponding landmark in the test (query) volume. (2) A coarse-to-fine framework containing two projection networks, providing more accurate localization of the target. (3) Based on the coarse-to-fine model, we transfer organ bounding box (B-box) detection to locating six extreme points along the x, y and z directions in the query volume. Experiments on multi-organ localization from head-and-neck (HaN) CT volumes showed that our method achieved competitive performance in real time, and is more accurate and 10^5 times faster than template matching methods under the same setting. Code is available at https://github.com/LWHYC/RPR-Loc.
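
As a rough illustration of the geometry described in the abstract, the Python sketch below shows how an annotation-free offset label could be built from two patches of the same volume, how a predicted offset moves a query patch center toward a landmark, and how six extreme points yield a bounding box. The function names, the (z, y, x) axis order, the voxel spacing in mm, and the min/max rule for the box are assumptions made for illustration; they are not taken from the RPR-Loc code, and the projection network itself is omitted.

```python
import random


def sample_patch_pair(volume_shape, patch_size, spacing_mm, rng):
    """Sample two patch centers from one volume plus their offset label.

    The label is the physical displacement (in mm) from the query patch
    center to the support patch center, so no manual annotation is used.
    """
    def random_center():
        center = []
        for size, patch in zip(volume_shape, patch_size):
            half = patch // 2
            # Keep the whole patch inside the volume along this axis.
            center.append(rng.randint(half, size - patch + half))
        return tuple(center)

    query, support = random_center(), random_center()
    offset_mm = tuple(
        (s - q) * sp for q, s, sp in zip(query, support, spacing_mm)
    )
    return query, support, offset_mm


def apply_offset(query_center, offset_mm, spacing_mm):
    """Move a query patch center by an offset in mm; return a voxel index."""
    return tuple(
        round(q + d / sp)
        for q, d, sp in zip(query_center, offset_mm, spacing_mm)
    )


def bbox_from_extreme_points(points):
    """Turn six located extreme points (z, y, x) into a bounding box.

    Assumption: the box is the per-axis min and max over all six points.
    """
    lower = tuple(min(p[axis] for p in points) for axis in range(3))
    upper = tuple(max(p[axis] for p in points) for axis in range(3))
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(0)
    spacing = (3.0, 1.0, 1.0)  # hypothetical voxel spacing in mm
    q, s, label = sample_patch_pair((40, 128, 128), (16, 32, 32), spacing, rng)
    # With a perfect prediction, the query center lands on the support center.
    print(apply_offset(q, label, spacing) == s)
```

In a coarse-to-fine setting, `apply_offset` would presumably be called twice: once with the coarse prediction from a random starting patch, then again from the resulting position with the fine prediction.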

updated: Sun Dec 13 2020 11:54:19 GMT+0000 (UTC)

published: Sun Dec 13 2020 11:54:19 GMT+0000 (UTC)

arXiv
