Bidirectional Semi-supervised Dual-branch CNN for Robust 3D Reconstruction of Stereo Endoscopic Images via Adaptive Cross and Parallel Supervisions

Hongkuan Shi; Zhiwei Wang; Ying Zhou; Dun Li; Xin Yang; Qiang Li

Semi-supervised learning via a teacher-student network can train a model effectively on a few labeled samples. It enables a student model to distill knowledge from the teacher's predictions on extra unlabeled data. However, this knowledge flow is typically unidirectional, which leaves performance vulnerable to the quality of the teacher model. In this paper, we pursue robust 3D reconstruction of stereo endoscopic images by proposing a novel form of bidirectional learning between two learners, each of which plays the roles of teacher and student concurrently. Specifically, we introduce two self-supervisions, Adaptive Cross Supervision (ACS) and Adaptive Parallel Supervision (APS), to train a dual-branch convolutional neural network. The two branches predict two different disparity probability distributions for the same position and output the expectations of those distributions as disparity values. The learned knowledge flows across branches along two directions: a cross direction (disparity guides distribution, in ACS) and a parallel direction (disparity guides disparity, in APS). Moreover, each branch also learns confidences that dynamically refine the supervision it provides. In ACS, the predicted disparity is softened into a unimodal distribution, and the lower the confidence, the smoother the distribution. In APS, incorrect predictions are suppressed by lowering the weights of those with low confidence. With this adaptive bidirectional learning, the two branches receive well-tuned supervision from each other and eventually converge on a consistent and more accurate disparity estimate. Extensive experiments on three public datasets demonstrate superior performance over fully-supervised and semi-supervised state-of-the-art methods, with the averaged disparity error reduced by at least 13.95% and 3.90%, respectively.
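The mechanics described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not give the shape of the unimodal distribution, the mapping from confidence to smoothness, or the loss functions, so the Laplacian-shaped target, the linear confidence-to-scale mapping (`min_scale`, `max_scale`), the cross-entropy loss for ACS, and the confidence-weighted L1 loss for APS are all assumptions chosen for illustration. All function names are hypothetical.

```python
import numpy as np

def expected_disparity(prob):
    """Disparity as the expectation of a distribution; prob has shape (..., D)."""
    levels = np.arange(prob.shape[-1])
    return (prob * levels).sum(axis=-1)

def soften_to_unimodal(disparity, confidence, num_levels,
                       min_scale=0.5, max_scale=4.0):
    """Soften scalar disparities into unimodal distributions over num_levels bins.

    Assumption: a Laplacian-shaped target whose scale grows linearly as the
    confidence (in [0, 1]) drops, so low confidence gives a smoother target.
    """
    levels = np.arange(num_levels)
    scale = max_scale - (max_scale - min_scale) * confidence
    logits = -np.abs(levels - disparity[..., None]) / scale[..., None]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

def cross_supervision_loss(student_prob, teacher_disp, teacher_conf):
    """ACS-style term (assumed cross-entropy): disparity guides distribution."""
    target = soften_to_unimodal(teacher_disp, teacher_conf, student_prob.shape[-1])
    return float(-(target * np.log(student_prob + 1e-8)).sum(axis=-1).mean())

def parallel_supervision_loss(student_disp, teacher_disp, teacher_conf):
    """APS-style term (assumed weighted L1): disparity guides disparity,
    with low-confidence teacher predictions down-weighted."""
    weights = teacher_conf
    error = np.abs(student_disp - teacher_disp)
    return float((weights * error).sum() / (weights.sum() + 1e-8))
```

In a bidirectional setup, each branch would serve in turn as the teacher for the other: the disparity and confidence of branch A supervise the distribution and disparity of branch B, and vice versa, with the two resulting loss terms summed.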

updated: Wed Oct 19 2022 09:27:34 GMT+0000 (UTC)

published: Sat Oct 15 2022 13:30:41 GMT+0000 (UTC)

arXiv
