Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation

Ange Lou; Kareem Tawfik; Xing Yao; Ziteng Liu; Jack Noble

A common problem in neural-network-based segmentation of medical images is the difficulty of obtaining a large amount of pixel-level annotated data for training. To address this issue, we propose a semi-supervised segmentation network based on contrastive learning. In contrast to the previous state-of-the-art, we introduce Min-Max Similarity (MMS), a contrastive learning form of dual-view training that employs classifiers to build all-negative feature pairs and projectors to build positive and negative feature pairs, and that formulates the learning as solving an MMS problem. The all-negative pairs are used to supervise the networks learning from different views and to capture general features, and the consistency of unlabeled predictions is measured by a pixel-wise contrastive loss between positive and negative pairs. To evaluate our proposed method quantitatively and qualitatively, we test it on four public endoscopy surgical tool segmentation datasets and one cochlear implant surgery dataset, which we manually annotated. Results indicate that our proposed method consistently outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms. Our semi-supervised segmentation algorithm can also successfully recognize unknown surgical tools and provide good predictions. In addition, our MMS approach can achieve inference speeds of about 40 frames per second (fps), making it suitable for real-time video segmentation.
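To make the idea of a pixel-wise contrastive loss between positive and negative pairs more concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the formulation used in the paper: it assumes a generic InfoNCE-style loss in which the same pixel seen in two views forms the positive pair and all other pixels in the second view act as negatives. The function names `cosine` and `pixel_contrastive_loss`, the `temperature` value, and the toy feature vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero


def pixel_contrastive_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over per-pixel features (assumed form, not the paper's).

    view_a, view_b: lists of per-pixel feature vectors from two views,
    where index i in both lists refers to the same pixel location.
    """
    assert len(view_a) == len(view_b)
    total = 0.0
    for i, feat in enumerate(view_a):
        # Similarity of pixel i in view A to every pixel in view B.
        logits = [cosine(feat, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over positives and negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching pixel (index i) as the positive.
        total += log_denominator - logits[i]
    return total / len(view_a)


# Toy example: three "pixels" with 2-D features.
features = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled = [[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]

aligned_loss = pixel_contrastive_loss(features, features)
mismatched_loss = pixel_contrastive_loss(features, shuffled)
print(aligned_loss < mismatched_loss)
```

In this sketch the loss is small when the two views agree pixel by pixel and grows when corresponding pixels carry dissimilar features, which is the sense in which such a loss can measure prediction consistency on unlabeled data.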

updated: Wed Feb 22 2023 17:41:24 GMT+0000 (UTC)

published: Tue Mar 29 2022 01:40:26 GMT+0000 (UTC)

arXiv
