Self-Knowledge Distillation for Surgical Phase Recognition

Jinglu Zhang; Santiago Barbarisi; Abdolrahim Kadkhodamohammadi; Danail Stoyanov; Imanol Luengo

Purpose: Advances in surgical phase recognition are generally driven by training deeper networks. Rather than pursuing ever more complex solutions, we believe that current models can be exploited better. We propose a self-knowledge distillation framework that can be integrated into current state-of-the-art (SOTA) models without adding complexity to the models or requiring extra annotations.

Methods: Knowledge distillation is a framework for network regularization in which knowledge is distilled from a teacher network to a student network. In self-knowledge distillation, the student model becomes the teacher, so that the network learns from itself. Most phase recognition models follow an encoder-decoder framework. Our framework uses self-knowledge distillation in both stages. The teacher model guides the training of the student model so that the encoder extracts enhanced feature representations and the temporal decoder becomes more robust against the over-segmentation problem.

Results: We validate the proposed framework on the public dataset Cholec80. Our framework is embedded on top of four popular SOTA approaches and consistently improves their performance. Specifically, our best GRU model improves on the same baseline model by +3.33% accuracy and +3.95% F1-score.

Conclusion: We embed a self-knowledge distillation framework in the surgical phase recognition training pipeline for the first time. Experimental results demonstrate that our simple yet powerful framework can improve the performance of existing phase recognition models. Moreover, our extensive experiments show that even with 75% of the training set we still achieve performance on par with the same baseline model trained on the full set.
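The distillation idea described in the Methods can be illustrated with a generic loss. The following is a minimal sketch, not the authors' implementation: it assumes the teacher logits come from an earlier snapshot of the same network, and it uses the common temperature-softened KL formulation. The names `self_distillation_loss`, `temperature`, and `alpha`, and the three-class example values, are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the ground-truth class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(student_logits, teacher_logits, label,
                           temperature=2.0, alpha=0.5):
    """Blend the supervised loss with a term that pulls the student
    towards the softened predictions of its own earlier snapshot.

    Assumption: `teacher_logits` were produced by a frozen copy of the
    same network; `alpha` and `temperature` are illustrative values.
    """
    hard = cross_entropy(softmax(student_logits), label)
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1.0 - alpha) * hard + alpha * temperature ** 2 * soft


# One frame, three hypothetical phases, ground-truth phase index 0.
student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.0, -2.0]
loss = self_distillation_loss(student, teacher, label=0)
```

When the teacher and student logits are identical, the KL term vanishes and only the weighted cross-entropy remains, so the extra term penalizes the student only where it departs from its own earlier predictions.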

updated: Thu Jun 15 2023 08:55:00 GMT+0000 (UTC)

published: Thu Jun 15 2023 08:55:00 GMT+0000 (UTC)

arXiv
