Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery

Long Bai; Mobarakol Islam; Hongliang Ren

A visual-question localized-answering (VQLA) system can serve as a knowledgeable assistant in surgical education. Besides providing text-based answers, a VQLA system can highlight the region of interest for better surgical scene understanding. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning new knowledge: when DNNs learn incremental classes or tasks, their performance on old tasks drops dramatically. Furthermore, medical data privacy and licensing issues often make it difficult to access old data when updating continual learning (CL) models. We therefore develop a non-exemplar continual surgical VQLA framework to explore and balance the rigidity-plasticity trade-off of DNNs in a sequential learning paradigm. We revisit the distillation loss in CL tasks and propose rigidity-plasticity-aware distillation (RP-Dist) and self-calibrated heterogeneous distillation (SH-Dist) to preserve old knowledge. The weight aligning (WA) technique is also integrated to adjust the weight bias between old and new tasks. We further establish a CL framework on three public surgical datasets, in a surgical setting where the old and new VQLA tasks share overlapping classes. Extensive experiments demonstrate that the proposed method reconciles learning and forgetting on continual surgical VQLA better than conventional CL methods. Our code is publicly accessible.
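The abstract names two general ingredients, a distillation loss and weight aligning, without giving formulas. As a rough illustration only, the sketch below shows a generic temperature-softened distillation loss and one common formulation of weight aligning (rescaling new-class classifier weights so their mean norm matches that of the old classes). It is not RP-Dist or SH-Dist, whose definitions are not given here; the function names, the temperature value, and the toy numbers are all assumptions made for the example.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; subtracting the max keeps exp() stable.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(old_logits, new_logits, temperature=2.0):
    # Generic distillation term (assumed form): cross-entropy between the
    # softened outputs of the frozen old model and those of the new model.
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(p_old, p_new))

def weight_align(weights, num_old):
    # Assumed weight-aligning form: scale the new-class rows of the
    # classifier so their mean L2 norm equals the old-class mean norm.
    norms = [math.sqrt(sum(v * v for v in row)) for row in weights]
    mean_old = sum(norms[:num_old]) / num_old
    mean_new = sum(norms[num_old:]) / (len(weights) - num_old)
    gamma = mean_old / mean_new
    return [
        list(row) if i < num_old else [v * gamma for v in row]
        for i, row in enumerate(weights)
    ]

# Toy usage: the loss grows when the new model drifts from the old one.
old = [2.0, 0.5, -1.0]
loss_same = distillation_loss(old, old)
loss_drift = distillation_loss(old, [-1.0, 0.5, 2.0])

# Toy classifier: two old-class rows (norm 1) and one new-class row (norm 5).
aligned = weight_align([[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]], num_old=2)
```

In this toy setting the distillation term is smallest when the new model reproduces the old model's outputs, which is the "rigidity" side of the trade-off; the weight rescaling counters the tendency of newly trained classes to dominate the classifier.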

updated: Sat Jul 22 2023 10:35:25 GMT+0000 (UTC)

published: Sat Jul 22 2023 10:35:25 GMT+0000 (UTC)

arXiv
