Exploiting Segment-level Semantics for Online Phase Recognition from Surgical Videos

Xinpeng Ding; Xiaomeng Li

Automatic surgical phase recognition plays an important role in robot-assisted surgeries. Existing methods overlook a pivotal point: surgical phases should be classified by learning segment-level semantics rather than relying solely on frame-wise information. In this paper, we present a segment-attentive hierarchical consistency network (SAHC) for surgical phase recognition from videos. The key idea is to extract hierarchical, high-level, semantically consistent segments and use them to refine the erroneous predictions caused by ambiguous frames. To achieve this, we design a temporal hierarchical network that generates hierarchical high-level segments. We then introduce a hierarchical segment-frame attention (SFA) module to capture relations between the low-level frames and the high-level segments. By regularizing the predictions of frames and their corresponding segments via a consistency loss, the network can generate semantically consistent segments and then rectify the misclassified predictions caused by ambiguous low-level frames. We validate SAHC on two public surgical video datasets, i.e., the M2CAI16 challenge dataset and the Cholec80 dataset. Experimental results show that our method outperforms previous state-of-the-art methods by a large margin, notably achieving a 4.1% improvement on M2CAI16. Code will be released on GitHub upon acceptance.
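To make the idea of regularizing frame predictions against their segment concrete, here is a minimal sketch of one possible frame-to-segment consistency term. The abstract does not specify the actual loss, the way segments are formed, or the attention module, so everything here is an assumption for illustration: segments are taken to be fixed, non-overlapping windows of average-pooled frame probabilities, the penalty is a mean squared difference, and the names `pool_segments` and `consistency_loss` are hypothetical.

```python
from typing import List

# Class probabilities indexed as [time step][surgical phase].
Probs = List[List[float]]


def pool_segments(frame_probs: Probs, window: int) -> Probs:
    """Average frame-level probabilities over non-overlapping windows.

    Assumption: a "segment" is a fixed-size window of frames. The paper's
    temporal hierarchical network may build segments differently.
    """
    segments = []
    for start in range(0, len(frame_probs), window):
        chunk = frame_probs[start:start + window]
        num_classes = len(chunk[0])
        segments.append(
            [sum(frame[c] for frame in chunk) / len(chunk) for c in range(num_classes)]
        )
    return segments


def consistency_loss(frame_probs: Probs, segment_probs: Probs, window: int) -> float:
    """Mean squared difference between each frame and the segment containing it.

    Assumption: squared error stands in for whatever consistency loss the
    authors use; it is zero when every frame agrees with its segment.
    """
    total = 0.0
    count = 0
    for t, frame in enumerate(frame_probs):
        segment = segment_probs[t // window]
        for p_frame, p_segment in zip(frame, segment):
            total += (p_frame - p_segment) ** 2
            count += 1
    return total / count


# Four frames, two phases; the third frame is an ambiguous outlier.
frames = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.9, 0.1]]
segments = pool_segments(frames, window=4)
loss = consistency_loss(frames, segments, window=4)
```

In this toy input the third frame disagrees with its neighbors, so it contributes most of the penalty; minimizing such a term would pull that frame's prediction toward the segment-level consensus, which is the behavior the abstract describes.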

updated: Mon Nov 22 2021 08:18:05 GMT+0000 (UTC)

published: Mon Nov 22 2021 08:18:05 GMT+0000 (UTC)

arXiv
