DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning

Tongan Cai; Haomiao Ni; Mingli Yu; Xiaolei Huang; Kelvin Wong; John Volpi; James Z. Wang; Stephen T. C. Wong

In an emergency room (ER) setting, diagnosing stroke is a common challenge. Because MRI scans are slow and costly, they are usually not available in the ER. Stroke screening commonly relies on clinical tests, but neurologists may not be immediately available. We propose DeepStroke, a novel multimodal deep learning framework for computer-aided assessment of stroke presence, which recognizes patterns of facial motion incoordination and speech inability in patients with suspected stroke in an acute setting. DeepStroke takes video data for local facial paralysis detection and audio data for global speech disorder analysis. It further uses multimodal lateral fusion to combine low- and high-level features and to provide mutual regularization for joint training. A novel adversarial training loss is also introduced to obtain identity-independent and stroke-discriminative features. Experiments on our video-audio dataset of actual ER patients show that the proposed approach outperforms state-of-the-art models and performs better than ER doctors, attaining 6.60% higher sensitivity and 4.62% higher accuracy when specificity is aligned. Each assessment can be completed in less than 6 minutes, demonstrating the framework's strong potential for clinical implementation.
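The comparison "when specificity is aligned" can be illustrated with a small sketch. The abstract does not state how the alignment was performed, so the following is only one plausible reading: pick the model's decision threshold whose specificity is closest to the clinicians' specificity, then compare sensitivity and accuracy at that operating point. All function names and the toy labels, scores, and clinician decisions below are hypothetical and are not taken from the paper.

```python
# Illustrative only: toy data and helper names are assumptions, not the paper's.

def confusion_counts(labels, preds):
    """Return (tp, tn, fp, fn) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(labels, preds):
    tp, _, _, fn = confusion_counts(labels, preds)
    return tp / (tp + fn)

def specificity(labels, preds):
    _, tn, fp, _ = confusion_counts(labels, preds)
    return tn / (tn + fp)

def accuracy(labels, preds):
    tp, tn, _, _ = confusion_counts(labels, preds)
    return (tp + tn) / len(labels)

def binarize(scores, threshold):
    """Predict stroke (1) when the score reaches the threshold."""
    return [int(s >= threshold) for s in scores]

def threshold_matching_specificity(labels, scores, target_spec):
    """Pick the threshold whose specificity is closest to target_spec.

    Candidates are tried in ascending order, so ties resolve to the
    lowest threshold, which favors sensitivity.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: abs(specificity(labels, binarize(scores, t)) - target_spec),
    )

# Hypothetical toy data: 1 = stroke, 0 = non-stroke.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
clinician_preds = [1, 1, 0, 0, 1, 0, 0, 0]

clinician_spec = specificity(labels, clinician_preds)
threshold = threshold_matching_specificity(labels, model_scores, clinician_spec)
model_preds = binarize(model_scores, threshold)

print("aligned specificity:", specificity(labels, model_preds))
print("sensitivity gain:", sensitivity(labels, model_preds) - sensitivity(labels, clinician_preds))
print("accuracy gain:", accuracy(labels, model_preds) - accuracy(labels, clinician_preds))
```

Fixing specificity first makes the sensitivity difference a like-for-like comparison: both the model and the clinicians are evaluated at the same false-positive rate.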

updated: Fri Sep 24 2021 16:46:13 GMT+0000 (UTC)

published: Fri Sep 24 2021 16:46:13 GMT+0000 (UTC)

arXiv
