FUSQA: Fetal Ultrasound Segmentation Quality Assessment

Sevim Cengiz; Ibrahim Almakky; Mohammad Yaqub

Deep learning models have been effective for various fetal ultrasound segmentation tasks. However, their generalization to new, unseen data has raised questions about their suitability for clinical adoption. Normally, a transition to new unseen data requires time-consuming and costly quality assurance processes to validate segmentation performance after the transition. Segmentation quality assessment efforts have focused on natural images, where the problem has typically been formulated as a Dice score regression task. In this paper, we propose a simplified Fetal Ultrasound Segmentation Quality Assessment (FUSQA) model to assess segmentation quality when no reference masks exist to compare with. We formulate segmentation quality assessment as an automated classification task that distinguishes between good and poor-quality segmentation masks, with the aim of more accurate gestational age estimation. We validate the performance of our proposed approach on two datasets collected from two hospitals using different ultrasound machines. We compare different architectures; our best-performing architecture achieves over 90% classification accuracy in distinguishing between good and poor-quality segmentation masks from an unseen dataset. Additionally, the gestational age estimated from crown-rump length (CRL) measurements on well-segmented masks differed from the gestational age reported by doctors by only 1.45 days. In contrast, this difference rose to as much as 7.73 days when CRL was calculated from the poorly segmented masks. These results suggest that AI-based approaches can potentially aid fetal ultrasound segmentation quality assessment and might detect poor segmentation in real-time screening in the future.
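To make the distinction between Dice score regression and good/poor classification concrete, the following is a minimal sketch, not the authors' implementation. It computes the standard Dice overlap between a predicted and a reference mask and then turns it into a binary quality label. The abstract does not say how quality labels were obtained, so the `quality_label` helper and its `threshold` of 0.8 are purely hypothetical. A reference mask is only available when preparing labelled data; at deployment no such mask exists, which is the situation a quality classifier is meant to handle.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1 values


def dice_score(pred: Mask, ref: Mask) -> float:
    """Dice overlap 2|A∩B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = 0
    pred_area = 0
    ref_area = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            intersection += 1 if (p and r) else 0
            pred_area += 1 if p else 0
            ref_area += 1 if r else 0
    if pred_area + ref_area == 0:
        return 1.0  # both masks empty: treated as full agreement (a convention)
    return 2.0 * intersection / (pred_area + ref_area)


def quality_label(dice: float, threshold: float = 0.8) -> str:
    """Map a Dice score to a binary label. The threshold is an assumption."""
    return "good" if dice >= threshold else "poor"


# Toy 2x4 masks, invented for illustration only.
ref = [[0, 1, 1, 0],
       [0, 1, 1, 0]]
pred_close = [[0, 1, 1, 0],
              [0, 1, 0, 0]]
pred_far = [[1, 0, 0, 0],
            [0, 1, 0, 0]]

label_close = quality_label(dice_score(pred_close, ref))
label_far = quality_label(dice_score(pred_far, ref))
```

A regression formulation would try to predict the continuous value returned by `dice_score` directly from the image and mask, whereas the classification formulation described in the abstract only needs to predict the binary label.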

updated: Wed Mar 08 2023 07:45:06 GMT+0000 (UTC)

published: Wed Mar 08 2023 07:45:06 GMT+0000 (UTC)

arXiv
