Unsupervised Anomaly Detection in Medical Images with a Memory-augmented Multi-level Cross-attentional Masked Autoencoder

Yu Tian; Guansong Pang; Yuyuan Liu; Chong Wang; Yuanhong Chen; Fengbei Liu; Rajvinder Singh; Johan W Verjans; Mengyu Wang; Gustavo Carneiro

Unsupervised anomaly detection (UAD) aims to find anomalous images by optimising a detector on a training set that contains only normal images. UAD approaches can be based on reconstruction methods, self-supervised learning, or ImageNet pre-trained models. Reconstruction methods, which detect anomalies from image reconstruction errors, are advantageous because they rely neither on the design of the problem-specific pretext tasks needed by self-supervised approaches nor on the unreliable transfer of models pre-trained on non-medical datasets. However, reconstruction methods may fail because they can produce low reconstruction errors even for anomalous images. In this paper, we introduce a new reconstruction-based UAD approach that addresses this low-reconstruction-error issue for anomalous images. Our UAD approach, the memory-augmented multi-level cross-attentional masked autoencoder (MemMC-MAE), is a transformer-based approach consisting of a novel memory-augmented self-attention operator for the encoder and a new multi-level cross-attention operator for the decoder. MemMC-MAE masks large parts of the input image during reconstruction, which reduces the risk of low reconstruction errors because anomalies are likely to be masked and therefore cannot be reconstructed. When the anomaly is not masked, the normal patterns stored in the encoder's memory, combined with the decoder's multi-level cross-attention, constrain the accurate reconstruction of the anomaly. We show that our method achieves state-of-the-art (SOTA) anomaly detection and localisation on colonoscopy, pneumonia, and COVID-19 chest X-ray datasets.
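
The general idea of scoring anomalies by reconstruction error under heavy patch masking can be illustrated with a minimal sketch. This is not the MemMC-MAE implementation: the function names, the 0.75 mask ratio, the flat-list patch representation, and the stand-in "reconstruction" (every patch replaced by a single stored normal pattern) are all assumptions made for illustration. A trained transformer encoder and decoder would take the place of that stand-in.

```python
import random


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return the sorted indices of patches hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    return sorted(rng.sample(range(num_patches), num_masked))


def patch_errors(original, reconstruction):
    """Mean squared error per patch; each patch is a flat list of pixel values."""
    return [
        sum((o - r) ** 2 for o, r in zip(p_orig, p_rec)) / len(p_orig)
        for p_orig, p_rec in zip(original, reconstruction)
    ]


def anomaly_score(original, reconstruction):
    """Image-level score: the mean of the per-patch reconstruction errors."""
    errors = patch_errors(original, reconstruction)
    return sum(errors) / len(errors)


# Toy image: 16 patches of 4 pixels each, with one anomalous patch.
rng = random.Random(0)
normal_patch = [0.5, 0.5, 0.5, 0.5]
image = [list(normal_patch) for _ in range(16)]
image[5] = [1.0, 1.0, 1.0, 1.0]

# Hide most patches from the (hypothetical) encoder; 0.75 is an assumed ratio.
masked = random_patch_mask(len(image), mask_ratio=0.75, rng=rng)

# Hypothetical stand-in for a model that can only emit normal patterns:
# every patch, masked or visible, is reconstructed as the stored normal patch.
reconstruction = [list(normal_patch) for _ in image]

errors = patch_errors(image, reconstruction)
score = anomaly_score(image, reconstruction)
```

In this toy setting the per-patch errors localise the anomaly (only patch 5 has a non-zero error), and the image-level score grows with the size and contrast of the anomalous region.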

updated: Tue Aug 22 2023 02:16:37 GMT+0000 (UTC)

published: Tue Mar 22 2022 13:32:42 GMT+0000 (UTC)

arXiv
