Vision Transformer for Classification of Breast Ultrasound Images

Behnaz Gheflati; Hassan Rivaz

Medical ultrasound (US) imaging has become a prominent modality for breast cancer imaging due to its ease of use, low cost, and safety. In the past decade, convolutional neural networks (CNNs) have emerged as the method of choice in vision applications and have shown excellent potential in the automatic classification of US images. Despite their success, their restricted local receptive field limits their ability to learn global context information. Recently, Vision Transformer (ViT) designs, which are based on self-attention between image patches, have shown great potential as an alternative to CNNs. In this study, for the first time, we utilize ViT to classify breast US images using different augmentation strategies. The results are reported as classification accuracy and Area Under the Curve (AUC) metrics, and the performance is compared with that of state-of-the-art CNNs. The results indicate that the ViT models perform comparably to, or even better than, the CNNs in the classification of breast US images.
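The two reported metrics, classification accuracy and AUC, can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, the label convention (1 = malignant, 0 = benign), and the toy scores are all assumptions made for illustration, not values from the study. AUC is computed here through its pairwise interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    if len(labels) != len(scores) or len(labels) == 0:
        raise ValueError("labels and scores must be non-empty and equal in length")
    # A score at or above the (assumed) threshold is predicted as class 1.
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Compare every positive score with every negative score.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and classifier scores, for illustration only.
    toy_labels = [1, 0, 1, 0, 1, 0]
    toy_scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]
    print("accuracy:", accuracy(toy_labels, toy_scores))
    print("AUC:", roc_auc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration. Unlike accuracy, AUC does not depend on the chosen threshold, which is one reason the two metrics are often reported together.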

updated: Wed Mar 16 2022 22:57:35 GMT+0000 (UTC)

published: Wed Oct 27 2021 19:33:23 GMT+0000 (UTC)

arXiv
