Retinal Image Restoration using Transformer and Cycle-Consistent Generative Adversarial Network

Alnur Alimanov; Md Baharul Islam

Medical imaging plays a significant role in detecting and treating various diseases. However, medical images are often of poor quality, which leads to decreased efficiency, extra expense, and even incorrect diagnoses. We therefore propose a retinal image enhancement method that combines a vision transformer with a convolutional neural network. The method builds a cycle-consistent generative adversarial network that is trained on unpaired datasets. It consists of two generators that translate images from one domain to the other (e.g., low-quality to high-quality and vice versa) and play an adversarial game against two discriminators. The generators try to produce images that the discriminators cannot tell apart from original images, while the discriminators try to distinguish original images from generated ones. Each generator combines a vision transformer (ViT) encoder with a convolutional neural network (CNN) decoder. The discriminators use conventional CNN encoders. The enhanced images were evaluated quantitatively, using peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), and qualitatively, through vessel segmentation. The proposed method reduces the adverse effects of blurring, noise, illumination disturbances, and color distortions while largely preserving structural and color information. Experimental results show the superiority of the proposed method. The testing PSNR is 31.138 dB on the first dataset and 27.798 dB on the second. The testing SSIM is 0.919 and 0.904, respectively.
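For readers unfamiliar with the PSNR figures quoted above, the metric can be sketched as follows. This is a minimal illustration of the standard PSNR definition, not the authors' evaluation code: the function name `psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `reference` and `restored` are flat sequences of pixel intensities
    (an assumption for this sketch; real images would be flattened first).
    `max_value` is the largest possible intensity, 255 for 8-bit images.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the two images.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [100, 100, 100, 100]
    noisy = [110, 90, 110, 90]
    print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

Higher values indicate that the restored image is closer to the reference, so the reported 31.138 dB corresponds to a smaller mean squared error than the reported 27.798 dB.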

updated: Fri Mar 03 2023 14:10:47 GMT+0000 (UTC)

published: Fri Mar 03 2023 14:10:47 GMT+0000 (UTC)

arXiv
