Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition

Cindy M. Nguyen; Eric R. Chan; Alexander W. Bergman; Gordon Wetzstein

Images are indispensable for the automation of high-level tasks, such as text recognition. Low-light conditions pose a challenge for these high-level perception stacks, which are often optimized on well-lit, artifact-free images. Reconstruction methods for low-light images can produce well-lit counterparts, but typically at the cost of high-frequency details critical for downstream tasks. We propose Diffusion in the Dark (DiD), a diffusion model for low-light image reconstruction whose reconstructions are qualitatively competitive with those of state-of-the-art (SOTA) methods while preserving high-frequency details even in extremely noisy, dark conditions. We demonstrate that DiD, without any task-specific optimization, can outperform SOTA low-light methods in low-light text recognition on real images, bolstering the potential of diffusion models for ill-posed inverse problems.

updated: Tue Mar 07 2023 23:52:51 GMT+0000 (UTC)

published: Tue Mar 07 2023 23:52:51 GMT+0000 (UTC)

arXiv
