Unsupervised HDR Image and Video Tone Mapping via Contrastive Learning

Cong Cao; Huanjing Yue; Xin Liu; Jingyu Yang

Capturing high dynamic range (HDR) images (videos) is attractive because it can reveal the details in both dark and bright regions. Since mainstream screens only support low dynamic range (LDR) content, a tone mapping algorithm is required to compress the dynamic range of HDR images (videos). Although image tone mapping has been widely explored, video tone mapping lags behind, especially for deep-learning-based methods, due to the lack of HDR-LDR video pairs. In this work, we propose a unified framework (IVTMNet) for unsupervised image and video tone mapping. To improve unsupervised training, we propose a domain- and instance-based contrastive learning loss. Instead of using a universal feature extractor, such as VGG, to extract the features for similarity measurement, we propose a novel latent code, which is an aggregation of the brightness and contrast of extracted features, to measure the similarity of different pairs. In total, we construct two negative pairs and three positive pairs to constrain the latent codes of tone-mapped results. For video tone mapping, we propose a temporal-feature-replaced (TFR) module to efficiently utilize the temporal correlation and improve the temporal consistency of video tone-mapped results. We construct a large-scale unpaired HDR-LDR video dataset to facilitate the unsupervised training process for video tone mapping. Experimental results demonstrate that our method outperforms state-of-the-art image and video tone mapping methods. Our code and dataset will be released after the acceptance of this work.
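The abstract describes a latent code that aggregates the brightness and contrast of extracted features, and a contrastive loss built from positive and negative pairs. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes that brightness can be approximated by the per-channel mean and contrast by the per-channel standard deviation, and that the loss is a ratio of the mean distance to positives over the mean distance to negatives. The function names `latent_code`, `l1_distance`, and `contrastive_loss` are hypothetical.

```python
import math


def latent_code(feature_maps):
    """Build a latent code from feature maps.

    feature_maps: list of channels, each a flat list of floats.
    Assumption: brightness ~ channel mean, contrast ~ channel std.
    Returns [mean_1, std_1, mean_2, std_2, ...].
    """
    code = []
    for channel in feature_maps:
        n = len(channel)
        mean = sum(channel) / n
        variance = sum((v - mean) ** 2 for v in channel) / n
        code.extend([mean, math.sqrt(variance)])
    return code


def l1_distance(a, b):
    """Mean absolute difference between two latent codes of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def contrastive_loss(anchor, positives, negatives, eps=1e-8):
    """Hypothetical ratio-style loss: small when the anchor code is close
    to the positive codes and far from the negative codes."""
    pos = sum(l1_distance(anchor, p) for p in positives) / len(positives)
    neg = sum(l1_distance(anchor, n) for n in negatives) / len(negatives)
    return pos / (neg + eps)
```

In such a setup, the anchor would be the latent code of a tone-mapped result, while the positive and negative codes would come from whichever pairs the method constructs (the abstract mentions three positive and two negative pairs but does not specify them here).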

updated: Mon Mar 13 2023 17:45:39 GMT+0000 (UTC)

published: Mon Mar 13 2023 17:45:39 GMT+0000 (UTC)

arXiv
