No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

S. Alireza Golestaneh; Saba Dadsetan; Kris M. Kitani

The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate perceptual image quality in accordance with subjective evaluations; it is a complex and unsolved problem due to the absence of a pristine reference image. In this paper, we propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and the self-attention mechanism in Transformers to extract both local and non-local features from the input image. We capture local structure information of the image via CNNs; then, to circumvent the locality bias among the extracted CNN features and obtain a non-local representation of the image, we apply Transformers to the extracted features, which we model as a sequential input to the Transformer model. Furthermore, to improve the monotonicity correlation between the subjective and objective scores, we utilize the relative distance information among the images within each batch and enforce the relative ranking among them. Last but not least, we observe that the performance of NR-IQA models degrades when we apply equivariant transformations (e.g., horizontal flipping) to the inputs. Therefore, we propose a method that leverages self-consistency as a source of self-supervision to improve the robustness of NR-IQA models. Specifically, we enforce self-consistency between the outputs of our quality assessment model for each image and its transformation (horizontally flipped) to utilize the rich self-supervisory information and reduce the uncertainty of the model. To demonstrate the effectiveness of our work, we evaluate it on seven standard IQA datasets (both synthetic and authentic) and show that our model achieves state-of-the-art results on various datasets.
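
The two auxiliary objectives described in the abstract, relative ranking within a batch and self-consistency under horizontal flipping, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss formulas, so the pairwise hinge form, the `margin` value, and the absolute-difference consistency term are assumptions chosen for illustration, and the two toy scorers (`left_weighted_score`, `mean_score`) are hypothetical stand-ins for a real quality model.

```python
from itertools import combinations


def pairwise_ranking_loss(preds, targets, margin=0.1):
    """Mean hinge loss over all pairs in a batch whose subjective scores differ.

    Assumed form (not taken from the paper): if targets[i] > targets[j],
    preds[i] should exceed preds[j] by at least `margin`.
    """
    losses = []
    for i, j in combinations(range(len(preds)), 2):
        if targets[i] == targets[j]:
            continue  # no ordering to enforce for tied subjective scores
        # sign is +1 if image i should rank above image j, else -1
        sign = 1.0 if targets[i] > targets[j] else -1.0
        losses.append(max(0.0, margin - sign * (preds[i] - preds[j])))
    return sum(losses) / len(losses) if losses else 0.0


def hflip(image):
    """Horizontally flip an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def self_consistency_loss(score_fn, images):
    """Mean absolute gap between the score of each image and of its flip."""
    gaps = [abs(score_fn(img) - score_fn(hflip(img))) for img in images]
    return sum(gaps) / len(gaps)


def left_weighted_score(image):
    """Hypothetical scorer that is NOT flip-invariant: left columns weigh more."""
    total = 0.0
    for row in image:
        for col, pixel in enumerate(row):
            total += pixel / (col + 1)
    return total


def mean_score(image):
    """Hypothetical scorer that IS flip-invariant: plain pixel mean."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


if __name__ == "__main__":
    # Predictions that already respect the subjective ordering.
    print(pairwise_ranking_loss([0.9, 0.2, 0.5], [0.8, 0.3, 0.6]))
    # One 2x2 image that is bright on the left and dark on the right.
    batch = [[[1.0, 0.0], [1.0, 0.0]]]
    print(self_consistency_loss(left_weighted_score, batch))
    print(self_consistency_loss(mean_score, batch))
```

In a training setup along these lines, both terms would be added to the main quality regression loss with weighting coefficients; those weights are not stated in the abstract and would need to be taken from the paper itself.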

updated: Mon Aug 16 2021 02:07:08 GMT+0000 (UTC)

published: Mon Aug 16 2021 02:07:08 GMT+0000 (UTC)

arXiv
