Deep Learning based Full-reference and No-reference Quality Assessment Models for Compressed UGC Videos

Wei Sun; Tao Wang; Xiongkuo Min; Fuwang Yi; Guangtao Zhai

In this paper, we propose a deep learning based video quality assessment (VQA) framework to evaluate the quality of compressed user-generated content (UGC) videos. The proposed VQA framework consists of three modules: the feature extraction module, the quality regression module, and the quality pooling module. In the feature extraction module, we fuse the features from the intermediate layers of a convolutional neural network (CNN) into the final quality-aware feature representation, which enables the model to make full use of visual information from low level to high level. Specifically, the structure and texture similarities of the feature maps extracted from all intermediate layers are calculated as the feature representation for the full-reference (FR) VQA model, and the global mean and standard deviation of the final feature maps, obtained by fusing the intermediate feature maps, are calculated as the feature representation for the no-reference (NR) VQA model. In the quality regression module, we use a fully connected (FC) layer to regress the quality-aware features into frame-level scores. Finally, a subjectively inspired temporal pooling strategy is adopted to pool the frame-level scores into a video-level score. The proposed model achieves the best performance among state-of-the-art FR and NR VQA models on the Compressed UGC VQA database and also performs well on in-the-wild UGC VQA databases.
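The feature representations described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the (C, H, W) feature-map layout, the stabilizing constant eps, and the DISTS-style form of the texture and structure similarity terms are all assumptions made for illustration, and the CNN backbone and the fusion of intermediate layers are taken as given. The temporal pooling step is left out because the abstract does not specify it.

```python
import numpy as np


def nr_features(feature_map):
    """Hypothetical NR features: per-channel global mean and standard
    deviation of one (already fused) feature map of shape (C, H, W)."""
    flat = feature_map.reshape(feature_map.shape[0], -1)
    return np.concatenate([flat.mean(axis=1), flat.std(axis=1)])


def fr_features(ref_map, dist_map, eps=1e-6):
    """Hypothetical FR features: per-channel texture (mean-based) and
    structure (covariance-based) similarity between the reference and
    distorted feature maps of one layer, both of shape (C, H, W).
    The exact similarity formulas are an assumption, not from the abstract."""
    c = ref_map.shape[0]
    r = ref_map.reshape(c, -1)
    d = dist_map.reshape(c, -1)
    mu_r, mu_d = r.mean(axis=1), d.mean(axis=1)
    var_r, var_d = r.var(axis=1), d.var(axis=1)
    cov = ((r - mu_r[:, None]) * (d - mu_d[:, None])).mean(axis=1)
    texture = (2 * mu_r * mu_d + eps) / (mu_r ** 2 + mu_d ** 2 + eps)
    structure = (2 * cov + eps) / (var_r + var_d + eps)
    return np.concatenate([texture, structure])


def regress_frame_score(features, weights, bias):
    """A single fully connected layer mapping a feature vector to one
    frame-level score; the weights would be learned in practice."""
    return float(features @ weights + bias)
```

With identical reference and distorted maps, every similarity value returned by `fr_features` is 1, which is a quick sanity check on the formulas; in the FR setting the per-layer vectors would be concatenated across all intermediate layers before regression.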

updated: Wed Jun 02 2021 12:23:16 GMT+0000 (UTC)

published: Wed Jun 02 2021 12:23:16 GMT+0000 (UTC)

arXiv
