Exploring the Effectiveness of Video Perceptual Representation in Blind Video Quality Assessment

Liang Liao; Kangmin Xu; Haoning Wu; Chaofeng Chen; Wenxiu Sun; Qiong Yan; Weisi Lin

With the rapid growth of in-the-wild videos taken by non-specialists, blind video quality assessment (VQA) has become a challenging and demanding problem. Although considerable effort has been devoted to this problem, it remains unclear how the human visual system (HVS) relates to the temporal quality of videos. Meanwhile, recent work has found that the frames of a natural video, when transformed into the perceptual domain of the HVS, tend to form a straight trajectory of representations. Building on the insight that distortion impairs perceived video quality and results in a curved trajectory of the perceptual representation, we propose a temporal perceptual quality index (TPQI) that measures temporal distortion by describing the graphic morphology of the representation. Specifically, we first extract the video perceptual representations from the lateral geniculate nucleus (LGN) and primary visual area (V1) of the HVS, and then measure the straightness and compactness of their trajectories to quantify the degradation in naturalness and content continuity of the video. Experiments show that the perceptual representation in the HVS is an effective way of predicting subjective temporal quality. As a result, TPQI can, for the first time, achieve performance comparable to that of spatial quality metrics, and it is even more effective in assessing videos with large temporal variations. We further demonstrate that, when combined with NIQE, a spatial quality metric, TPQI can achieve top performance on popular in-the-wild video datasets. More importantly, TPQI does not require any additional information beyond the video being evaluated and thus can be applied to any dataset without parameter tuning. Source code is available at https://github.com/UoLMM/TPQI-VQA.
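
To make the notion of trajectory straightness concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each frame has already been mapped to a feature vector (the LGN and V1 models themselves are not shown), and it measures curvature as the mean angle between consecutive displacement vectors of the trajectory. The function name `mean_curvature` and the parameter `eps` are hypothetical and chosen for illustration only.

```python
import numpy as np


def mean_curvature(reps, eps=1e-12):
    """Mean angle (radians) between consecutive displacement vectors.

    `reps` is a (T, D) array: one D-dimensional representation per frame.
    This is an illustrative straightness measure, not the TPQI code.
    A perfectly straight trajectory yields 0.
    """
    reps = np.asarray(reps, dtype=float)
    if reps.ndim != 2 or reps.shape[0] < 3:
        raise ValueError("expected an array of shape (T, D) with T >= 3")

    # Displacement between successive frames, shape (T-1, D).
    diffs = np.diff(reps, axis=0)
    norms = np.linalg.norm(diffs, axis=1)

    # Skip pairs where either step has (near-)zero length,
    # because the angle is undefined there.
    valid = (norms[:-1] > eps) & (norms[1:] > eps)
    if not np.any(valid):
        return 0.0

    dots = np.sum(diffs[:-1] * diffs[1:], axis=1)[valid]
    cosines = dots / (norms[:-1] * norms[1:])[valid]

    # Clip to guard against rounding slightly outside [-1, 1].
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(angles.mean())


if __name__ == "__main__":
    straight = [[0, 0], [1, 1], [2, 2], [3, 3]]
    bent = [[0, 0], [1, 0], [1, 1]]
    print(mean_curvature(straight))  # close to 0 for a straight line
    print(mean_curvature(bent))      # a single right-angle turn
```

Under this sketch, a larger value indicates a more curved trajectory, which the abstract associates with stronger temporal distortion. How TPQI actually combines straightness and compactness across LGN and V1 is defined in the paper and the linked repository, not here.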

updated: Fri Jul 08 2022 07:30:51 GMT+0000 (UTC)

published: Fri Jul 08 2022 07:30:51 GMT+0000 (UTC)

arXiv
