Treating Point Cloud as Moving Camera Videos: A No-Reference Quality Assessment Metric

Zicheng Zhang; Wei Sun; Yucheng Zhu; Xiongkuo Min; Wei Wu; Ying Chen; Guangtao Zhai

The point cloud is one of the most widely used digital representation formats for three-dimensional (3D) content. Its visual quality may suffer from noise and geometric shift distortions during production, as well as from compression and downsampling distortions during transmission. To tackle the challenge of point cloud quality assessment (PCQA), many methods have been proposed that evaluate the visual quality of point clouds by assessing rendered static 2D projections. Although such projection-based PCQA methods achieve competitive performance with the help of mature image quality assessment (IQA) methods, they neglect that a 3D model is also perceived in a dynamic viewing manner, where the viewpoint changes continually according to feedback from the rendering device. Therefore, in this paper, we treat point clouds as moving camera videos and explore how to handle PCQA tasks using video quality assessment (VQA) methods. First, we generate the captured videos by rotating the camera around the point clouds along several circular pathways. Then we extract spatial quality-aware features from selected key frames using a trainable 2D-CNN model, and temporal quality-aware features from the video clips using a pre-trained 3D-CNN model. Finally, the visual quality of a point cloud is represented by the quality value of its video. The experimental results show that the proposed method is effective at predicting the visual quality levels of point clouds and is even competitive with full-reference (FR) PCQA methods. The ablation studies further verify the rationality of the proposed framework and confirm the contribution of the quality-aware features extracted via the dynamic viewing manner.
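
The video-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the number of pathways, the frame count, the camera distance, or how key frames are chosen, so the function names (`circular_camera_path`, `split_into_clips`), the choice of orbit axes, and the "first frame of each clip is the key frame" rule are all assumptions made for illustration. Rendering and the 2D-CNN / 3D-CNN feature extractors are left out.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def circular_camera_path(center, radius, n_frames, axis="z"):
    """Camera positions evenly spaced on a circle around `center`.

    `axis` names the rotation axis of the orbit (hypothetical parameter);
    the camera is assumed to always look at `center`.
    """
    cx, cy, cz = center
    positions = []
    for k in range(n_frames):
        theta = 2.0 * math.pi * k / n_frames
        c, s = radius * math.cos(theta), radius * math.sin(theta)
        if axis == "z":
            positions.append((cx + c, cy + s, cz))
        elif axis == "y":
            positions.append((cx + c, cy, cz + s))
        elif axis == "x":
            positions.append((cx, cy + c, cz + s))
        else:
            raise ValueError("axis must be 'x', 'y' or 'z'")
    return positions


def split_into_clips(n_frames, clip_len):
    """Split frame indices into non-overlapping clips.

    Returns (key_frame_index, frame_indices) pairs. Using the first frame
    of each clip as its key frame is an assumption, not taken from the paper.
    """
    clips = []
    for start in range(0, n_frames - clip_len + 1, clip_len):
        clips.append((start, list(range(start, start + clip_len))))
    return clips


# Toy point cloud: the eight corners of a unit cube.
cloud = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
center = centroid(cloud)

# One orbit per axis; 120 frames and radius 3.0 are illustrative values.
paths = {ax: circular_camera_path(center, 3.0, 120, ax) for ax in ("x", "y", "z")}

# Key frames would feed the spatial (2D-CNN) branch,
# whole clips the temporal (3D-CNN) branch.
clips = split_into_clips(120, 30)
```

In this sketch each orbit yields one video; a renderer would place the camera at each position, point it at `center`, and capture a frame.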

updated: Sun Sep 11 2022 05:48:52 GMT+0000 (UTC)

published: Tue Aug 30 2022 08:59:41 GMT+0000 (UTC)

arXiv
