Applying VertexShuffle Toward 360-Degree Video Super-Resolution on Focused-Icosahedral-Mesh

Na Li; Yao Liu

With the emergence of 360-degree images and videos, augmented reality (AR), and virtual reality (VR), the demand for analysing and processing spherical signals has increased tremendously. However, much of the effort so far has gone into planar signals projected from spherical signals, which leads to problems such as wasted pixels and distortion. Recent advances in spherical CNNs have opened up the possibility of analysing spherical signals directly. However, these methods operate on the full mesh, which makes them infeasible in real-world applications because of the extremely large bandwidth requirement. To address the bandwidth waste associated with 360-degree video streaming and to save computation, we use a Focused Icosahedral Mesh to represent a small area and construct matrices that rotate spherical content to the focused mesh area. We also propose a novel VertexShuffle operation that significantly improves both performance and efficiency compared to the original MeshConv Transpose operation introduced in UGSCNN. We further apply our proposed methods to a super-resolution model, which is the first spherical super-resolution model that operates directly on a mesh representation of the spherical pixels of 360-degree data. To evaluate our model, we also collect a set of high-resolution 360-degree videos to generate a spherical image dataset. Our experiments indicate that the proposed spherical super-resolution model achieves significant benefits in both performance and inference time compared to a baseline spherical super-resolution model that uses the simple MeshConv Transpose operation. In summary, our model achieves strong super-resolution performance on 360-degree inputs, reaching 32.79 dB PSNR on average when super-resolving 16x vertices on the mesh.
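The headline result above is reported as PSNR in dB. As a rough illustration of how such a figure is conventionally computed, the sketch below applies the standard definition, 10 * log10(peak^2 / MSE), to two flat lists of per-vertex values. The function name `psnr`, the `peak` parameter, and the assumption that vertex values are normalised to [0, 1] are illustrative choices and are not taken from the paper; the authors' actual evaluation code may differ.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    `reference` and `estimate` are hypothetical flat lists of per-vertex
    values (assumed here to lie in [0, peak]).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of vertices")
    if not reference:
        raise ValueError("inputs must be non-empty")
    # Mean squared error over all vertices.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        # Identical signals: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

For instance, a uniform error of 0.1 on values normalised to [0, 1] gives an MSE of 0.01 and therefore a PSNR of about 20 dB, so higher values indicate a closer match to the reference.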

updated: Mon Jun 21 2021 16:53:57 GMT+0000 (UTC)

published: Mon Jun 21 2021 16:53:57 GMT+0000 (UTC)

arXiv
