A Hierarchical Coding Scheme for Glasses-free 3D Displays Based on Scalable Hybrid Layered Representation of Real-World Light Fields

Joshitha R; Mansi Sharma

This paper presents a novel hierarchical coding scheme for light fields based on transmittance patterns of low-rank multiplicative layers and Fourier disparity layers. The proposed scheme learns stacked multiplicative layers from subsets of light field views determined by different scanning orders. The multiplicative layers are optimized using a fast data-driven convolutional neural network (CNN). The spatial correlation in the layer patterns is exploited by factorizing them at varying low ranks, with the factorization derived from singular value decomposition on a Krylov subspace. Encoding the low-rank approximated layers with HEVC then efficiently removes the intra-view and inter-view correlation that remains in them. The initial subset of approximated views decoded from the multiplicative representation is used to construct a Fourier disparity layer (FDL) representation. The FDL model synthesizes a second subset of views, which is identified by a pre-defined hierarchical prediction order. The correlations in the prediction residue of the synthesized views are further eliminated by encoding the residual signal. The set of views obtained from decoding the residual is used to refine the FDL model and predict the next subset of views with improved accuracy. This hierarchical procedure is repeated until all light field views are encoded. The critical advantage of the proposed hybrid layered representation and coding scheme is that it utilizes not just spatial and temporal redundancies, but also efficiently exploits the strong intrinsic similarities among neighboring sub-aperture images in both horizontal and vertical directions, as specified by the different prediction orders. In addition, the scheme is flexible enough to realize a range of bitrates at the decoder within a single integrated system. Compression performance evaluated on real light fields shows substantial bitrate savings while maintaining good reconstruction quality.
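The low-rank step mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out which Krylov-based SVD variant is used, so the code below assumes a randomized block Krylov iteration followed by a projection step; the function name `block_krylov_low_rank`, its parameters, and the synthetic "layer" are hypothetical and are not taken from the paper.

```python
import numpy as np


def block_krylov_low_rank(A, rank, iters=4, seed=0):
    """Rank-`rank` factors of A from a randomized block Krylov subspace.

    Illustrative sketch only (an assumption, not the authors' code).
    Returns U (m x rank), s (rank,), Vt (rank x n) with A ~ (U * s) @ Vt.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    # Random starting block: A times a Gaussian matrix with `rank` columns.
    block = A @ rng.standard_normal((n, rank))
    blocks = []
    for _ in range(iters + 1):
        # Orthonormalizing a block keeps its span but avoids overflow.
        block, _ = np.linalg.qr(block)
        blocks.append(block)
        # Next Krylov block: apply (A A^T) to the current block.
        block = A @ (A.T @ block)
    # Orthonormal basis of the whole Krylov subspace.
    Q, _ = np.linalg.qr(np.hstack(blocks))
    # Project A onto the subspace and take the SVD of the small matrix.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U_small[:, :rank], s[:rank], Vt[:rank]


if __name__ == "__main__":
    # Hypothetical stand-in for one layer pattern: smooth structure plus noise.
    rng = np.random.default_rng(42)
    x = np.linspace(0.0, 1.0, 128)
    layer = np.outer(np.sin(4 * x), np.cos(3 * x)) + np.outer(x, x**2)
    layer += 0.01 * rng.standard_normal(layer.shape)
    for k in (1, 2, 4):
        U, s, Vt = block_krylov_low_rank(layer, rank=k)
        err = np.linalg.norm(layer - (U * s) @ Vt) / np.linalg.norm(layer)
        print(f"rank {k}: relative Frobenius error {err:.4f}")
```

Varying `rank` trades approximation error against the amount of data left to encode, which is one plausible reading of how "varying low ranks" could feed into multiple bitrates; the paper itself should be consulted for the actual configuration.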

updated: Mon Apr 19 2021 15:09:21 GMT+0000 (UTC)

published: Mon Apr 19 2021 15:09:21 GMT+0000 (UTC)

arXiv
