Sparse Tensor-based Multiscale Representation for Point Cloud Geometry Compression

Jianqiang Wang; Dandan Ding; Zhu Li; Xiaoxing Feng; Chuntong Cao; Zhan Ma

This study develops a unified Point Cloud Geometry (PCG) compression method, called SparsePCGC, that processes voxelized PCG as a multiscale sparse tensor. SparsePCGC is a low-complexity solution because it performs convolutions only on sparsely distributed Most-Probable Positively-Occupied Voxels (MP-POVs). The multiscale representation also lets us compress scale-wise MP-POVs by exploiting cross-scale and same-scale correlations extensively and flexibly. The overall compression efficiency depends heavily on the accuracy of the occupancy probability estimated for each MP-POV. We therefore first design a Sparse Convolution-based Neural Network (SparseCNN) that stacks sparse convolutions and voxel sampling to best characterize and embed spatial correlations. We then develop the SparseCNN-based Occupancy Probability Approximation (SOPA) model to estimate the occupancy probability either in a single stage, using only the cross-scale correlation, or in multiple stages, exploiting stage-wise correlation among same-scale neighbors. In addition, we suggest the SparseCNN-based Local Neighborhood Embedding (SLNE), which aggregates local variations as spatial priors in feature attributes to improve the SOPA. Compared with the standardized MPEG G-PCC and other prevalent learning-based schemes, our unified approach shows state-of-the-art performance in both lossless and lossy compression modes across a variety of datasets, including dense object PCGs (8iVFB, Owlii, MUVB) and sparse LiDAR PCGs (KITTI, Ford). It also has low complexity, which makes it attractive for practical applications.
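The link between occupancy probability and compression efficiency can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`downscale`, `child_candidates`, `occupancy_labels`, `ideal_bits`) are hypothetical, dyadic downscaling is assumed, and a fixed probability array stands in for the learned SOPA model. The sketch downsamples occupied voxels by a factor of two, enumerates the eight candidate children of each occupied coarse voxel, and measures the ideal bit cost of coding their occupancy under a given probability estimate.

```python
import numpy as np

def downscale(coords):
    """Halve the resolution: voxels sharing a parent merge into one coarse voxel."""
    return np.unique(coords // 2, axis=0)

def child_candidates(parents):
    """Enumerate the 8 candidate children of every occupied coarse voxel."""
    offsets = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)])
    return (parents[:, None, :] * 2 + offsets[None, :, :]).reshape(-1, 3)

def occupancy_labels(candidates, coords):
    """Return 1.0 where a candidate child is occupied at the finer scale, else 0.0."""
    occupied = {tuple(c) for c in coords.tolist()}
    return np.array([tuple(c) in occupied for c in candidates.tolist()], dtype=np.float64)

def ideal_bits(labels, probs, eps=1e-9):
    """Cross-entropy in bits: the cost an ideal arithmetic coder would approach."""
    probs = np.clip(probs, eps, 1 - eps)
    return float(-(labels * np.log2(probs) + (1 - labels) * np.log2(1 - probs)).sum())

# Toy voxelized geometry (assumed example data, not from any dataset).
coords = np.array([[0, 0, 0], [1, 0, 0], [2, 3, 1], [7, 7, 7]])
parents = downscale(coords)                  # coarser scale
cands = child_candidates(parents)            # only children of occupied parents
labels = occupancy_labels(cands, coords)

# Placeholder estimators standing in for a learned model (assumption).
uniform_bits = ideal_bits(labels, np.full(len(cands), 0.5))
sharper_bits = ideal_bits(labels, np.where(labels == 1.0, 0.9, 0.1))
print(uniform_bits, sharper_bits)
```

Only children of occupied coarse voxels are ever considered, which is the source of the sparsity, and a more accurate probability estimate directly lowers the bit cost.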

updated: Fri Oct 21 2022 14:12:13 GMT+0000 (UTC)

published: Sat Nov 20 2021 17:02:45 GMT+0000 (UTC)

arXiv
