DeepCompress: Efficient Point Cloud Geometry Compression

Ryan Killea; Yun Li; Saeed Bastani; Paul McLachlan

Point clouds are a basic data type of increasing interest as 3D content becomes more ubiquitous. Applications using point clouds include virtual, augmented, and mixed reality and autonomous driving. We propose a more efficient deep learning-based encoder architecture for point cloud compression that incorporates principles from established 3D object detection and image compression architectures. Through an ablation study, we show that incorporating the learned activation function from Computationally Efficient Neural Image Compression (CENIC) and designing more parameter-efficient convolutional blocks yields dramatic gains in efficiency and performance. Our proposed architecture incorporates Generalized Divisive Normalization (GDN) activations and a spatially separable InceptionV4-inspired block. We then assess our model's performance using rate-distortion curves on the standard JPEG Pleno 8i Voxelized Full Bodies dataset. Our proposed modifications outperform the baseline approaches by a small margin in terms of Bjøntegaard delta rate and PSNR values, yet reduce the necessary encoder convolution operations by 8 percent and the total encoder parameters by 20 percent. Our proposed architecture, when considered on its own, has a small penalty of 0.02 percent in Chamfer distance and a 0.32 percent increase in bit rate in point-to-plane distance for the same peak signal-to-noise ratio.
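To illustrate why a spatially separable block can be more parameter-efficient, the following minimal sketch counts the weights of a single dense 3x3x3 convolution against a chain of three one-dimensional convolutions (3x1x1, 1x3x1, 1x1x3). The function names, the 64-channel width, and this particular factorization are assumptions chosen for illustration; they are not taken from the paper, and the 20 percent figure quoted in the abstract refers to the whole encoder rather than to a single layer like this one.

```python
def conv3d_params(c_in, c_out, kernel, bias=True):
    """Number of learnable parameters in one 3D convolution layer."""
    kd, kh, kw = kernel
    weights = c_in * c_out * kd * kh * kw
    return weights + (c_out if bias else 0)


def separable_conv3d_params(c_in, c_out, k=3, bias=True):
    """Parameters of an assumed factorization of a k x k x k convolution
    into three consecutive 1D convolutions, one per spatial axis."""
    first = conv3d_params(c_in, c_out, (k, 1, 1), bias)
    second = conv3d_params(c_out, c_out, (1, k, 1), bias)
    third = conv3d_params(c_out, c_out, (1, 1, k), bias)
    return first + second + third


# Hypothetical layer width, used only for this comparison.
channels = 64
full = conv3d_params(channels, channels, (3, 3, 3))
separable = separable_conv3d_params(channels, channels, 3)

print(f"dense 3x3x3 conv:     {full} parameters")
print(f"separable 1D chain:   {separable} parameters")
print(f"separable / dense:    {separable / full:.3f}")
```

With equal input and output widths, the dense layer scales with 27 weights per channel pair while the separable chain scales with 9, which is where the saving comes from. An actual encoder mixes several block types, so its overall reduction will be smaller than this per-layer ratio.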

updated: Wed Jun 02 2021 23:18:11 GMT+0000 (UTC)

published: Wed Jun 02 2021 23:18:11 GMT+0000 (UTC)

arXiv
