OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression

Chunyang Fu; Ge Li; Rui Song; Wei Gao; Shan Liu

In point cloud compression, sufficient context is essential for modeling the point cloud distribution. However, the context gathered by previous voxel-based methods shrinks when handling sparse point clouds. To address this problem, we propose OctAttention, a multiple-contexts deep learning framework built on the octree structure, a memory-efficient representation for point clouds. Our approach losslessly encodes octree symbol sequences by gathering the information of sibling and ancestor nodes. Specifically, we first represent point clouds with an octree to reduce spatial redundancy; this representation is robust to point clouds of different resolutions. We then design a conditional entropy model with a large receptive field that models the sibling and ancestor contexts to exploit the strong dependency among neighboring nodes, and we employ an attention mechanism to emphasize the correlated nodes in the context. Furthermore, we introduce a mask operation during training and testing to trade off encoding time against performance. Compared to previous state-of-the-art works, our approach obtains a 10%-35% BD-Rate gain on the LiDAR benchmark (e.g., SemanticKITTI) and object point cloud datasets (e.g., MPEG 8i, MVUB), and saves 95% of the coding time compared to the voxel-based baseline. The code is available at https://github.com/zb12138/OctAttention.
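
To make the "octree symbol sequence" in the abstract concrete, the following is a minimal sketch of how a voxelized point cloud could be serialized into breadth-first 8-bit occupancy symbols and recovered losslessly. It is not taken from the authors' repository: the function names `octree_occupancy_codes` and `decode_occupancy_codes`, the child-index bit order, and the integer-coordinate input are all assumptions made for illustration. The learned entropy model with sibling and ancestor attention that would predict each symbol is not shown.

```python
from collections import deque


def octree_occupancy_codes(points, depth):
    """Serialize integer points in [0, 2**depth)^3 into breadth-first
    occupancy symbols (one integer in 1..255 per non-leaf occupied node).

    Assumption: child index = (x_bit << 2) | (y_bit << 1) | z_bit, and
    child i sets bit i of the symbol. Other conventions are equally valid.
    """
    root = sorted(set(tuple(p) for p in points))  # drop duplicate voxels
    if not root:
        return []
    codes = []
    queue = deque([(root, 0)])  # (points inside this node, node level)
    while queue:
        node_points, level = queue.popleft()
        if level == depth:
            continue  # leaf voxel: nothing further to subdivide
        shift = depth - level - 1  # coordinate bit that selects the child
        children = [[] for _ in range(8)]
        for x, y, z in node_points:
            idx = (((x >> shift) & 1) << 2) | (((y >> shift) & 1) << 1) | ((z >> shift) & 1)
            children[idx].append((x, y, z))
        code = 0
        for idx, child in enumerate(children):
            if child:
                code |= 1 << idx
                queue.append((child, level + 1))
        codes.append(code)
    return codes


def decode_occupancy_codes(codes, depth):
    """Invert octree_occupancy_codes, returning the sorted voxel list."""
    if not codes:
        return []
    symbols = iter(codes)
    points = []
    queue = deque([((0, 0, 0), 0)])  # (node origin, node level)
    while queue:
        (x, y, z), level = queue.popleft()
        if level == depth:
            points.append((x, y, z))
            continue
        code = next(symbols)
        shift = depth - level - 1
        for idx in range(8):
            if code & (1 << idx):
                child = (
                    x | (((idx >> 2) & 1) << shift),
                    y | (((idx >> 1) & 1) << shift),
                    z | ((idx & 1) << shift),
                )
                queue.append((child, level + 1))
    return sorted(points)


if __name__ == "__main__":
    cloud = [(0, 0, 0), (3, 3, 3)]
    symbols = octree_occupancy_codes(cloud, depth=2)
    print(symbols)
    print(decode_occupancy_codes(symbols, depth=2) == sorted(cloud))
```

In this sketch, a sparse cloud costs one symbol per occupied node rather than one value per voxel, which is the memory advantage the abstract attributes to the octree. A context model in the spirit of the abstract would estimate the probability of each symbol from already-coded sibling and ancestor symbols and pass that estimate to an arithmetic coder.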

updated: Sun May 08 2022 02:40:28 GMT+0000 (UTC)

published: Sat Feb 12 2022 10:06:12 GMT+0000 (UTC)

arXiv
