SnakeVoxFormer: Transformer-based Single Image Voxel Reconstruction with Run Length Encoding

Jae Joong Lee; Bedrich Benes


Deep learning-based 3D object reconstruction has achieved unprecedented results, and transformer models in particular have shown outstanding performance in many computer vision applications. We introduce SnakeVoxFormer, a novel method for 3D object reconstruction in voxel space from a single image using a transformer. The input to SnakeVoxFormer is a 2D image, and the output is a 3D voxel model. The key novelty of our approach is a run-length encoding (RLE) that traverses the voxel space like a snake and encodes wide spatial differences into a 1D structure suitable for transformer encoding. We then use dictionary encoding to convert the discovered RLE blocks into tokens for the transformer. The resulting 1D representation is a lossless compression of the 3D shape data that uses only about 1% of the original data size. We show how different voxel traversal strategies affect encoding and reconstruction. Compared with state-of-the-art methods for 3D voxel reconstruction from images, our method improves on them by at least 2.8% and up to 19.8%.
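The encoding pipeline described in the abstract (snake traversal, run-length encoding, dictionary tokens) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the specific traversal order, the `(value, length)` run format, and the function names `snake_order`, `rle_encode`, `encode`, and `decode` are all assumptions made for illustration.

```python
def snake_order(shape):
    """Return voxel coordinates along a continuous back-and-forth path.

    Assumed traversal: x reverses on every row and y reverses on every
    slice, so consecutive voxels are always spatial neighbours.
    """
    Z, Y, X = shape
    coords = []
    forward = True
    for z in range(Z):
        ys = range(Y) if z % 2 == 0 else range(Y - 1, -1, -1)
        for y in ys:
            xs = range(X) if forward else range(X - 1, -1, -1)
            coords.extend((z, y, x) for x in xs)
            forward = not forward
    return coords


def rle_encode(bits):
    """Collapse a 0/1 sequence into (value, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a 0/1 sequence."""
    return [v for v, n in runs for _ in range(n)]


def encode(vox):
    """Encode a nested-list voxel grid vox[z][y][x] as dictionary tokens."""
    shape = (len(vox), len(vox[0]), len(vox[0][0]))
    bits = [vox[z][y][x] for z, y, x in snake_order(shape)]
    runs = rle_encode(bits)
    vocab = {}  # maps each distinct run to an integer token id
    for run in runs:
        vocab.setdefault(run, len(vocab))
    tokens = [vocab[run] for run in runs]
    return tokens, vocab, shape


def decode(tokens, vocab, shape):
    """Invert encode(): tokens -> runs -> bits -> voxel grid."""
    inverse = {token: run for run, token in vocab.items()}
    bits = rle_decode([inverse[t] for t in tokens])
    Z, Y, X = shape
    vox = [[[0] * X for _ in range(Y)] for _ in range(Z)]
    for (z, y, x), b in zip(snake_order(shape), bits):
        vox[z][y][x] = b
    return vox
```

Calling `encode` on a grid and passing the result to `decode` reproduces the grid exactly, which matches the lossless property claimed in the abstract. In this sketch the dictionary is built per shape; a vocabulary shared across a whole dataset, as a transformer would need, is left out for brevity.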

updated: Tue Mar 28 2023 20:16:13 GMT+0000 (UTC)

published: Tue Mar 28 2023 20:16:13 GMT+0000 (UTC)

arXiv
