Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse Views

Zi-Xin Zou; Weihao Cheng; Yan-Pei Cao; Shi-Sheng Huang; Ying Shan; Song-Hai Zhang

Reconstructing 3D objects from extremely sparse views is a long-standing and challenging problem. Recent techniques use image diffusion models either to generate plausible images at novel viewpoints or to distill pre-trained diffusion priors into 3D representations via score distillation sampling (SDS). These methods, however, often struggle to achieve high-quality, consistent, and detailed results simultaneously for both novel-view synthesis (NVS) and geometry. In this work, we present Sparse3D, a novel 3D reconstruction method tailored for sparse-view inputs. Our approach distills robust priors from a multiview-consistent diffusion model to refine a neural radiance field. Specifically, we employ a controller that harnesses epipolar features from the input views to guide a pre-trained diffusion model, such as Stable Diffusion, so that it produces novel-view images that remain 3D-consistent with the input. By tapping into 2D priors from powerful image diffusion models, our integrated model consistently delivers high-quality results, even when faced with open-world objects. To address the blurriness introduced by conventional SDS, we introduce category-score distillation sampling (C-SDS) to enhance detail. We conduct experiments on CO3DV2, a multi-view dataset of real-world objects. Both quantitative and qualitative evaluations demonstrate that our approach outperforms previous state-of-the-art works on NVS and geometry reconstruction metrics.
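For readers unfamiliar with conventional SDS, the following is a minimal sketch of its gradient estimate, following the commonly used formulation w(t) * (predicted noise - injected noise). It is not the authors' implementation and does not cover C-SDS, whose details the abstract does not give. The function names, the three-step noise schedule, the weighting, and the point-mass "denoiser" are all assumptions made for illustration; a real system would use a pre-trained diffusion model and backpropagate this gradient through the renderer to the radiance-field parameters.

```python
import numpy as np

def sds_gradient(x, predict_noise, alphas, sigmas, weights, rng, n_samples=32):
    """Monte Carlo estimate of the SDS gradient with respect to a rendered image x.

    The chain-rule step through the renderer (d x / d theta) is omitted here.
    """
    grad = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        t = int(rng.integers(len(alphas)))       # random timestep index
        eps = rng.standard_normal(x.shape)       # injected Gaussian noise
        x_t = alphas[t] * x + sigmas[t] * eps    # noised rendering
        grad += weights[t] * (predict_noise(x_t, t) - eps)
    return grad / n_samples

def make_point_mass_denoiser(target, alphas, sigmas):
    """Hypothetical stand-in for a diffusion model: the optimal noise
    predictor when the data distribution is a single image `target`."""
    def predict_noise(x_t, t):
        return (x_t - alphas[t] * target) / sigmas[t]
    return predict_noise

# Toy variance-preserving schedule (assumed values, for illustration only).
alphas = np.array([0.9, 0.7, 0.5])
sigmas = np.sqrt(1.0 - alphas ** 2)
weights = sigmas ** 2

target = np.array([1.0, -2.0, 0.5])   # the "image" the toy prior prefers
x = np.zeros(3)                        # current rendering
denoiser = make_point_mass_denoiser(target, alphas, sigmas)

# Gradient descent on x with the SDS gradient pulls x toward the target.
rng = np.random.default_rng(0)
for _ in range(200):
    x = x - 0.5 * sds_gradient(x, denoiser, alphas, sigmas, weights, rng)
```

With this toy prior the gradient reduces to a positive multiple of (x - target), so each descent step moves the rendering toward the image the prior prefers.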

updated: Sun Aug 27 2023 11:52:00 GMT+0000 (UTC)

published: Sun Aug 27 2023 11:52:00 GMT+0000 (UTC)

arXiv
