Generative Models Review about 3D Reconstruction

Paper Model Input Parameter/Pnum GPU
DiT-3D Diffusion Transformers Voxelized PC
PointFlow AE flow-based PointCloud 1.61M
FlowGAN GAN flow-based Single Image N = 2500 A40 45GB
BuilDiff Diffusion models Single Image 1024 to 4096 A40 45GB
CCD-3DR CDPM Single Image 8192 3090Ti 24GB
SG-GAN SG-GAN Single Image
HaP Diffusion+SMPL+DepthEstimation Single Image 10000 4x3090Ti

Review

Generative approach(Img2PC)

Network Framework

Image.png|555

Flow-based:
image.png|333
image.png|666

Loss

点云倒角距离 CD ↓
$\begin{aligned}\mathcal{L}_{CD}&=\sum_{y’\in Y’}min_{y\in Y}||y’-y||_2^2+\sum_{y\in Y}min_{y’\in Y’}||y-y’||_2^2,\end{aligned}$

推土距离 EMD (Earth Mover’s distance)↓
$\mathcal{L}_{EMD}=min_{\phi:Y\rightarrow Y^{\prime}}\sum_{x\in Y}||x-\phi(x)||_{2}$ , φ indicates a parameter of bijection.

Diffusion Models

DMV3D

DMV3D: Denoising Multi-View Diffusion Using 3D Large Reconstruction Mode (justimyhxu.github.io)

Diffusion Model + Triplane NeRF + Multi-view Image input
image.png|666

DiffuStereo

DiffuStereo Project Page (liuyebin.com)

多视图
DoubleField 粗网格估计 + Diffusion 生成高质量 Disparity Flow 和 Depth + ICP 配准(点云融合)

image.png|666

Human as Points(HaP)

yztang4/HaP (github.com)
Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images(arxiv.org)
Human as Points—— Explicit Point-based 3D Human Reconstruction from Single-view RGB Images.pdf (readpaper.com)

深度估计+SMPL 估计得到两个稀疏点云,输入进 Diffusion Model 进行精细化生成

image.png|666

HumanNorm

HumanNorm

image.png|666

BuilDiff

预计 2023.11 release
BuilDiff论文阅读笔记
weiyao1996/BuilDiff: BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models (github.com)
BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models (readpaper.com)

image.png|666

RODIN

RODIN Diffusion (microsoft.com)
Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion (readpaper.com)

微软大数据集 + Diffusion + NeRF Tri-plane
image.png|666

Single-Image 3D Human Digitization with Shape-Guided Diffusion

Single-Image 3D Human Digitization with Shape-Guided Diffusion

利用针对一般图像合成任务预先训练的高容量二维扩散模型作为穿着人类的外观先验
通过以轮廓和表面法线为条件的形状引导扩散来修复缺失区域
image.png|666

CCD-3DR

CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction (readpaper.com)

image.png|666

$PC^2$

PC2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction (lukemelas.github.io)
$PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction (readpaper.com)

相机位姿???
image.png|666

DiT-3D

DiT-3D论文阅读笔记
DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation
DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation (readpaper.com)

image.png|666

Make-It-3D

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

image.png|666

DreamGaussian

DreamGaussian

Gaussian Splatting + Diffusion
image.png|666

Wonder3D

Wonder3D: Single Image to 3D using Cross-Domain Diffusion (xxlong.site)

Diffusion 一致性出图 + Geometry Fusion (novel geometric-aware optimization scheme)

image.png|666

CLIP

openai/CLIP: CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image (github.com)

对比语言-图片预训练模型
image.png|666

GenNeRF

Generative Neural Fields by Mixtures of Neural Implicit Functions (arxiv.org)

image.png|666

LDM3D-VR

LDM3D-VR: Latent Diffusion Model for 3D VR (arxiv.org)
视频演示T.LY URL Shortener

从给定的文本提示生成图像和深度图数据,此外开发了一个 DepthFusion 的应用程序,它使用生成的 RGB 图像和深度图来使用 TouchDesigner 创建身临其境的交互式 360°视图体验
image.png|666

Control3D

Control3D: Towards Controllable Text-to-3D Generation

草图+文本条件生成 3D

image.png|666

Etc

Colored PC 3D Colored Shape Reconstruction from a Single RGB Image through Diffusion (readpaper.com)

GAN

3DHumanGAN

3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping

多视图一致的人体照片生成

image.png|666

SE-MD

SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images. (readpaper.com)
单编码器—>多解码器
每个解码器生成某些固定视点,然后融合所有视点来生成密集的点云

image.png|666

image.png|666

FlowGAN

FlowGAN论文阅读笔记
Flow-based GAN for 3D Point Cloud Generation from a Single Image (mpg.de)
Flow-based GAN for 3D Point Cloud Generation from a Single Image (readpaper.com)

image.png|555

SG-GAN

SG-GAN论文阅读笔记
SG-GAN: Fine Stereoscopic-Aware Generation for 3D Brain Point Cloud Up-sampling from a Single Image (readpaper.com)

image.png|666

3D Brain Reconstruction and Complete

3D Brain Reconstruction by Hierarchical Shape-Perception Network from a Single Incomplete Image. (readpaper.com)

image.png|666

Deep3DSketch+

Deep3DSketch+: Rapid 3D Modeling from Single Free-Hand Sketches (readpaper.com)

image.png|666

Reality3DSketch

[2310.18148] Reality3DSketch: Rapid 3D Modeling of Objects from Single Freehand Sketches (arxiv.Org)

image.png|666

Deep3DSketch+\+

[2310.18178] Deep3DSketch++: High-Fidelity 3D Modeling from Single Free-hand Sketches (arxiv.Org)

image.png|666

GET3D

nv-tlabs/GET3D (github.com)
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images (nv-tlabs.github.io)

image.png|666

AG3D

AG3D: Learning to Generate 3D Avatars from 2D Image Collections (zj-dong.github.io)

image.png|666

NFs (Normalizing Flows)

PointFlow

stevenygd/PointFlow: PointFlow : 3D Point Cloud Generation with Continuous Normalizing Flows (github.com)
PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows (readpaper.com)

image.png|666

Other

Automatic Reverse Engineering

Automatic Reverse Engineering: Creating computer-aided design (CAD) models from multi-view images (readpaper.com)

多视图图像生成 CAD 命令序列
局限性:

  • CAD 序列的长度仍然局限于 60 个命令,因此只支持相对简单的对象
  • 表示仅限于平面和圆柱表面,而许多现实世界的对象可能包括更灵活的三角形网格或样条表示

image.png|666

SimIPU

zhyever/SimIPU: [AAAI 2021] Official Implementation of “SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations” (github. Com)

雷达点云+图片
image.png|666

3DRIMR

3DRIMR: 3D Reconstruction and Imaging via mmWave Radar based on Deep Learning. (readpaper.com)
MmWave Radar + GAN
image.png|666

TransHuman

TransHuman论文阅读笔记
TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering (pansanity666.github.io)

ImplicitFunction(NeRF)
image.png|666

Nvdiffrec

网格优化比 mlp 优化难,速度慢 from NeRF wechat

  • 还有一个 nvdiffrcmc,效果可能好一些 Shape, Light, and Material Decomposition from Images using Monte Carlo Rendering and Denoising
  • 后续还有个 NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support,应该比 nvdiffrec 要好

NVlabs/nvdiffrec: Official code for the CVPR 2022 (oral) paper “Extracting Triangular 3D Models, Materials, and Lighting From Images”. (github.com)

image.png|666

HyperHuman

高质量人类图片
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion (snap-research.github.io)

image.png|666

SyncDreamer

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image (liuyuan-pal.github.io)

多视图一致的图片生成
image.png|666

One-2-3-45

One-2-3-45

MVS+NeRF
image.png|666

One-2-3-45++

One-2-3-45++ (sudo-ai-3d.github.io)

==提供了一个可以图片生成3D 资产的 demo== sudoAI

一致性图像生成 + CLIP + 3D Diffusion Model

image.png|666

Zero-1-to-3

Zero-1-to-3: Zero-shot One Image to 3D Object (columbia.edu)

多视图一致 Diffusion Model + NeRF

image.png|666

Pix2pix3D

pix2pix3D: 3D-aware Conditional Image Synthesis (cmu.edu)

一致性图像生成+NeRF

image.png|666

Consistent4D

Consistent4D (consistent4d.github.io)

单目视频生成 4D 动态物体,Diffusion Model 生成多视图(时空)一致性的图像
image.png|666

ConRad

ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image
ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image (readpaper.com)

多视图一致
image.png|666

LRM

LRM: Large Reconstruction Model for Single Image to 3D (scalei3d.github.io)

大模型 Transformer(5 亿个可学习参数) + 5s 单视图生成 3D

image.png|666

Instant3D

Instant3D: Instant Text-to-3D Generation (ming1993li.github.io)

文本生成 3D
image.png|666

LucidDreamer

EnVision-Research/LucidDreamer: Official implementation of “LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching” (github.com)

image.png|666

HumanGaussian

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting (alvinliu0.github.io)

Gaussian Splatting 文本生成 3DHuman

image.png|666

Surf-D

[2311.17050] Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models (arxiv.Org)

高质量拓扑 3D 衣服生成

image.png|666

HumanRef

[2311.16961] HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion (arxiv.Org)

单视图 Diffusion 人体生成
image.png|666

RichDreamer

RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D (lingtengqiu.github.io)

文本生成 3D
image.png|666

TriplaneGaussian

Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers (zouzx.github.io)

单视图生成3D

image.png|666

Mosaic-SDF

Mosaic-SDF (lioryariv.github.io)

一种新的三维模型的表示方法:M-SDF,基于Flow的生成式方法
image.png|666

PI3D

[2312.09069] PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion (arxiv.org)

image.png|666

Welcome to my other publishing channels