DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

Jeong Joon Park; Peter Florence; Julian Straub; Richard Newcombe; Steven Lovegrove

**Concept Graph & Resume using Claude 3 Opus | ChatGPT-4o | Llama 3:**

```mermaid
graph LR
classDef deepsdf fill:#f9d4d4, font-weight:bold, font-size:14px
classDef representation fill:#d4f9d4, font-weight:bold, font-size:14px
classDef performance fill:#d4d4f9, font-weight:bold, font-size:14px
classDef applications fill:#f9f9d4, font-weight:bold, font-size:14px
classDef related fill:#f9d4f9, font-weight:bold, font-size:14px
A[DeepSDF: Learning Continuous<br>Signed Distance Functions<br>for Shape Representation] --> B[DeepSDF: regresses SDFs<br>with NNs. 1]
A --> C[Deconvolutional nets: grow<br>for voxels. 2]
A --> D[Point clouds: compact,<br>no surfaces. 3]
A --> E[Triangle meshes: unknown<br>vertices, topologies. 4]
A --> F[SDF: volumetric field,<br>distance to surface. 5]
B --> G[FC NN: XYZ in,<br>SDF out. 6]
B --> H[Latent code Z:<br>encodes shape. 7]
B --> I[Autodecoder: learns space<br>sans encoder. 8]
I --> J[Training: random code<br>per shape, SDFs. 9]
I --> K[Latent space: obtained<br>post-training. 10]
B --> L[Inference: optimal code<br>via gradient descent. 11]
L --> M[Arbitrary SDF samples:<br>e.g. depth map. 12]
L --> N[Visualization: code matches<br>depth observation. 13]
B --> O[Rendering: ray casting,<br>gradient normals. 14]
B --> P[Marching cubes: mesh<br>from SDF. 15]
B --> Q[Performance: outperforms voxels,<br>meshes. 16]
Q --> R[Network size: 100x<br>smaller than octrees. 17]
Q --> S[Expressive power: exceeds<br>mesh SOTAs. 18]
B --> T[Shape completion: optimal<br>from depth map. 19]
A --> U[Related CVPR works:<br>for reference. 20]
class A,B deepsdf
class C,D,E,F representation
class G,H,I,J,K,L,M,N,O,P,Q,R,S performance
class T applications
class U related
```


**Resume:**

**1.-** DeepSDF: Directly regresses continuous signed distance functions using neural networks for efficient and expressive shape representation.

**2.-** Deconvolutional networks: Commonly used for image-based approaches but grow quickly in space and time when applied to voxels.

**3.-** Point clouds: More compact representations than voxels but do not describe surfaces.

**4.-** Triangle meshes: Have an unknown number of vertices and varying topology across shapes.

**5.-** Signed distance function (SDF): Volumetric field where magnitude is distance to closest surface and sign indicates inside/outside. Shape is zero level set.
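Point 5 can be made concrete with the analytic SDF of a sphere (the radius and sample points below are illustrative, not from the paper):

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere centered at the origin:
    negative inside, zero on the surface, positive outside."""
    points = np.atleast_2d(points)               # (N, 3)
    return np.linalg.norm(points, axis=1) - radius

# The sign encodes inside/outside; the zero level set is the surface.
print(sphere_sdf([[0.0, 0.0, 0.0]]))   # inside  -> [-1.]
print(sphere_sdf([[1.0, 0.0, 0.0]]))   # surface -> [0.]
print(sphere_sdf([[2.0, 0.0, 0.0]]))   # outside -> [1.]
```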

**6.-** Fully connected neural network: Takes XYZ coordinate as input and outputs predicted SDF value.

**7.-** Latent code (Z): Encodes shape information interpreted by decoder network. Conditioned on Z to model dataset of shapes.
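Points 6 and 7 together: the decoder maps a latent code Z concatenated with an XYZ query to a single SDF value. A forward-only numpy sketch follows; the layer widths, code dimension, and random weights are illustrative assumptions (the paper's decoder is a much deeper fully connected network with trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)
CODE_DIM, HIDDEN = 8, 32

# Randomly initialized decoder weights (stand-ins for trained ones).
W1 = rng.standard_normal((CODE_DIM + 3, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, 1)) * 0.1
b2 = np.zeros(1)

def decode(z, xyz):
    """f_theta(z, x): latent code + query point -> predicted SDF value."""
    h = np.maximum(np.concatenate([z, xyz]) @ W1 + b1, 0.0)  # ReLU layer
    return (h @ W2 + b2)[0]

z = rng.standard_normal(CODE_DIM)               # one shape's latent code
sdf_pred = decode(z, np.array([0.1, 0.2, 0.3]))
print(float(sdf_pred))                          # a single scalar SDF prediction
```

Conditioning on Z is what lets one network represent a whole dataset: each shape gets its own code, while the weights are shared.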

**8.-** Autodecoder: Learning scheme to obtain meaningful latent space without encoder. Codes and decoder weights jointly optimized.

**9.-** Training: Random code initialized per shape, attached to XYZ input. Optimized with decoder weights given ground truth SDFs.
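Point 9 as a toy autodecoder loop: per-shape codes and shared decoder weights are updated jointly by gradient descent on an SDF regression loss. To keep the hand-written gradients short, this sketch replaces the MLP with a single linear layer and uses synthetic sphere SDFs — a hedged illustration of the scheme, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
N_SHAPES, CODE_DIM, N_PTS = 3, 4, 64

# Synthetic ground truth: SDF samples of spheres with different radii.
radii = np.array([0.5, 1.0, 1.5])
pts = rng.uniform(-2.0, 2.0, size=(N_SHAPES, N_PTS, 3))
sdf_gt = np.linalg.norm(pts, axis=2) - radii[:, None]

codes = 0.01 * rng.standard_normal((N_SHAPES, CODE_DIM))  # one random code per shape
w = 0.01 * rng.standard_normal(CODE_DIM + 3)              # shared "decoder" weights

def features(z, x):
    """Attach a shape's code to each of its XYZ samples."""
    return np.concatenate([np.broadcast_to(z, (len(x), CODE_DIM)), x], axis=1)

lr, losses = 1e-2, []
for step in range(200):
    total = 0.0
    for i in range(N_SHAPES):
        feat = features(codes[i], pts[i])
        err = feat @ w - sdf_gt[i]               # per-sample SDF residual
        total += np.mean(err ** 2)
        # Joint gradient steps: shared weights AND this shape's code.
        w -= lr * 2.0 * err @ feat / N_PTS
        codes[i] -= lr * 2.0 * np.mean(err) * w[:CODE_DIM]
    losses.append(total / N_SHAPES)

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")  # falls as both are optimized
```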

**10.-** Latent space of shapes: Obtained after training autodecoder.

**11.-** Inference: Optimal code found via gradient descent to best explain input shape. Decoder weights frozen.
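Point 11 in miniature: at inference the decoder is frozen and gradient descent runs over the code alone. The linear "decoder" below is again an illustrative stand-in for the trained network:

```python
import numpy as np

rng = np.random.default_rng(2)
CODE_DIM = 4

# Frozen "decoder": a fixed linear map over [code, xyz]; at inference
# these weights are never updated.
w = rng.standard_normal(CODE_DIM + 3)

# Observed SDF samples, generated here by an unknown target code.
z_target = rng.standard_normal(CODE_DIM)
pts = rng.uniform(-1.0, 1.0, size=(128, 3))
sdf_obs = z_target @ w[:CODE_DIM] + pts @ w[CODE_DIM:]

# Inference: gradient descent on the code z only.
z = np.zeros(CODE_DIM)
for _ in range(500):
    err = z @ w[:CODE_DIM] + pts @ w[CODE_DIM:] - sdf_obs
    z -= 0.01 * 2.0 * np.mean(err) * w[:CODE_DIM]

recon_err = abs(float((z - z_target) @ w[:CODE_DIM]))
print(recon_err)   # far smaller than the initial residual at z = 0
```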

**12.-** Arbitrary SDF samples: Autodecoder allows inference on any number of samples, e.g. from single depth map.

**13.-** Visualization of inference: Optimization finds best code matching depth map observation.

**14.-** Rendering: Ray casting to zero crossing for depth map. Surface normals via backpropagation gradients.
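Point 14 can be sketched with sphere tracing: step along the ray by the SDF value until the zero crossing, then take the normal from the SDF gradient. An analytic sphere stands in for the learned SDF, and finite differences stand in for backpropagation through the network:

```python
import numpy as np

def sdf(p):                        # stand-in for the learned SDF: unit sphere
    return np.linalg.norm(p) - 1.0

def sphere_trace(origin, direction, max_steps=64, eps=1e-6):
    """March along the ray by the SDF value until the zero crossing."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            break
        t += d                     # safe step: cannot overshoot the surface
    return t

def normal(p, h=1e-5):
    """Surface normal = normalized SDF gradient (finite differences here;
    DeepSDF gets the gradient by backpropagation instead)."""
    g = np.array([sdf(p + h * e) - sdf(p - h * e) for e in np.eye(3)]) / (2 * h)
    return g / np.linalg.norm(g)

o, d = np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0])
t_hit = sphere_trace(o, d)
print(t_hit)                       # depth along the ray, ~2.0 for this setup
print(normal(o + t_hit * d))       # ~[0, 0, -1]: normal facing the camera
```

Casting one such ray per pixel yields the rendered depth map.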

**15.-** Marching cubes: Algorithm to extract mesh from SDF.
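Point 15 in practice: sample the SDF on a regular grid and extract the zero level set as a triangle mesh. This sketch assumes scikit-image is available and uses an analytic sphere SDF in place of network evaluations; the grid extent and resolution are illustrative:

```python
import numpy as np
from skimage import measure  # assumes scikit-image is installed

# Sample the (here analytic) SDF of a unit sphere on a regular grid.
xs = np.linspace(-1.5, 1.5, 48)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf_grid = np.sqrt(X**2 + Y**2 + Z**2) - 1.0

# Extract the zero level set as a triangle mesh.
verts, faces, normals, _ = measure.marching_cubes(sdf_grid, level=0.0)
verts = verts * (xs[1] - xs[0]) + xs[0]   # map voxel indices to world coords

print(len(verts), len(faces))             # vertices and triangles of the mesh
```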

**16.-** Shape representation performance: DeepSDF significantly outperforms previous voxel and mesh-based methods on unseen shapes.

**17.-** Network size efficiency: 100x smaller than octree voxel methods while providing higher accuracy and surface normals.

**18.-** Expressive power: Much higher than state-of-the-art mesh-based methods.

**19.-** Shape completion: Finds optimal high-quality shape given input depth map. Outperforms state-of-the-art.

**20.-** Related CVPR works: Mentioned for further reference.

Knowledge Vault built by David Vivancos 2024