The End Of Knowledge - Vault 5/3 - CVPR - 2015 - DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-Time

graph LR
classDef fusion fill:#f9d4d4,font-weight:bold,font-size:14px
classDef reconstruction fill:#d4f9d4,font-weight:bold,font-size:14px
classDef representation fill:#d4d4f9,font-weight:bold,font-size:14px
classDef tracking fill:#f9f9d4,font-weight:bold,font-size:14px
classDef applications fill:#f9d4f9,font-weight:bold,font-size:14px
A[DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-Time] --> B[Real-time non-rigid tracking and reconstruction 1]
A --> C[Affordable commodity depth camera 2]
A --> D[Canonical frame aligned with warp field 3]
D --> E[Per-frame volumetric deformation field 4]
A --> F[Incrementally updated volumetric reconstruction 5]
A --> G[Template-free non-rigid reconstruction 6]
A --> H[Real-time incrementally updated output 7]
A --> I[Efficient volumetric surface representation 8]
I --> J[Truncated signed distance function 9]
I --> K[Surface extracted from zero level set 10]
A --> L[Volumetric 6-DoF motion field 11]
L --> M[Sparse deformation graph interpolation 12]
M --> N[Normalized dual quaternion transforms 13]
A --> O[Non-rigid tracking cost function 14]
O --> P[Dense data term without feature matching 15]
O --> Q[Huber norm for discontinuity handling 16]
A --> R[Generalized range fusion to non-rigid 17]
R --> S[Canonical point projection into live frame 18]
R --> T[Weighted SDF fusion in canonical frame 19]
R --> U[Deformation graph node insertion 20]
U --> V[Node density controlled by epsilon threshold 21]
V --> W[Coarser motion field with fewer nodes 22]
A --> X[Real-time hand modeling application 23]
A --> Y[Handles topology changes during reconstruction 24]
A --> Z[Struggles with unobserved deformations, scaling limitations 25]
A --> AA[Real-time performance via efficient representations, GPU 27]
Z --> AB[Volumetric warp field enables non-rigid reconstruction 26]
Z --> AC[Scaling and compression challenges data association 28]
A --> AD[Future work: loop closure 30]
class A,B,G,H,X,Y,Z,AA,AD applications
class C,F,I,J,K reconstruction
class D,E,L,M,N,AB,AC,W representation
class O,P,Q,R,S,T,U,V tracking

Summary:

1.- DynamicFusion: Real-time reconstruction and tracking of non-rigid scenes using a single depth camera, without requiring a pre-modeled template.

2.- Commodity depth camera: Affordable, widely available depth sensing camera used for real-time 3D reconstruction.

3.- Canonical frame: Fixed reference frame to which non-rigid video frames are aligned using a warp field.

4.- Warp field: Per-frame volumetric field that describes how the observed scene surface and surrounding space deform from the canonical frame.

5.- Volumetric surface reconstruction: 3D reconstruction of the scene that is incrementally updated by undoing motion observed in each depth frame.

6.- Template-free reconstruction: Reconstructing non-rigid scenes without requiring a parameterized template of the objects being tracked.

7.- Real-time output: The system produces an incrementally updated reconstruction of the scene in real-time.

8.- Volumetric signed distance function (SDF): Efficient surface representation for real-time updates, where each voxel stores the signed distance to the nearest surface.

9.- Truncated signed distance function (TSDF): Narrow band of the SDF near the surface, used for efficient storage and computation.

10.- Zero level set: The surface itself, encoded as the zero-crossing of the signed distance function, which can be extracted as a triangle mesh.

11.- Volumetric motion field: Represents scene motion as a 6-DoF rigid body transformation at each point in the canonical space.

12.- Deformation graph: Sparse set of deformation nodes used to interpolate the volumetric motion field, reducing computation and ensuring smoothness.

13.- Normalized dual quaternion: Parameterization of deformation node transforms, enabling efficient blending and reducing artifacts.

14.- Non-rigid tracking cost function: Comprises a data term (minimized when the warped model matches the live frame) and a regularization term (ensures motion field smoothness).

15.- Dense data term: Allows all data in the live frame to be used for optimization, without requiring sparse feature extraction and matching.

16.- Huber norm regularization: Ensures discontinuities can form in the motion field where supported by data, while keeping the field smooth elsewhere.

17.- Range fusion: Technique used in KinectFusion for rigid scenes, generalized to non-rigid scenes in DynamicFusion using the estimated warp field.

18.- Warped point projection: Projecting a canonical point into the live depth map using the estimated warp field to obtain an SDF observation.

19.- Weighted SDF fusion: Fusing observed SDF values in the canonical frame using the estimated warp field, as if updating a tiny rigid volume.

20.- Deformation graph node insertion: Adding new nodes to the deformation graph to accurately represent motion over newly reconstructed surface areas.

21.- Epsilon distance threshold: Determines the density of deformation graph nodes based on their distance from the nearest existing node.

22.- Coarser motion field: Resulting from increasing the epsilon distance threshold, leading to fewer transformation nodes and simpler motion representation.

23.- Hand modeling application: Using DynamicFusion for real-time modeling of small non-rigid objects manipulated by hands.

24.- Topology changes: DynamicFusion can handle changes in scene topology, such as open-to-closed-to-open deformations, during continuous reconstruction.

25.- Limitations: DynamicFusion may struggle with deformations that are not observed in the data, and scaling to larger scenes remains a challenge.

26.- Volumetric warp field estimation: Key to enabling real-time non-rigid reconstruction by generalizing range fusion approaches to non-rigid scenarios.

27.- Real-time performance: Achieved through efficient representations (TSDF, deformation graphs) and parallelizable operations on the GPU.

28.- Scaling and compression: Volumetric motion field with per-point rigid body transforms allows for surface scaling and compression.

29.- Data association challenges: When surface scaling or compression exceeds the original canonical model positions, data association may fail.

30.- Future work: Incorporating explicit loop closure to address scenarios where objects disappear and reappear in the camera view.
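
Items 9 and 19 describe a truncated SDF whose values are fused by a weighted average. The per-voxel update can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `truncate_sdf`, `fuse_voxel`, `tau`, `obs_weight` and `max_weight` are hypothetical, and the warp and projection step of item 18 is assumed to have already produced the observed signed distance `sdf_obs`.

```python
def truncate_sdf(sdf, tau):
    """Clamp a signed distance to the narrow band [-tau, tau]."""
    return max(-tau, min(tau, sdf))


def fuse_voxel(value, weight, sdf_obs, tau, obs_weight=1.0, max_weight=64.0):
    """Fuse one observed signed distance into a voxel by weighted averaging.

    value, weight : current TSDF value and accumulated weight of the voxel
    sdf_obs       : signed distance observed for this voxel in the live frame
                    (assumed to be computed elsewhere via the warp field)
    tau           : truncation distance (hypothetical parameter name)
    Returns the updated (value, weight) pair.
    """
    # Voxels far behind the observed surface carry no reliable information,
    # so they are left unchanged.
    if sdf_obs < -tau:
        return value, weight
    d = truncate_sdf(sdf_obs, tau)
    new_value = (value * weight + d * obs_weight) / (weight + obs_weight)
    # Capping the weight keeps the voxel responsive to later observations.
    new_weight = min(weight + obs_weight, max_weight)
    return new_value, new_weight
```

Starting from an empty voxel `(0.0, 0.0)`, repeated calls to `fuse_voxel` average the truncated observations, while observations deeper than `tau` behind the surface are ignored.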

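Items 20 to 22 relate the density of deformation graph nodes to an epsilon distance threshold. The sketch below illustrates that relationship with a simple greedy rule: a surface point becomes a new node only if it lies at least `epsilon` away from every existing node. The function name `insert_nodes` and the greedy ordering are assumptions made for illustration, and the rule is a simplification rather than the exact procedure used in DynamicFusion.

```python
import math


def insert_nodes(nodes, surface_points, epsilon):
    """Greedily add surface points that are not covered by existing nodes.

    nodes          : list of (x, y, z) positions of existing graph nodes
    surface_points : iterable of (x, y, z) positions on the reconstructed surface
    epsilon        : minimum allowed distance between any two nodes
    Returns a new list holding the old nodes followed by the inserted ones.
    """
    result = list(nodes)
    for p in surface_points:
        # A point is unsupported if no node lies within epsilon of it.
        if all(math.dist(p, n) >= epsilon for n in result):
            result.append(p)
    return result


# Hypothetical surface samples spaced 0.1 apart along the x axis.
points = [(i / 10, 0.0, 0.0) for i in range(10)]
fine = insert_nodes([], points, epsilon=0.25)
coarse = insert_nodes([], points, epsilon=0.45)
```

With the same surface samples, the larger threshold yields fewer nodes (`coarse` is shorter than `fine`), which corresponds to the coarser motion field described in item 22.
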
Knowledge Vault built by David Vivancos 2024