Concept Graph & Summary using Claude 3 Opus | Chat GPT4o | Llama 3:
Summary:
1.- Optical flow estimates 2D motion vectors for every pixel between frames.
2.- The FlowNet2 CNN performs close to the state of the art while running much faster.
3.- Ideal algorithm would outperform state-of-the-art while being fast.
4.- Performance correlates with model size for published CNN optical flow models.
5.- PWC-Net is compact but outperforms state-of-the-art by leveraging domain knowledge.
6.- Brightness constancy: a pixel retains its brightness as its position changes over time.
7.- Exhaustive patch comparison between frames via normalized cross-correlation reveals true motion.
8.- Cost volume stores patch similarity for all motion vectors per pixel.
9.- Correlation has some invariance to color changes.
10.- Cost volumes are used for stereo (1D search) but not for flow (2D search) because of the computational cost.
11.- Aperture problem: local patches are ambiguous, so patch size must be selected carefully.
12.- PWC-Net constructs cost volumes at multiple resolutions using feature pyramids.
13.- PWC-Net always uses a small search range when constructing the cost volume.
14.- Feature pyramid has large receptive field at smallest resolution (16x8).
15.- Cost volume constructed by correlating features, not raw pixels.
16.- PWC-Net concatenates the cost volume with the features and uses a CNN to estimate flow.
17.- The estimated flow is upsampled and rescaled for the next pyramid level.
18.- Warping aligns the second image to the first using the upsampled flow.
19.- After warping, the remaining motion between the first image and the warped second image is smaller.
20.- Warps features, not raw images, to propagate information through pyramid.
21.- Constructs cost volume at each pyramid level using small search range.
22.- Feature pyramid, feature warping, cost volume at each level all contribute significantly.
23.- Compact model performs competitively with state-of-the-art.
24.- Data augmentation (no Gaussian noise, horizontal flipping) is critical for small datasets.
25.- PWC-Net won the optical flow track of the Robust Vision Challenge.
26.- Code available on GitHub.
27.- TVNet converts the classic TV-L1 optimization into a CNN.
28.- PWC-Net and TVNet share the spirit of encoding domain knowledge into the network.
29.- PWC-Net principles: feature pyramids, feature warping, cost volume with small search range.
30.- Constructing cost volumes is computationally affordable at coarse resolutions.
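Items 8, 13, 15 and 21 describe a cost volume built by correlating features over a small search range. A minimal sketch of that idea follows, in dependency-free Python on a toy 3x3 feature map. The function names `cost_volume` and `best_displacement`, the `max_disp` parameter, the zero score for out-of-bounds candidates, and the use of a plain mean dot product as the similarity are assumptions made for illustration, not the released PWC-Net code.

```python
def cost_volume(feat1, feat2, max_disp=1):
    """Correlate two feature maps over a small search window.

    feat1, feat2: H x W grids whose cells are feature vectors (lists of floats).
    Returns an H x W grid; each cell is a dict mapping a candidate
    displacement (dy, dx) to the mean dot product of the feature vectors.
    Candidates that fall outside feat2 score 0.0 (an assumed padding rule).
    """
    h, w = len(feat1), len(feat1[0])
    channels = len(feat1[0][0])
    volume = []
    for y in range(h):
        row = []
        for x in range(w):
            costs = {}
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < h and 0 <= x2 < w:
                        dot = sum(a * b for a, b in zip(feat1[y][x], feat2[y2][x2]))
                        costs[(dy, dx)] = dot / channels
                    else:
                        costs[(dy, dx)] = 0.0
            row.append(costs)
        volume.append(row)
    return volume


def best_displacement(costs):
    """Return the candidate displacement with the highest similarity."""
    return max(costs, key=costs.get)


# Toy example: a distinctive feature moves one pixel to the right.
background, marker = [0.0, 1.0], [1.0, 0.0]
feat1 = [[background] * 3 for _ in range(3)]
feat2 = [[background] * 3 for _ in range(3)]
feat1[1][1] = marker
feat2[1][2] = marker

volume = cost_volume(feat1, feat2, max_disp=1)
print(best_displacement(volume[1][1]))  # (0, 1): dy = 0, dx = +1
```

With `max_disp=1` each pixel stores only 9 candidate scores, which illustrates why a small search range keeps the cost volume affordable (item 30). In this sketch the storage grows quadratically with the search range.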
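Items 17 to 20 describe passing flow from a coarse pyramid level to the next finer one: the flow is upsampled, its vectors are rescaled, and the second image's features are warped toward the first. The sketch below shows these two steps in dependency-free Python. The names `upsample_flow` and `warp` are hypothetical, and the nearest-neighbour upsampling, rounded integer sampling, and zero fill value are simplifying assumptions. A real implementation would more likely interpolate bilinearly.

```python
def upsample_flow(flow):
    """Double the resolution of a flow field and double its vectors.

    flow: H x W grid of (dx, dy) tuples, in pixels at the coarse level.
    Moving to a level with twice the resolution makes the same motion
    span twice as many pixels, hence the factor of 2.
    """
    fine = []
    for row in flow:
        fine_row = []
        for dx, dy in row:
            fine_row.extend([(2 * dx, 2 * dy)] * 2)
        fine.append(fine_row)
        fine.append(list(fine_row))
    return fine


def warp(feat2, flow, fill=0.0):
    """Backward-warp feat2 toward the first image.

    out[y][x] = feat2[y + dy][x + dx], with the flow rounded to whole
    pixels; samples that fall outside feat2 take the value `fill`.
    """
    h, w = len(feat2), len(feat2[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            x2, y2 = x + round(dx), y + round(dy)
            row.append(feat2[y2][x2] if 0 <= y2 < h and 0 <= x2 < w else fill)
        out.append(row)
    return out


# Coarse level (1 x 2): every pixel moves half a pixel to the right.
coarse_flow = [[(0.5, 0.0), (0.5, 0.0)]]
fine_flow = upsample_flow(coarse_flow)  # 2 x 4 grid of (1.0, 0.0)

# Scalar "features" of the second image at the finer level (2 x 4).
feat2 = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
warped = warp(feat2, fine_flow)
print(warped)  # [[1, 2, 3, 0.0], [5, 6, 7, 0.0]]
```

After warping, the content of `feat2` lines up with where it would sit in the first image, so only a small residual motion is left to estimate at this level (item 19).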
Knowledge Vault built by David Vivancos 2024