The End Of Knowledge - Vault 5/45 - CVPR - 2019 - Panoptic Feature Pyramid Networks

Knowledge Vault 5/45 - CVPR 2019

Panoptic Feature Pyramid Networks

Alexander Kirillov; Ross Girshick; Kaiming He; Piotr Dollár

Concept Graph & Resume using Claude 3 Opus | Chat GPT4o | Llama 3:

graph LR
classDef panoptic fill:#f9d4d4, font-weight:bold, font-size:14px
classDef semantic fill:#d4f9d4, font-weight:bold, font-size:14px
classDef instance fill:#d4d4f9, font-weight:bold, font-size:14px
classDef datasets fill:#f9f9d4, font-weight:bold, font-size:14px
classDef architectures fill:#f9d4f9, font-weight:bold, font-size:14px
A[Panoptic Feature Pyramid Networks] --> B[Panoptic: labels pixels, splits instances. 1]
A --> C[Semantic: labels each pixel. 2]
A --> D[Instance: masks things objects. 3]
A --> E[Datasets: annotations, challenges, leaderboards. 4]
A --> F[Independent networks: semantic, instance. 5]
F --> G[Inefficient compute/memory, end-to-end harder. 5]
A --> H[PanopticFPN: unified semantic, instance. 6]
H --> I[FPN: multi-scale feature maps. 7]
H --> J[Mask R-CNN: instance head. 8]
H --> K[Pixel-wise: semantic head. 9]
J --> L[RBR: strong instance performance. 10]
K --> M[PLR: efficient semantic segmentation. 11]
H --> N[Simultaneous instance, semantic segmentation. 12]
A --> O[Datasets: COCO, Cityscapes. 13]
H --> P[Outperforms independent networks. 14]
A --> Q[PanopticFPN: simple, efficient baseline. 15]
class A,B panoptic
class C semantic
class D instance
class E,O datasets
class F,G,H,I,J,K,L,M,N,P,Q architectures

Resume:

1.- Panoptic segmentation: Assigns semantic label to each pixel and splits instances of same class into different segments. Combines semantic and instance segmentation.

2.- Semantic segmentation: Assigns semantic label to each pixel in the image.

3.- Instance segmentation: Delineates objects of "things" classes with masks. Predicted masks used for further analysis.

4.- Panoptic segmentation datasets: Modern datasets provide ground-truth panoptic annotations. Public challenges and leaderboards exist for the task.

5.- Combining two independent networks: Straightforward approach using the best semantic and instance segmentation architectures separately. Inefficient in compute and memory, and harder to train as an end-to-end system.

6.- Panoptic Feature Pyramid Networks (PanopticFPN): Unified architecture producing semantic and instance segmentation simultaneously from single backbone.

7.- Feature Pyramid Network (FPN) backbone: Produces feature maps at different spatial resolutions, used for instance and semantic heads.

8.- Mask R-CNN head for instance segmentation: Strong architecture for instance segmentation, called Region-based Recognition (RBR) head.

9.- Simple pixel-wise head for semantic segmentation: Processes each scale's feature maps independently, upsamples them to a common resolution, sums them, and predicts final per-pixel scores. Called Pixel-level Recognition (PLR) head.

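10.- Competitive performance of both heads: The RBR (Mask R-CNN) head keeps strong instance segmentation performance, while the pixel-wise semantic head on the same FPN backbone performs on par with well-known semantic segmentation methods like DeepLabV3/V3+.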

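11.- Efficient semantic segmentation: Many semantic segmentation networks use dilated convolutions to preserve spatial resolution, which is costly. The PLR head avoids dilation by relying on FPN's multi-scale features, making it computationally and memory efficient.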

12.- Simultaneous instance and semantic segmentation: Unified PanopticFPN architecture with single backbone and two heads.

13.- Datasets evaluated: COCO and Cityscapes.

14.- Comparison to independent networks: With the same compute budget, PanopticFPN outperforms the combination of two separate networks (Mask R-CNN and SemanticFPN), reaching higher panoptic quality (PQ).

15.- Strong panoptic segmentation baseline: PanopticFPN expected to be used as baseline for future panoptic methods due to simplicity and efficiency.
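The fusion step of the pixel-wise semantic head (item 9) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the learned per-level convolutions are omitted, nearest-neighbor upsampling stands in for whatever interpolation the real head uses, and the names `upsample_nearest`, `semantic_head`, and `class_weights` are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Upsample a (C, H, W) feature map by an integer factor (nearest neighbor)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def semantic_head(pyramid, class_weights):
    """Fuse pyramid levels and predict a per-pixel label map.

    pyramid: list of (C, H_i, W_i) arrays, finest level first; each level is
             assumed to be an integer factor coarser than the first one.
    class_weights: (num_classes, C) matrix acting as a 1x1 convolution
                   (random here, learned in a real network).
    """
    target_h = pyramid[0].shape[1]
    fused = np.zeros_like(pyramid[0])
    for level in pyramid:
        factor = target_h // level.shape[1]
        # Bring every level to the finest resolution, then sum element-wise.
        fused += upsample_nearest(level, factor)
    # A 1x1 convolution is a per-pixel linear map over channels.
    scores = np.einsum("kc,chw->khw", class_weights, fused)
    return scores.argmax(axis=0)  # (H, W) map of class indices

# Toy pyramid: 4 levels with 8 channels, from 32x32 down to 4x4.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((8, 32 // 2**i, 32 // 2**i)) for i in range(4)]
weights = rng.standard_normal((5, 8))
labels = semantic_head(pyramid, weights)
```

Because all levels are merged by plain upsampling and summation, the head needs no dilated convolutions on the backbone, which is the efficiency argument made in item 11.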
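Item 14 compares models by panoptic quality (PQ). As a rough sketch, assuming the standard PQ definition used for the panoptic segmentation task (sum of IoUs over matched segments divided by TP + FP/2 + FN/2), the metric for one class can be computed as below. The function name and the example numbers are illustrative only and are not results from the paper.

```python
def panoptic_quality(true_positive_ious, num_false_positives, num_false_negatives):
    """Compute PQ for one class.

    true_positive_ious: IoU of each predicted segment matched to a ground-truth
                        segment (a match is assumed to require IoU > 0.5).
    num_false_positives: predicted segments with no match.
    num_false_negatives: ground-truth segments with no match.
    """
    denominator = (
        len(true_positive_ious)
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        return 0.0  # class absent from both prediction and ground truth
    return sum(true_positive_ious) / denominator

# Illustrative values: two matched segments, one spurious, one missed.
pq = panoptic_quality([0.8, 0.6], num_false_positives=1, num_false_negatives=1)
```

The numerator rewards accurate masks for matched segments, while the denominator penalizes both spurious and missed segments, so one number covers "things" and "stuff" classes alike.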

Knowledge Vault built by David Vivancos 2024