Knowledge Vault 5/26 - CVPR 2017
Annotating Object Instances with a Polygon-RNN
Lluís Castrejón, Kaustav Kundu, Raquel Urtasun, & Sanja Fidler

Concept Graph & Summary using Claude 3 Opus | Chat GPT4o | Llama 3:

graph LR
classDef instanceSegmentation fill:#f9d4d4, font-weight:bold, font-size:14px
classDef polygonRNN fill:#d4f9d4, font-weight:bold, font-size:14px
classDef annotations fill:#d4d4f9, font-weight:bold, font-size:14px
classDef experiments fill:#f9f9d4, font-weight:bold, font-size:14px
A[Annotating Object Instances with a Polygon-RNN] --> B[Instance segmentation: regions, object instances, annotation time-consuming. 1]
B --> C[Polygon-RNN: interactive model, polygons, user modifications. 2]
C --> D[Polygons: natural, sparse representation, easy incorporation. 3]
C --> E[Polygon-RNN: CNN features, convolutional LSTM, vertices. 4]
C --> F[User corrections: vertex selection, model updates. 5]
A --> G[Annotation: bounding box, polygon, corrections, segmentation. 6]
A --> H[Experiments: automatic segmentation, simulated user corrections. 7]
H --> I[Polygon-RNN outperforms baselines on Cityscapes dataset. 8]
H --> J[Simulated corrections: human-level agreement, fewer clicks. 9]
H --> K[KITTI generalization: estimated human agreement, <6 clicks. 10]
A --> L[Polygon-RNN enables cheap annotation, automatic+user interaction. 11]
class A,B instanceSegmentation
class C,D,E,F polygonRNN
class G,L annotations
class H,I,J,K experiments


1.- Instance segmentation involves determining image regions belonging to individual object instances. Manually annotating instances is time-consuming.

2.- Polygon-RNN is an interactive instance segmentation model that generates polygons and accepts user modifications to improve predictions.

3.- Polygons are a natural, sparse representation for annotating instances, allowing easy incorporation of user modifications by adding/deleting/moving vertices.

4.- Polygon-RNN extracts CNN image features at different levels and uses a convolutional LSTM to predict polygon vertices sequentially.

5.- Users can correct predicted vertices by selecting new locations. Corrections are fed into the model to update predictions.

6.- Annotation process: the annotator draws a bounding box, the model generates a polygon, the user corrects vertices if needed, and the model generates a refined segmentation.

7.- Experiments conducted on automatic instance segmentation (no user interaction) and annotation with simulated user corrections.

8.- Polygon-RNN outperforms the DeepMask and SharpMask baselines for automatic instance segmentation on the Cityscapes dataset.

9.- With simulated user corrections, Polygon-RNN reaches human-level annotation agreement while requiring about 5x fewer clicks than fully manual annotation.

10.- Polygon-RNN generalizes to the KITTI dataset without fine-tuning, reaching estimated human agreement with <6 clicks per instance on average.

11.- Polygon-RNN enables cheap annotation of new instance segmentation datasets by combining automatic prediction with easy user interaction.
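
The correction loop described in points 5, 6 and 9 can be illustrated with a small sketch. This is not the authors' implementation: `predict_next` is a hypothetical stand-in for the model's next-vertex prediction, and the distance metric, the `threshold` parameter, and the toy predictor are all assumptions chosen for illustration. The sketch only shows the idea that a simulated annotator replaces a predicted vertex when it is too far from the ground truth, that each replacement counts as one click, and that later predictions are conditioned on the corrected vertices.

```python
from typing import Callable, List, Sequence, Tuple

Vertex = Tuple[int, int]


def chebyshev(a: Vertex, b: Vertex) -> int:
    """Grid distance between two vertices (assumed metric for this sketch)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def annotate_with_corrections(
    predict_next: Callable[[Sequence[Vertex]], Vertex],
    gt_vertices: Sequence[Vertex],
    threshold: int,
) -> Tuple[List[Vertex], int]:
    """Simulate an annotator correcting a sequential polygon predictor.

    predict_next is a hypothetical callable: given the vertices accepted so
    far, it returns the next predicted vertex. A vertex is corrected (one
    click) when it deviates from the ground truth by more than `threshold`.
    """
    accepted: List[Vertex] = []
    clicks = 0
    for gt in gt_vertices:
        pred = predict_next(accepted)
        if chebyshev(pred, gt) > threshold:
            pred = gt      # simulated annotator moves the vertex
            clicks += 1
        accepted.append(pred)  # later predictions see the corrected vertex
    return accepted, clicks


def toy_predictor(prefix: Sequence[Vertex]) -> Vertex:
    """Toy model (assumption): always steps 10 pixels to the right."""
    if not prefix:
        return (0, 0)
    x, y = prefix[-1]
    return (x + 10, y)


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
polygon, clicks = annotate_with_corrections(toy_predictor, square, threshold=2)
print(polygon, clicks)
```

With the toy predictor, the first two vertices are accepted as predicted and the last two are corrected, so the loop ends with the ground-truth square after two simulated clicks, compared with four clicks for drawing it by hand.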

Knowledge Vault built by David Vivancos 2024