The End Of Knowledge - Vault 5/88 - CVPR - 2023 - Planning-oriented Autonomous Driving

graph LR
classDef unified fill:#f9d4d4, font-weight:bold, font-size:14px
classDef autonomous fill:#d4f9d4, font-weight:bold, font-size:14px
classDef solutions fill:#d4d4f9, font-weight:bold, font-size:14px
classDef uniad fill:#f9f9d4, font-weight:bold, font-size:14px
classDef future fill:#f9d4f9, font-weight:bold, font-size:14px
A[Planning-oriented Autonomous Driving] --> B[UniAD: Full-stack autonomous driving framework 1]
A --> C[Autonomous driving: Weather, scenarios, tasks 2]
A --> D[Typical solutions: Standalone models, errors 3]
A --> E[Multitask frameworks: Shared backbone, uncoordinated 4]
A --> F[End-to-end: Policy from sensors, uninterpretable 5]
A --> G[UniAD: Integrate perception, prediction, planning 6]
G --> H[UniAD tasks: Formers, occupancy, planner 7]
G --> I[Unified query: Connects, coordinates tasks 8]
G --> J[Transformer modules: Complex interactions, attention 9]
H --> K[Track, map formers: End-to-end training 10]
H --> L[Motion former: Diverse relation modeling 11]
H --> M[Occupancy former: Predicts expectations, restricts 12]
H --> N[Planner: Attends features, adjusts path 13]
G --> O[Two-phase training: Stabilizes, shares results 14]
G --> P[Experiments: Validate tasks, benefit planning 15]
G --> Q[Planning: Lowest error, collision rate 16]
G --> R[Interpretability: Visualizations, recovers from errors 17]
G --> S[Unified query: Connects, coordinates tasks 18]
B --> T[Results: State-of-the-art vision-only 19]
A --> U[Future: Data, training, algorithms, systems 20]
A --> V[Foundation model potential: UniAD principles 21]
A --> W[Applications: Robotics, navigation, autonomous tasks 22]
A --> X[UniAD: Step towards foundation model 23]
A --> Y[Paper, poster available for details 24]
A --> Z[Q&A: Submit questions via box 25]
class B,G,H,I,J,K,L,M,N,O,P,Q,R,S,T unified
class C autonomous
class D,E,F solutions
class U,V,W,X,Y,Z future

Summary:

1.- UniAD: Unified full-stack autonomous driving framework coordinating perception, prediction, and planning tasks for safe driving.

2.- Autonomous driving challenges: Various weather, illuminations, and scenarios; tasks include perception, prediction, and planning.

3.- Typical solutions: Standalone models trained independently for each task, leading to accumulated errors.

4.- Multitask frameworks: Shared backbone for multiple tasks, efficient but lacks coordination between task heads.

5.- Vanilla end-to-end solutions: Learn policy directly from sensor inputs, good simulator results but lack interpretability in real-world scenarios.

6.- UniAD's approach: Integrate safety-critical perception and prediction tasks, organize in a hierarchy to maximize information flow to the planner.

7.- UniAD's tasks: Track former, map former, motion former, occupancy former, and planner.

8.- Unified query design: Connects the entire pipeline and coordinates all tasks towards planning.

9.- Transformer-based task modules: Model complex interactions in driving scenes with attention mechanisms.

10.- Track former and map former: Developed from previous research, treat agents and map elements as queries for end-to-end training.

11.- Motion former: Handles diverse relation modeling with attention mechanisms (agent-agent, agent-map, agent-ego relations).

12.- Occupancy former: Predicts future occupancy and restricts attention so that each agent interacts only with its corresponding BEV features.

13.- Planner: Uses ego-vehicle query to attend BEV features, predicts future waypoints, and adjusts path to avoid potential collisions.

14.- Two-phase training: Stabilizes the training process and shares matching results across modules for convergence.

15.- Experiments: Validate the necessity of preceding tasks, showing they benefit each other and final planning.

16.- Planning performance: UniAD achieves the lowest L2 error and collision rate, outperforming LiDAR-based and previous end-to-end methods.

17.- Interpretability: Visualization of intermediate representations exhibits UniAD's interpretability and ability to recover from upstream errors.

18.- Unified query design: Connects and coordinates all tasks in the framework.

19.- Results: UniAD achieves state-of-the-art results on all investigated tasks with vision-only inputs.

20.- Future directions: Data and training strategy, shippable algorithms, and closed-loop systems.

21.- Foundation model for autonomous driving: Potential for a universal foundation model based on UniAD's principles and structures.

22.- Applications: Extending to a broad range of robotics, enabling machines to interact, navigate, and perform tasks autonomously and intelligently.

23.- Conclusion: UniAD is a step towards a foundation model for autonomous driving, opening up new possibilities in robotics.

24.- Additional information: Paper and poster session available for more details.

25.- Q&A: Speakers encourage questions through the Q&A box at the bottom of the screen.
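
Illustration for item 11: a minimal sketch of how one set of agent queries could attend to three different relation types (agent-agent, agent-map, agent-ego). It is not UniAD's implementation. The array sizes, the variable names, and the fusion by plain summation are all assumptions made for illustration.

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention; returns one context vector per query."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values

rng = np.random.default_rng(0)
dim = 8
agent_queries = rng.normal(size=(3, dim))  # 3 tracked agents (hypothetical)
map_queries = rng.normal(size=(5, dim))    # 5 map elements (hypothetical)
ego_query = rng.normal(size=(1, dim))      # the ego vehicle

# One context per relation type, fused by simple summation (an assumption).
agent_ctx = attention(agent_queries, agent_queries, agent_queries)
map_ctx = attention(agent_queries, map_queries, map_queries)
ego_ctx = attention(agent_queries, ego_query, ego_query)
motion_features = agent_queries + agent_ctx + map_ctx + ego_ctx
```

Because there is only one ego key, every agent receives the same ego context. The agent and map contexts differ per agent.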
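
Illustration for item 12: a sketch of attention restricted by a mask, so that a BEV cell only gathers information from the agents assigned to it. The function name `masked_attention`, the toy shapes, and the hand-written `allowed` mask are hypothetical. How UniAD actually derives its mask is not stated in this summary.

```python
import numpy as np

def masked_attention(queries, keys, values, allowed):
    """Attention in which query i may only attend to key j if allowed[i, j]."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    scores = np.where(allowed, scores, -np.inf)
    # Rows with no allowed key get placeholder scores and end up with zero weights.
    has_key = allowed.any(axis=-1, keepdims=True)
    scores = np.where(has_key, scores, 0.0)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores) * allowed
    denom = weights.sum(axis=-1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ values

rng = np.random.default_rng(0)
bev_cells = rng.normal(size=(4, 4))       # 4 BEV cells (hypothetical)
agent_features = rng.normal(size=(2, 4))  # 2 agents (hypothetical)
allowed = np.array([
    [True, False],   # cell 0 belongs to agent 0
    [False, True],   # cell 1 belongs to agent 1
    [False, False],  # cell 2 is empty
    [True, True],    # cell 3 overlaps both agents
])
context = masked_attention(bev_cells, agent_features, agent_features, allowed)
```

Cells tied to a single agent receive exactly that agent's features, and the empty cell receives a zero vector.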
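
Illustration for item 16: a simplified sketch of the two planning metrics named there, L2 error and collision rate. The point obstacles and the fixed `radius` are assumptions chosen for brevity. The benchmark protocol behind the reported numbers may define a collision differently.

```python
import numpy as np

def l2_error(pred, gt):
    """Mean Euclidean distance between matching waypoints; shapes (N, T, 2)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def collision_rate(pred, obstacles, radius):
    """Fraction of trajectories with any waypoint within `radius` of an obstacle."""
    # (N, T, 1, 2) - (1, 1, M, 2) gives pairwise distances of shape (N, T, M).
    diffs = pred[:, :, None, :] - obstacles[None, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return float((dists < radius).any(axis=(1, 2)).mean())

# Two ground-truth trajectories with three waypoints each (toy data).
gt = np.array([
    [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]],
    [[5.0, 1.0], [5.0, 2.0], [5.0, 3.0]],
])
pred = gt + np.array([0.3, 0.4])   # every waypoint is off by 0.5
obstacles = np.array([[0.3, 2.4]])  # sits on the first planned path

error = l2_error(pred, gt)
rate = collision_rate(pred, obstacles, radius=0.5)
```

In this toy data each predicted waypoint is 0.5 away from its target, and one of the two planned trajectories passes through the obstacle.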

Knowledge Vault built by David Vivancos 2024