Concept Graph & Resume using Claude 3 Opus | ChatGPT-4 | Gemini Adv | Llama 3:
Resume:
1.-The goal is to understand computational mechanisms needed for general-purpose intelligent robots that can handle variability in environments and tasks.
2.-Robot policies can be represented as a program π that maps the action/observation history to the next action, optimized over a distribution of domains (see the formal sketch after this list).
3.-Simple domains allow for simple policies, while complex/uncertain domains require more general/adaptive policies. Finding the optimal policy is technically challenging.
4.-Policies can be represented in various ways, e.g. as raw policies, value functions, planners with transition models, or hierarchical abstractions (see the representation sketch after this list).
5.-Robots can be engineered by hand for narrow, known tasks; broader, uncertain task distributions require learning and adaptation.
6.-Classical engineering works for known, narrow tasks. RL in simulation can compile a simulator into a policy for moderately complex tasks.
7.-Online planning handles longer horizons by re-planning at every step (e.g. AlphaZero); hierarchical planning enables very complex tasks (see the re-planning loop sketch after this list).
8.-Various components can enable complex robot behavior: perception, planning, hierarchical execution, and low-level control. The HPN (Hierarchical Planning in the Now) system demonstrates this without learning.
9.-Experience is very costly when learning online in the real world. Priors/biases are needed to learn efficiently from limited samples.
10.-One approach is to expand a system like HPN by learning new skills, perceptual capabilities, and transition models.
11.-Learning pre-image models allows integrating a new primitive skill (e.g. pouring) as an operator in a task-and-motion planner (see the operator sketch after this list).
12.-Gaussian process regression enables learning, from few samples, the "mode" (region of parameters) in which a skill like pouring will succeed.
13.-Learning the full level set of successful parameters adds flexibility, e.g. pouring from a different grasp when the nominal one is infeasible (see the GP level-set sketch after this list).
14.-Compiling learned operators into partial policies can balance the flexibility of planning with the efficiency of reactive skills.
15.-Lifted partial policies enable generalization, e.g. learning an abstract-level policy for putting any object into any box (see the lifted-rule sketch after this list).
16.-Learning search control knowledge, e.g. with graph neural networks over relational features, can guide planning in large combinatorial spaces (see the scoring sketch after this list).
17.-Generalization across meaningfully different tasks is possible with the right state abstraction, e.g. clearing access to an object.
18.-Human insight is still needed to provide useful algorithmic and structural biases for robot learning systems, especially in complex domains.
19.-Key biases include algorithms like planning, architectures like hierarchies, state abstractions like objects, and learning structures like convolutions.
20.-Over the past decades there has been major progress in ML/RL, but autonomous robots still require additional structure to learn efficiently in real, costly environments.
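
Formal sketch for point 2. The notation below is an assumption of this summary, not the speaker's: a policy π maps the action/observation history to the next action, and the objective is its expected value over the distribution of domains the robot may face.

```latex
\pi : (o_1, a_1, o_2, a_2, \ldots, o_t) \;\mapsto\; a_t,
\qquad
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}_{W \sim \mathcal{D}}\!\left[ V_{W}(\pi) \right]
```

Here W is a domain drawn from the distribution D, and V_W(π) is the value of executing π in W.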
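Representation sketch for point 4. It illustrates that the same behavior can be stored directly, induced from a value function, or computed on demand by a planner with a transition model; the class and function names are illustrative, not from the talk.

```python
from typing import Any, Callable

State, Action = Any, Any

class RawPolicy:
    """Directly stored mapping from (abstracted) state to action."""
    def __init__(self, table: dict):
        self.table = table
    def act(self, state: State) -> Action:
        return self.table[state]

class GreedyValuePolicy:
    """Policy induced from a learned action-value function Q(s, a)."""
    def __init__(self, q: Callable[[State, Action], float], actions: list):
        self.q, self.actions = q, actions
    def act(self, state: State) -> Action:
        return max(self.actions, key=lambda a: self.q(state, a))

class PlannerPolicy:
    """Policy computed on demand by planning with a transition model."""
    def __init__(self, plan: Callable[[State], list]):
        self.plan = plan
    def act(self, state: State) -> Action:
        return self.plan(state)[0]   # execute the first action of a freshly built plan
```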
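Re-planning loop sketch for point 7. It shows receding-horizon control: plan a short lookahead, execute only the first action, observe, and plan again. The `env`, `plan`, and `goal_test` interfaces and the horizon value are placeholders assumed for illustration.

```python
def replanning_control(env, plan, goal_test, horizon=10, max_steps=200):
    """Receding-horizon control: re-plan from the current state at every step."""
    state = env.reset()
    for _ in range(max_steps):
        if goal_test(state):
            return True                             # goal reached
        actions = plan(state, goal_test, horizon)   # e.g. search / MCTS to depth `horizon`
        if not actions:
            return False                            # planner found nothing useful
        state = env.step(actions[0])                # commit only to the first action
    return False
```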
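Operator sketch for point 11. It wraps a newly learned skill (here pouring) as a planning operator whose precondition is a learned pre-image model, i.e. a predictor of the states and parameters from which the skill will succeed. All names, fields, and the placeholder test are assumptions, not the talk's actual interface.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict   # e.g. {"grasp": "side", "fill": 0.0}

@dataclass
class Operator:
    name: str
    preimage: Callable[[State, dict], bool]    # learned: will the skill succeed from here?
    effects: Callable[[State, dict], State]    # predicted effects, used by the planner
    controller: Callable[[State, dict], None]  # the low-level skill itself

def pour_preimage(state: State, params: dict) -> bool:
    # Stand-in for a learned model (e.g. the GP level set in the next sketch).
    return state.get("grasp") == "side" and params.get("tilt_angle", 0.0) > 0.5

pour = Operator(
    name="pour",
    preimage=pour_preimage,
    effects=lambda s, p: {**s, "fill": 1.0},
    controller=lambda s, p: None,              # stub: execute the real controller here
)
```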
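GP level-set sketch for points 12-13. It fits a Gaussian process to a handful of pouring trials (parameters mapped to a success score) and extracts the set of parameters whose pessimistic prediction clears a threshold. The data, parameter names, kernel settings, and threshold are made-up stand-ins.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# A few observed trials: [tilt_angle, relative_height] -> success score in [0, 1]
X = np.array([[0.2, 0.05], [0.6, 0.10], [0.9, 0.10], [0.6, 0.25], [1.2, 0.15]])
y = np.array([0.0, 0.9, 0.7, 0.2, 0.1])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-2)
gp.fit(X, y)

# Level set: all candidate parameters whose pessimistic prediction
# (mean minus one standard deviation) exceeds a success threshold.
grid = np.array([[a, h] for a in np.linspace(0.0, 1.5, 40)
                        for h in np.linspace(0.0, 0.3, 20)])
mean, std = gp.predict(grid, return_std=True)
feasible = grid[mean - std > 0.5]
print(f"{len(feasible)} parameter settings predicted to succeed")
```

Keeping the whole feasible set, rather than a single best point, is what allows falling back to an alternative grasp when the nominal parameters are infeasible.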
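Lifted-rule sketch for point 15. A lifted rule is written over variables (?obj, ?box) rather than specific objects, so one partial policy covers any grounding; the predicate and action names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class LiftedRule:
    condition: str      # abstract condition over variables
    action: str         # abstract action over the same variables

PUT_AWAY_POLICY = [
    LiftedRule("Holding(?obj) and Clear(?box)", "Place(?obj, ?box)"),
    LiftedRule("not Holding(?obj) and Reachable(?obj)", "Pick(?obj)"),
    LiftedRule("not Reachable(?obj)", "ClearPathTo(?obj)"),
]

def ground(rule: LiftedRule, bindings: dict) -> LiftedRule:
    """Instantiate a lifted rule for particular objects, e.g. ?obj -> mug_1."""
    cond, act = rule.condition, rule.action
    for var, obj in bindings.items():
        cond, act = cond.replace(var, obj), act.replace(var, obj)
    return LiftedRule(cond, act)

print(ground(PUT_AWAY_POLICY[0], {"?obj": "mug_1", "?box": "bin_A"}))
```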
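Scoring sketch for point 16. It scores candidate planning states with one round of message passing over an object-relation graph and expands the highest-scoring successor first. The weights are random stand-ins for a trained network, and the feature sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W_msg = rng.normal(size=(8, 8))    # message weights (would be learned)
w_out = rng.normal(size=8)         # readout weights (would be learned)

def score_state(node_feats: np.ndarray, adjacency: np.ndarray) -> float:
    """node_feats: (n_objects, 8) relational features; adjacency: (n_objects, n_objects)."""
    messages = adjacency @ node_feats @ W_msg   # aggregate neighbor features
    h = np.tanh(node_feats + messages)          # one message-passing step
    return float(h.mean(axis=0) @ w_out)        # graph-level heuristic score

def best_successor(successors):
    """Pick the successor (features, adjacency, state) the network ranks highest."""
    return max(successors, key=lambda s: score_state(s[0], s[1]))
```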
Knowledge Vault built by David Vivancos 2024