Knowledge Vault 6/43 - ICML 2019
Agents that set measurable goals for themselves
Chelsea Finn
< Resume Image >

Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:

graph LR
classDef main fill:#f9d4f9, font-weight:bold, font-size:14px
classDef robotics fill:#f9d4d4, font-weight:bold, font-size:14px
classDef learning fill:#d4f9d4, font-weight:bold, font-size:14px
classDef meta fill:#d4d4f9, font-weight:bold, font-size:14px
classDef unsupervised fill:#f9f9d4, font-weight:bold, font-size:14px
classDef applications fill:#d4f9f9, font-weight:bold, font-size:14px
Main[Agents that set measurable goals for themselves] --> A[Robotics and Real-world AI]
Main --> B[Learning Approaches]
Main --> C[Meta-learning]
Main --> D[Unsupervised Learning]
Main --> E[Applications and Implementations]
A --> A1[Agents learn generalizable skills using robots 1]
A --> A2[Robots solve real-world generalization problems 2]
A --> A3[Robots study AI in real-world complexities 23]
A --> A4[Self-supervised data key for generalization 24]
A --> A5[Autonomous interaction promising for generalization 26]
A --> A6[Combined methods improve real robot learning 27]
B --> B1[Broad data improves generalization 3]
B --> B2[Data scalability impacts generalization most 4]
B --> B3[Self-supervised learning enables goal proposing 5]
B --> B4[Self-supervised learning proposes and measures goals 18]
B --> B5[Combining methods improves generalization and learning 20]
B --> B6[Unsupervised pre-training accelerates supervised learning 28]
C --> C1[Meta-learning learns fast-learning representations 6]
C --> C2[Broad task distributions challenge meta-learning 7]
C --> C3[Unsupervised meta-learning improves few-shot learning 8]
C --> C4[CACTUS outperforms unsupervised learning alone 9]
C --> C5[Unsupervised meta-RL accelerates new task learning 10]
C --> C6[Improved unsupervised meta-RL alternates models 11]
D --> D1[Closes loop between proposal and meta-learning 12]
D --> D2[Unsupervised approaches promising for self-supervised learning 19]
D --> D3[Agents constructing curricula scales meta-learning 21]
D --> D4[Single aligned method outperforms multiple methods 29]
D --> D5[Visual goal reaching struggles with representations 13]
D --> D6[DPN uses trajectory optimization for metrics 14]
E --> E1[DPN succeeds in various manipulation tasks 15]
E --> E2[Supervision improves learned goal metrics 16]
E --> E3[Enables real-world policy learning from pixels 17]
E --> E4[Reward functions learned for deformable object manipulation 22]
E --> E5[Learned metrics outperform standard approaches 25]
E --> E6[Embedding planning enables minimal-supervision behaviors 30]
class Main main
class A,A1,A2,A3,A4,A5,A6 robotics
class B,B1,B2,B3,B4,B5,B6 learning
class C,C1,C2,C3,C4,C5,C6 meta
class D,D1,D2,D3,D4,D5,D6 unsupervised
class E,E1,E2,E3,E4,E5,E6 applications

Resume:

1.- Chelsea Finn discusses building agents that can learn generalizable skills in real-world settings using robots.

2.- Robots face real-world complexities, and building intelligent robots requires solving important problems such as generalization and self-supervised learning.

3.- Training agents on broad data, including self-supervised and weakly supervised data, leads to better generalization than focusing on specific datasets.

4.- Data scalability has a bigger impact on generalization than algorithm changes, so we should build algorithms that can handle scalable data collection.

5.- Self-supervised learning is important for enabling agents to propose their own tasks and goals and measure progress towards those goals.

6.- Meta-learning aims to learn representations that enable fast learning from small datasets by training on many small dataset tasks.
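
To make the mechanism in point 6 concrete, below is a minimal MAML-style sketch on a hypothetical family of one-parameter regression tasks; the task family, scalar model, and step sizes are illustrative assumptions, not the setup from the talk. Each outer step adapts to one small dataset with a single inner gradient step and then updates the initialization so that such fast adaptation works well.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n=10):
    """A 'task' is a tiny regression dataset y = a*x with its own slope a (toy stand-in)."""
    a = rng.uniform(0.5, 2.5)
    x = rng.uniform(-1.0, 1.0, size=n)
    return x, a * x

def loss(w, x, y):   # mean squared error of the scalar model y_hat = w*x
    return np.mean((w * x - y) ** 2)

def grad(w, x, y):   # analytic dL/dw for the scalar model
    return 2.0 * np.mean((w * x - y) * x)

w, alpha, beta = 0.0, 0.1, 0.01   # initialization, inner step size, outer step size
for _ in range(2000):             # outer loop over many small-dataset tasks
    x, y = sample_task()
    xs, ys, xq, yq = x[:5], y[:5], x[5:], y[5:]     # support / query split
    w_adapted = w - alpha * grad(w, xs, ys)         # inner loop: adapt to this task
    # Outer gradient by the chain rule: dL_q(w')/dw = L_q'(w') * (1 - alpha * d2L_s/dw2)
    dwa_dw = 1.0 - alpha * 2.0 * np.mean(xs ** 2)
    w -= beta * grad(w_adapted, xq, yq) * dwa_dw    # update the initialization

x, y = sample_task()                                # quick check on a fresh task
w_new = w - alpha * grad(w, x[:5], y[:5])
print("meta-learned init:", round(w, 3),
      "| query loss after one step:", round(loss(w_new, x[5:], y[5:]), 4))
```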

7.- Designing broad task distributions for meta-learning is challenging, so enabling agents to propose their own tasks from unlabeled data is valuable.

8.- Unsupervised meta-learning proposes tasks by clustering embeddings from unsupervised learning and meta-learns on those tasks to improve few-shot learning.
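
A minimal sketch of the task-proposal step in point 8, in the spirit of CACTUS: cluster unsupervised embeddings, treat cluster indices as pseudo-labels, and sample few-shot tasks from them. The random embeddings, cluster count, and task sizes below are placeholder assumptions, not the actual pipeline from the talk.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for embeddings of unlabeled images from some unsupervised encoder.
embeddings = rng.normal(size=(1000, 64))

# 1) Cluster the embeddings; cluster indices act as pseudo-labels.
pseudo_labels = KMeans(n_clusters=50, n_init=10, random_state=0).fit_predict(embeddings)

def sample_few_shot_task(n_way=5, k_shot=1, k_query=5):
    """Propose one N-way, K-shot classification task from the pseudo-labels."""
    classes = rng.choice(np.unique(pseudo_labels), size=n_way, replace=False)
    support, query = [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.where(pseudo_labels == c)[0])
        support += [(int(i), new_label) for i in idx[:k_shot]]
        query += [(int(i), new_label) for i in idx[k_shot:k_shot + k_query]]
    return support, query   # (example index, task label) pairs fed to a meta-learner

support, query = sample_few_shot_task()
print(len(support), "support and", len(query), "query examples in one proposed task")
```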

9.- This unsupervised meta-learning approach (CACTUS) improves downstream few-shot classification accuracy on mini-ImageNet compared to unsupervised learning alone.

10.- Unsupervised meta-reinforcement learning accelerates learning of new tasks in an environment by proposing tasks through random or diversity-driven approaches.

11.- An improved unsupervised meta-RL approach alternates between fitting a generative model to skills and meta-training to maximize mutual information.
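
The alternation in point 11 builds on a diversity objective that rewards a skill for visiting states from which that skill can be identified. The tabular toy below only illustrates that mutual-information-style reward, log q(z|s) - log p(z); the states, skills, and distributions are made up, and the actual method additionally fits a generative model and meta-trains on the proposed tasks.

```python
import numpy as np

n_skills, n_states = 4, 10

# Toy stand-in: each skill z induces a distribution over the states it visits.
# "Distinct" skills concentrate on different states; the last skill wanders uniformly.
state_dist = np.full((n_skills, n_states), 1e-3)
for z in range(n_skills - 1):
    state_dist[z, z] = 1.0
state_dist[-1, :] = 1.0
state_dist /= state_dist.sum(axis=1, keepdims=True)

p_z = np.full(n_skills, 1.0 / n_skills)          # uniform prior over skills

# Discriminator q(z|s) obtained by Bayes' rule from the toy joint p(z) p(s|z).
joint = p_z[:, None] * state_dist                # shape (n_skills, n_states)
q_z_given_s = joint / joint.sum(axis=0, keepdims=True)

# Mutual-information-style reward per (skill, state): log q(z|s) - log p(z).
reward = np.log(q_z_given_s) - np.log(p_z)[:, None]
expected_reward = (state_dist * reward).sum(axis=1)
print(np.round(expected_reward, 2))              # identifiable skills earn more reward
```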

12.- This closes the loop between task proposal and meta-learning, enables density-based exploration, and scales to visual observations.

13.- For visual goal reaching tasks, pixel distance, VAE distance, and inverse models struggle to capture the right representation.

14.- A new approach, distributional planning networks (DPN), uses trajectory optimization in a learned representation to acquire goal metrics through autonomous interaction.
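
Points 13-14 argue for planning against a learned goal metric rather than raw pixel distance. The sketch below shows only the general shape of trajectory optimization in a latent space: a random-shooting planner minimizes the distance between encodings of the final state and the goal. The encoder, toy dynamics, and planner are stand-ins for illustration; DPN itself acquires the representation through its own learning procedure rather than using a fixed projection.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs):
    """Stand-in for a learned representation phi(obs); here a fixed projection."""
    W = np.array([[0.7, -0.2], [0.1, 0.9], [0.3, 0.3]])   # illustrative, not learned
    return W @ obs

def dynamics(state, action):
    """Toy point-mass dynamics, included only to make the sketch self-contained."""
    return state + 0.1 * action

def latent_goal_cost(state, goal):
    """Goal metric: distance between encodings, not raw observation distance."""
    return np.linalg.norm(encode(state) - encode(goal))

def plan(state, goal, horizon=10, n_candidates=500):
    """Random-shooting trajectory optimization against the learned goal metric."""
    best_cost, best_actions = np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, 2))
        s = state.copy()
        for a in actions:
            s = dynamics(s, a)
        cost = latent_goal_cost(s, goal)
        if cost < best_cost:
            best_cost, best_actions = cost, actions
    return best_actions, best_cost

start, goal = np.zeros(2), np.array([0.8, -0.5])
actions, cost = plan(start, goal)
print("final latent distance to goal:", round(cost, 3))
```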

15.- DPN learns successful goal distance metrics for simulated reaching, rope manipulation, and pushing, as well as for real-world tasks.

16.- Incorporating a small amount of supervision by training a classifier and actively querying for labels can further improve the learned metric.
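
A hedged sketch of the active-querying idea in point 16: train a success classifier on a handful of labels, then request labels for the observations the classifier is least sure about. The synthetic features and simulated labeller below are assumptions for illustration, not the procedure used in the talk.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in features for observations; label 1 means "goal reached".
X = rng.normal(size=(500, 8))
true_w = rng.normal(size=8)
oracle = (X @ true_w > 0).astype(int)            # simulated human labeller (assumption)

# Tiny initial label set containing both classes.
labelled = list(np.where(oracle == 0)[0][:5]) + list(np.where(oracle == 1)[0][:5])

for round_ in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X[labelled], oracle[labelled])
    probs = clf.predict_proba(X)[:, 1]
    uncertainty = np.abs(probs - 0.5)            # closest to 0.5 = least certain
    candidates = [i for i in np.argsort(uncertainty) if i not in labelled]
    labelled += candidates[:10]                  # actively query 10 more labels
    print(f"round {round_}: accuracy {clf.score(X, oracle):.2f} "
          f"using {len(labelled) - 10} labels")
```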

17.- This enables efficiently learning policies for real-world pushing and cloth draping tasks from raw pixels through reinforcement learning.

18.- Two key elements of self-supervised learning are enabling agents to propose their own tasks/goals and to measure progress towards goals.

19.- Unsupervised task proposals and unsupervised goal metrics acquired through interaction are promising approaches for self-supervised learning.

20.- Combinations of unsupervised learning, task proposals, and meta-learning can improve generalization and few-shot learning.

21.- Enabling agents to construct their own task curricula helps scale meta-learning to broader task distributions without manual task design.

22.- Reward functions can be learned from raw pixels for challenging deformable object manipulation tasks.

23.- Robots provide a platform to study artificial intelligence that must handle real-world complexities and generalize broadly.

24.- Self-supervised and weakly-supervised data is key for training agents that can generalize as the real world requires.

25.- Learned goal metrics outperform standard approaches like pixel distance, VAEs, and inverse models for visual goal reaching.

26.- Learning through autonomous interaction, with minimal human supervision, is a promising path to agents that can generalize.

27.- Incorporating unsupervised learning, curriculum proposals, and meta-learning improves sample efficiency for real robots learning from raw pixels.

28.- Unsupervised pre-training may not always capture all semantically relevant aspects, but can still greatly accelerate downstream supervised learning.
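
Point 28 describes a two-stage pipeline: fit a representation on unlabeled data, then run supervised learning on top of it with few labels. The sketch below shows only that structure, with PCA standing in for the unsupervised learner and synthetic data standing in for real observations; it makes no claim about the accuracy one would obtain.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Plenty of unlabeled data, but only a handful of labelled examples (synthetic stand-ins).
X_unlabeled = rng.normal(size=(5000, 100))
X_small = rng.normal(size=(40, 100))
y_small = rng.integers(0, 2, size=40)

# 1) Unsupervised pre-training: fit a representation on the unlabeled pool alone.
encoder = PCA(n_components=10).fit(X_unlabeled)

# 2) Downstream supervised learning starts from that representation, not raw inputs.
clf = LogisticRegression(max_iter=1000).fit(encoder.transform(X_small), y_small)
print(clf.predict(encoder.transform(X_small[:5])))
```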

29.- Combining multiple unsupervised learning methods did not show benefits over using the single method most aligned with the downstream tasks.

30.- Embedding planning and reinforcement learning in representation learning enables agents to acquire goal reaching behaviors with minimal supervision.

Knowledge Vault built by David Vivancos 2024