Two high stakes challenges in machine learning

Léon Bottou

**Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:**

```mermaid
graph LR
classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
classDef intro fill:#d4f9d4, font-weight:bold, font-size:14px
classDef policysearch fill:#d4d4f9, font-weight:bold, font-size:14px
classDef gradients fill:#f9f9d4, font-weight:bold, font-size:14px
classDef em fill:#f9d4f9, font-weight:bold, font-size:14px
classDef advanced fill:#d4f9f9, font-weight:bold, font-size:14px
Main[Two high stakes challenges in machine learning]
Main --> A[Introduction and Motivation]
A --> A1[Autonomous robots need complex skill learning 1]
A --> A2[Challenges: high-dimensional spaces, data costs, safety 2]
A --> A3[Value-based RL: unstable, extensive exploration 3]
A --> A4[Policy search: parameterized, correlated, local updates 4]
A --> A5[Taxonomy: model-free vs model-based methods 5]
A --> A6[Outline: taxonomy, methods, extensions, model-based 6]
Main --> B[Policy Search Fundamentals]
B --> B1[Policy representations: trajectories, controllers, networks 7]
B --> B2[Model-free vs model-based: samples vs learning 8]
B --> B3[Step vs episode-based exploration: action/parameter space 9]
B --> B4[Policy update: direct optimization or EM 10]
B --> B5[Exploration: balance smoothness and variability 11]
B --> B6[Correlated parameter exploration yields smoother trajectories 12]
Main --> C[Policy Gradient Methods]
C --> C1[Conservative vs greedy updates: exploration-exploitation tradeoff 13]
C --> C2[Policy gradients: log-likelihood trick estimates gradient 14]
C --> C3[Baseline subtraction reduces variance without bias 15]
C --> C4[Step-based gradients use state-action value function 16]
C --> C5[State-dependent baseline further reduces variance 17]
C --> C6[Metric choice impacts update step size 18]
Main --> D[Advanced Policy Gradient Techniques]
D --> D1[Natural gradients: Fisher information normalizes gradient 19]
D --> D2[Natural actor-critic: gradients with function approximation 20]
D --> D3[State-value function reduces advantage function variance 21]
D --> D4[Policy gradients learn motor skills slowly 22]
Main --> E[Expectation-Maximization Methods]
E --> E1[EM-based search: reward-weighted maximum likelihood 23]
E --> E2[EM works for step/episode-based settings 24]
E --> E3[Reward weighting: baseline subtraction, rescaling 25]
E --> E4[Moment projection: KL minimization, closed-form updates 26]
Main --> F[Advanced Topics and Applications]
F --> F1[Applications: complex robot skills forthcoming 27]
F --> F2[Contextual search learns generalizable, adaptable skills 28]
F --> F3[Hierarchical search: high-level sequencing, low-level primitives 29]
F --> F4[Model-based search: PILCO, guided policy search 30]
class Main main
class A,A1,A2,A3,A4,A5,A6 intro
class B,B1,B2,B3,B4,B5,B6 policysearch
class C,C1,C2,C3,C4,C5,C6,D,D1,D2,D3,D4 gradients
class E,E1,E2,E3,E4 em
class F,F1,F2,F3,F4 advanced
```
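Nodes C2 and C3 of the graph name the log-likelihood (REINFORCE) trick and baseline subtraction. As a minimal illustrative sketch — not from the talk, using an invented one-step Gaussian-policy bandit with a toy reward — the two ideas look like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: Gaussian policy pi(a) = N(theta, sigma^2); reward peaks at a = 2.
theta, sigma, lr = 0.0, 1.0, 0.05
reward = lambda a: -(a - 2.0) ** 2

for _ in range(2000):
    actions = theta + sigma * rng.standard_normal(32)  # sample a batch of actions
    rewards = reward(actions)
    baseline = rewards.mean()                          # baseline subtraction (node C3)
    # Log-likelihood trick (node C2): grad log pi(a) wrt the mean is (a - theta) / sigma^2.
    grad = np.mean((rewards - baseline) * (actions - theta) / sigma**2)
    theta += lr * grad                                 # gradient ascent on expected reward

# theta ends up near the optimum a = 2
print(f"learned theta: {theta:.2f}")
```

Subtracting the batch-mean baseline leaves the gradient estimate unbiased but shrinks its variance, which is exactly the tradeoff nodes C3 and C5 refer to.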


**Resume:**

**1.-** Challenges: software engineering and experimentation.

**2.-** Abstraction helps manage engineering complexity.

**3.-** Abstractions can leak, requiring deeper understanding.
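A classic concrete leak (my illustration, not from the talk): floating-point numbers present themselves as real numbers, but rounding seeps through the abstraction, and coping with it requires understanding the layer below:

```python
import math

# Floating point abstracts the reals, but rounding leaks through:
print(0.1 + 0.2 == 0.3)                         # False: 0.1 + 0.2 is 0.30000000000000004
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False: addition is not associative

# Working around the leak means comparing with an explicit tolerance:
print(math.isclose(0.1 + 0.2, 0.3))             # True
```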

**4.-** Math abstractions don't leak, aiding design.

**5.-** Software built on clean abstractions.

**6.-** Programming vs. learning: different computing approaches.

**7.-** Perceptrons lost to programming initially.

**8.-** Humans excel where specifications are elusive.

**9.-** ML needs software to have impact.

**10.-** Trained models make weak software components.

**11.-** Learning algorithms entangle complex systems.

**12.-** Examples illustrate integration problems with ML.

**13.-** ML in software: challenges remain.

**14.-** ML mixes science and engineering aspects.

**15.-** ML lacks specifications, relies on data.

**16.-** ML relies on a single experimental paradigm.

**17.-** This single paradigm contrasts with other sciences.

**18.-** Datasets have bias, can't be curated.

**19.-** Training data never covers all cases.

**20.-** Models fail on unseen edge cases.

**21.-** Computer vision isn't purely statistical.

**22.-** Evaluating AI-like tasks is difficult.

**23.-** Rethink experiment paradigm for ML progress.

**24.-** ML challenges are about process.

**25.-** ML engineering may prioritize productivity.

**26.-** Targeted experiments could reveal model reasoning.
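One way such a targeted experiment can look — a hedged sketch with an invented model and data, not from the talk: hold the genuine signal fixed, perturb only a nuisance feature the model should ignore, and measure how often predictions flip.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a trained classifier; by construction it leans on feature 2,
# a spurious "background" feature (weights are invented for illustration).
weights = np.array([0.2, 0.1, 3.0])
predict = lambda X: (X @ weights > 0).astype(int)

# Targeted experiment: keep the true signal (features 0-1) fixed and flip
# only the nuisance feature, then count how many predictions change.
X = rng.standard_normal((1000, 3))
X_flipped = X.copy()
X_flipped[:, 2] = -X_flipped[:, 2]

flip_rate = np.mean(predict(X) != predict(X_flipped))
print(f"{flip_rate:.0%} of predictions change when only the nuisance feature flips")
```

A high flip rate reveals that the model's "reasoning" depends on the nuisance feature — evidence a single aggregate test-set accuracy would never surface.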

**27.-** Diverse experiments, discussing limits openly.

**28.-** Contracts could make ML more robust.
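A minimal sketch of what such a contract could look like (all names and thresholds are illustrative assumptions, not from the talk): a precondition that rejects inputs outside the range seen in training, and a postcondition that abstains instead of returning a low-confidence guess.

```python
from dataclasses import dataclass

@dataclass
class ContractedModel:
    predict_proba: callable   # the underlying trained model (assumed given)
    input_range: tuple        # (low, high) observed on the training data
    min_confidence: float = 0.8

    def __call__(self, x):
        low, high = self.input_range
        # Precondition: refuse inputs the model was never trained on.
        if not all(low <= v <= high for v in x):
            raise ValueError("precondition violated: input outside training range")
        probs = self.predict_proba(x)
        label = max(range(len(probs)), key=probs.__getitem__)
        # Postcondition: a confident prediction or an explicit abstention.
        if probs[label] < self.min_confidence:
            return None
        return label

# Usage with a stand-in model that is confident only near the origin:
toy_model = lambda x: [0.95, 0.05] if abs(x[0]) < 1 else [0.55, 0.45]
model = ContractedModel(toy_model, input_range=(-5.0, 5.0))
print(model([0.5, 0.0]))   # 0: confident prediction
print(model([3.0, 0.0]))   # None: abstains, confidence below threshold
```

Making failure modes explicit — reject or abstain rather than silently guess — is one way a learned component could behave more like a conventional software component.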

**29.-** Reusing ML work remains challenging.

**30.-** Key challenges shape ML's future impact.

Knowledge Vault built by David Vivancos 2024