Concept Graph & Summary using Claude 3.5 Sonnet | Chat GPT4o | Llama 3:
graph LR
classDef video fill:#f9d4d4, font-weight:bold, font-size:14px
classDef robotics fill:#d4f9d4, font-weight:bold, font-size:14px
classDef learning fill:#d4d4f9, font-weight:bold, font-size:14px
classDef future fill:#f9f9d4, font-weight:bold, font-size:14px
classDef patterns fill:#f9d4f9, font-weight:bold, font-size:14px
Main[Learning actions, policies,<br/>rewards, and environments<br/>from videos alone] --> V[Video Processing]
Main --> R[Robotics & Movement]
Main --> L[Learning Methods]
Main --> F[Future States]
Main --> P[Pattern Analysis]
V --> V1[Videos teach actions without<br/>direct supervision 1]
V --> V2[Gaming speedruns reveal<br/>behavior trends 6]
V --> V3[Model detects animals<br/>beyond games 8]
V --> V4[VQ-VAE tokens divide<br/>video parts 21]
V --> V5[Video values train<br/>learning agents 19]
R --> R1[Robots learn sign language<br/>basics 2]
R --> R2[Robots acquire facial expression<br/>skills 3]
R --> R3[Map human moves to<br/>robot features 4]
R --> R4[Movement patterns across<br/>varied subjects 5]
R --> R5[New environments create<br/>motion guides 7]
L --> L1[Future states need<br/>action models 22]
L --> L2[Videos teach without<br/>reward systems 17]
L --> L3[End rewards guide<br/>value learning 18]
L --> L4[Learning from suboptimal<br/>video data 15]
L --> L5[Direct transitions replace<br/>action pairs 16]
L --> L6[Data thrives without<br/>action needs 29]
F --> F1[Predicting numerous features<br/>simultaneously 9]
F --> F2[Shared actions through<br/>future clustering 10]
F --> F3[Video transitions reveal<br/>latent actions 11]
F --> F4[Model predicts possible<br/>future states 12]
F --> F5[Future states weight<br/>policy choices 13]
F --> F6[Fast real-world policy<br/>adaptation 14]
P --> P1[Interactive spaces from<br/>video content 20]
P --> P2[Models enhance world<br/>interaction 23]
P --> P3[Images transform into<br/>living spaces 24]
P --> P4[Testing through platform games 25]
P --> P5[Training future artificial minds 26]
P --> P6[Real videos need bigger<br/>structures 27]
P2 --> P7[Managing rewards across<br/>virtual worlds 28]
P3 --> P8[Structure creates basic<br/>patterns 30]
class Main,V,V1,V2,V3,V4,V5 video
class R,R1,R2,R3,R4,R5 robotics
class L,L1,L2,L3,L4,L5,L6 learning
class F,F1,F2,F3,F4,F5,F6 future
class P,P1,P2,P3,P4,P5,P6,P7,P8 patterns
Summary:
1.- Learning actions, policies, and rewards from videos without explicit supervision
2.- Initial research on teaching robots sign language gestures
3.- Pivot to teaching facial expressions to robots
4.- Motion templates for mapping human expressions to robot features
5.- Domain-agnostic representation of movement patterns across different subjects
6.- Analysis of video game speedruns to infer behavioral patterns
7.- Motion template generation from unseen gaming environments
8.- Model's unexpected success in segmenting animals from non-gaming content
9.- Multiple feature prediction instead of single-mode predictions
10.- Clustering future predictions to identify shared action representations
11.- ILPO: Learning latent actions from video transitions
12.- Generative modeling to predict possible next states
13.- Policy learning through weighting different possible future states
14.- Quick adaptation of learned policies to real environments
15.- Learning optimal value functions from suboptimal video demonstrations
16.- State-to-state transitions versus traditional state-action pairs
17.- Learning without reward functions using video sequence ordering
18.- Value function derivation from end-of-video rewards
19.- Training reinforcement learning agents using learned video values
20.- Genie: Creating interactive environments from video data
21.- Video tokenization into discrete tokens using a VQ-VAE model
22.- Latent action modeling for future state prediction
23.- Dynamics modeling for environment interaction
24.- Text-generated images becoming interactive environments
25.- Application to platformer game environments
26.- Potential for training future AI agents
27.- Scaling to real-world videos through architecture expansion
28.- Handling reward hacking across multiple generated environments
29.- Benefits of action-free learning for diverse datasets
30.- Low-level representation emergence in environment structure
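Items 15 to 18 can be illustrated with a small tabular sketch. It is not the method presented in the talk, only a minimal example under stated assumptions: states are discrete and hashable, every video ends in a goal state worth a fixed reward, and any observed state-to-state transition can be reproduced by the agent. The function name `values_from_videos` and its parameters are hypothetical. The sketch backs the end-of-video reward up through observed transitions, with no action labels, so a state shared by several videos inherits the value of its best observed successor.

```python
from collections import defaultdict


def values_from_videos(videos, final_reward=1.0, gamma=0.9, sweeps=100):
    """Estimate state values from action-free videos.

    Each video is a list of hashable states (a stand-in for frames).
    Assumption: the last state of every video is a goal worth `final_reward`.
    """
    successors = defaultdict(set)  # observed state-to-state transitions
    terminals = set()              # final state of each video
    for frames in videos:
        if not frames:
            continue
        for state, next_state in zip(frames, frames[1:]):
            successors[state].add(next_state)
        terminals.add(frames[-1])

    values = defaultdict(float)
    for state in terminals:
        values[state] = final_reward

    # Repeated backups: V(s) = gamma * max over observed successors V(s').
    for _ in range(sweeps):
        for state, next_states in successors.items():
            if state in terminals:
                continue
            values[state] = gamma * max(values[n] for n in next_states)
    return dict(values)


# Two hypothetical demonstrations: a slow one and one with a shortcut a -> c.
slow = ["start", "a", "b", "c", "goal"]
shortcut = ["a", "c", "goal"]
demo_values = values_from_videos([slow, shortcut], gamma=0.5)
print(demo_values)
```

With both videos, "start" is valued as if the agent took the shortcut, even though no single video shows that full path. This is one simple sense in which better-than-demonstrated values can come out of suboptimal demonstrations, and the resulting table could then serve as a reward or value signal for a reinforcement learning agent, as in item 19.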
Knowledge Vault built by David Vivancos 2024