Concept Graph & Summary using Claude 3 Opus | ChatGPT-4o | Llama 3:
Summary:
1.- Vision and Language Navigation: Navigating an embodied agent in a 3D environment using natural language instructions.
2.- Cross-modal grounding: Grounding natural language instructions onto local visual scenes and global visual trajectories.
3.- Sparse reward issue: A success signal is given only when the agent reaches the destination, ignoring whether the instructions were followed.
4.- Reinforced Cross-Modal Matching (RCM): A method that calculates extrinsic rewards based on the history, the instruction, and the visual context.
5.- Matching critic: Evaluates the extent to which the original instruction can be reconstructed from the generated trajectory.
6.- Cycle reconstruction reward: An intrinsic reward used to train the navigator, encouraging it to follow the instruction.
7.- Generalization issue: Models fail to generalize well to unseen environments.
8.- Self-Supervised Imitation Learning (SIL): Learning to explore unseen environments with self-supervision.
9.- Unlabeled instruction: Used in SIL to generate trajectories, which are evaluated by the matching critic.
10.- Replay buffer: Stores the best trajectories generated during SIL for the navigator to imitate.
11.- Learning from past good behaviors: SIL allows the model to approximate a better policy for new environments.
12.- Test time: The navigator performs only one trajectory per instruction, so the exploration done via SIL is helpful in practice.
13.- In-home robot: Example application where SIL can help the robot improve as it becomes familiar with the house.
14.- Example before and after SIL: After exploring with SIL, the agent successfully follows the instructions and reaches the destination.
15.- Results on unseen test set: The RCM model outperforms the speaker-follower baseline, and SIL significantly improves the SPL (Success weighted by Path Length) score.
16.- Performance gap reduction: SIL helps reduce the performance gap between seen and unseen environments.
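The matching critic's cycle-reconstruction reward (items 5 and 6) can be sketched as follows. This is a minimal toy stand-in, not the paper's learned sequence-to-sequence critic: the `match_p`/`miss_p` constants are hypothetical, standing in for the critic's learned per-token likelihoods of reconstructing the instruction from the trajectory.

```python
import math

def matching_critic_reward(instruction, trajectory, match_p=0.9, miss_p=0.05):
    """Toy cycle-reconstruction reward: the mean log-probability of
    reconstructing each instruction token from the trajectory.
    A token observed at some step of the trajectory gets probability
    match_p; an unobserved token gets miss_p (both are hypothetical
    constants replacing a learned critic's per-token likelihoods)."""
    observed = {word for step in trajectory for word in step}
    log_p = sum(math.log(match_p if tok in observed else miss_p)
                for tok in instruction)
    return log_p / max(len(instruction), 1)
```

A trajectory whose observations cover the instruction scores higher than one that ignores it, so maximizing this intrinsic reward pushes the navigator toward instruction-following paths rather than mere destination-reaching.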
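The SIL loop with a replay buffer (items 8 through 12) can be sketched as follows, assuming hypothetical `rollout` and `critic` callables supplied by the caller; in the actual method these are the navigator's policy and the learned matching critic, and the navigator is subsequently trained to imitate the buffer's trajectories.

```python
import random

def self_supervised_imitation(instructions, rollout, critic, n_samples=8, seed=0):
    """Sketch of the SIL loop: for each unlabeled instruction, sample
    several trajectories with the current policy (`rollout`), keep the
    one the matching critic scores highest, and store that pair in a
    replay buffer for the navigator to imitate later."""
    rng = random.Random(seed)
    replay_buffer = []
    for instr in instructions:
        candidates = [rollout(instr, rng) for _ in range(n_samples)]
        best = max(candidates, key=lambda traj: critic(instr, traj))
        replay_buffer.append((instr, best))
    return replay_buffer
```

Keeping only the critic's best-scoring trajectory per instruction is what lets the model "learn from past good behaviors" and approximate a better policy in environments with no labeled paths.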
Knowledge Vault built by David Vivancos 2024