Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:
Summary:
1.-The future of machine learning and AI is self-supervised learning, which involves learning dependencies between variables and filling in blanks.
2.-Self-supervised learning may enable machines to learn quickly with little supervision or interaction, similar to how babies learn basic concepts.
3.-The main challenges in AI are reducing supervision requirements, learning to reason beyond fixed steps, and learning to plan complex actions.
4.-Self-supervised learning involves predicting missing or future information from known information. Predictions must allow for multiple possibilities.
5.-Energy-based models can handle uncertainty by measuring compatibility between observed and predicted variables without requiring probabilities.
6.-Energy-based models can be trained using contrastive methods that push energy down on data points and up elsewhere.
7.-Probabilistic methods that estimate densities are problematic: they carve deep, narrow canyons around the data in the energy function, which are not useful for inference.
8.-Contrastive objective functions push down the energy of data points and push it up on contrasting points by at least some margin (a contrastive-loss sketch follows the list).
9.-Self-supervised learning methods like BERT have been very successful in NLP but not as much for images.
10.-Contrastive embedding methods for images are computationally expensive as there are many ways for images to be different.
11.-GANs can be interpreted as contrastive energy-based methods in which the generator produces the contrastive samples that shape the energy function.
12.-Regularized latent variable methods limit the information capacity of the latent code in order to regularize the volume of low-energy space, as in sparse coding.
13.-Variational autoencoders are regularized latent variable energy-based models that add noise to the latent code to limit its information content (a VAE sketch follows the list).
14.-Graph-based and temporal continuity regularization can yield good representations by exploiting similarity structure or temporal predictability.
15.-Conditional versions of regularized latent variable models enable learning to predict multi-modal futures, as in vehicle trajectory prediction.
16.-Self-supervised learning is the best current approach for common sense learning in AI. Scaling supervised/reinforcement learning is insufficient.
17.-System 1 tasks are fast, intuitive, and implicit, and are where current deep learning excels; System 2 tasks are slow, sequential, and explicit.
18.-Extending deep learning to system 2 tasks can enable reasoning, planning, and systematic generalization through recombining semantic concepts.
19.-Joint distribution of semantic variables has sparse graphical model structure. Variables often relate to causality, agents, intentions, actions, objects.
20.-A simple relationship exists between semantic variables and language; pieces of knowledge, expressed as rules, can be reused across instances.
21.-Changes in distribution of semantic variables are local, e.g. due to causal interventions, with rest of model unchanged.
22.-Systematic generalization involves dynamically recombining concepts to explain novel observations, addressing current deep learning's lack of robustness to distribution shift.
23.-Goal is to combine advantages of deep learning (grounded representations, distributed symbols, uncertainty handling) with symbolic AI's systematic generalization.
24.-Sequential conscious processing focuses attention on a small subset of the available information, which is broadcast and stored to condition subsequent processing (a sparse-selection sketch follows the list).
25.-Language understanding requires combining system 1 perceptual knowledge with system 2 semantic knowledge in a grounded way.
26.-"Consciousness prior" posits sparse dependencies between semantic variables, enabling strong predictions from few variables as in language.
27.-Under localized distributional change hypothesis, changes in abstract semantic space are localized, enabling faster adaptation and meta-learning.
28.-Empirical results show that learning (adaptation) speed can uncover causal graph structure. Parametrizing graphs by their edges enables causal discovery (an adaptation-speed sketch follows the list).
29.-The Recurrent Independent Mechanisms (RIMs) architecture, with attention between modules, improves out-of-distribution generalization by dynamically recombining stable modules.
30.-Core ideas are decomposing knowledge into recombinable pieces with sparse dependencies, and local changes in distribution enabling fast learning/inference.
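Contrastive-loss sketch (items 6, 8, 10). The following is a minimal sketch, not the talk's own code: an energy network scores the compatibility of an observed x with a candidate y, and a hinge loss pushes the energy of observed pairs below that of contrastive pairs by a margin. The network sizes, margin value, and shuffled-batch negatives are illustrative assumptions.

```python
# Minimal sketch of margin-based contrastive training for an energy-based model.
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scores compatibility between an observed x and a candidate y:
    low energy = compatible, high energy = incompatible."""
    def __init__(self, x_dim, y_dim, hidden=64):  # sizes are illustrative
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def contrastive_hinge_loss(energy, x, y_pos, y_neg, margin=1.0):
    """Push the energy of observed (x, y_pos) pairs below the energy of
    contrastive (x, y_neg) pairs by at least `margin`."""
    e_pos = energy(x, y_pos)
    e_neg = energy(x, y_neg)
    return torch.relu(margin + e_pos - e_neg).mean()

# Toy usage: negatives are simply shuffled targets from the same batch.
x = torch.randn(32, 8)
y = torch.randn(32, 4)
y_neg = y[torch.randperm(32)]
model = EnergyNet(8, 4)
loss = contrastive_hinge_loss(model, x, y, y_neg)
loss.backward()
```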
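VAE sketch (items 12-13). A minimal sketch of the regularized latent-variable idea: noise injected into the latent code plus a KL penalty limit how much information the code can carry, which bounds the volume of well-reconstructed (low-energy) space. Layer sizes and dimensions are illustrative assumptions.

```python
# Minimal sketch of a VAE-style regularized latent-variable model.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim=16, z_dim=2, hidden=64):  # sizes are illustrative
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # noisy latent code
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=-1)                        # reconstruction "energy"
    kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=-1)  # information penalty
    return (recon + kl).mean()

x = torch.randn(32, 16)
model = TinyVAE()
loss = vae_loss(x, *model(x))
loss.backward()
```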
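Sparse-selection sketch (items 24 and 26). A minimal sketch of attention selecting a small subset of high-level variables and broadcasting them to condition the next prediction. The scoring network, the value of k, and the slot dimensions are illustrative assumptions, not the exact consciousness-prior architecture.

```python
# Minimal sketch of sparse selection and broadcast over a set of high-level variables.
import torch
import torch.nn as nn

class SparseSelector(nn.Module):
    def __init__(self, n_vars=16, d=32, k=3):  # sizes and k are illustrative
        super().__init__()
        self.k = k
        self.score = nn.Linear(d, 1)        # relevance score per variable
        self.predict = nn.Linear(k * d, d)  # predicts from the broadcast subset

    def forward(self, variables):
        # variables: (batch, n_vars, d) slot-like representations of semantic variables
        scores = self.score(variables).squeeze(-1)            # (batch, n_vars)
        topk = scores.topk(self.k, dim=-1).indices            # hard top-k selection
        idx = topk.unsqueeze(-1).expand(-1, -1, variables.size(-1))
        selected = variables.gather(1, idx)                   # (batch, k, d)
        broadcast = selected.flatten(1)                       # the "broadcast" content
        return self.predict(broadcast), topk

# Toy usage: 8 examples, 16 candidate variables of dimension 32, keep 3.
x = torch.randn(8, 16, 32)
pred, chosen = SparseSelector()(x)
# Note: hard top-k gives no gradient to the scorer; practical models use soft
# or straight-through attention so the selector itself can be trained.
```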
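Adaptation-speed sketch (item 28). A minimal sketch of how adaptation speed can reveal causal direction: two tabular models of the same joint p(A, B), one factored causally as p(A)p(B|A) and one anti-causally as p(B)p(A|B), are adapted by a few gradient steps to data from an intervention that changes only p(A). The causal factorization typically reaches a lower negative log-likelihood within the step budget because only its p(A) component needs to change. The variable sizes, learning rate, and step budget are illustrative assumptions.

```python
# Minimal sketch: compare adaptation speed of a causal vs. anti-causal factorization
# after an intervention that changes only p(A).
import torch

torch.manual_seed(0)
N, n_adapt, steps, lr = 10, 500, 20, 0.5

# Ground-truth causal model A -> B.
pA = torch.softmax(torch.randn(N), 0)
pBgA = torch.softmax(torch.randn(N, N), 1)   # row a holds p(B | A=a)
joint = pA[:, None] * pBgA                   # p(A, B)

def logits(p):                               # start each model at the true pre-intervention joint
    return torch.log(p).clone().requires_grad_(True)

causal = [logits(pA), logits(pBgA)]          # p(A) p(B|A)
pB = joint.sum(0)
pAgB = (joint / pB).T                        # row b holds p(A | B=b)
anticausal = [logits(pB), logits(pAgB)]      # p(B) p(A|B)

def nll_causal(params, a, b):
    lA, lBgA = torch.log_softmax(params[0], 0), torch.log_softmax(params[1], 1)
    return -(lA[a] + lBgA[a, b]).mean()

def nll_anticausal(params, a, b):
    lB, lAgB = torch.log_softmax(params[0], 0), torch.log_softmax(params[1], 1)
    return -(lB[b] + lAgB[b, a]).mean()

# Intervention: p(A) changes, the mechanism p(B|A) stays the same.
pA_new = torch.softmax(torch.randn(N) * 3, 0)
a = torch.multinomial(pA_new, n_adapt, replacement=True)
b = torch.stack([torch.multinomial(pBgA[ai], 1).squeeze() for ai in a])

for name, params, nll in [("causal", causal, nll_causal),
                          ("anti-causal", anticausal, nll_anticausal)]:
    opt = torch.optim.SGD(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nll(params, a, b).backward()
        opt.step()
    print(f"{name}: NLL after {steps} steps = {nll(params, a, b).item():.3f}")
# The causal factorization typically ends with the lower NLL, because only its
# p(A) table has to change; the anti-causal one must relearn both tables.
```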
Knowledge Vault built by David Vivancos 2024