The End Of Knowledge - Vault 2 - ICLR (2014-2023)

graph LR
classDef mackay fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef machinelearning fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef gaussianprocesses fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef deeplearning fill:#f9f9d4, font-weight:bold, font-size:14px;
classDef latentmodels fill:#f9d4f9, font-weight:bold, font-size:14px;
classDef resources fill:#d4f9f9, font-weight:bold, font-size:14px;
A[Neil Lawrence ICLR 2016] --> B[Inspirational MacKay passed from cancer. 1]
B --> C[MacKay revolutionized machine learning. 2]
A --> D[Speaker: oil rigs to PhD student. 3]
C --> E[MacKay introduced priors over NN weights. 4]
C --> F[Gaussian processes solved NN problems then. 5]
A --> G[Data explosion advanced deep learning rapidly. 6]
A --> H[Gaussian processes: priors over functions directly. 7]
H --> I[Gaussian processes, NNs connected under conditions. 8]
H --> J[Gaussian processes excel on small data. 9]
H --> K[Gaussian processes model malaria in Uganda. 10]
H --> L[Gaussian processes infer protein levels. 11]
H --> M[MacKay: Gaussian processes just smoothing machines? 12]
A --> N[Deep learning composes differentiable functions. 13]
A --> O[Bayesian inference: priors, posteriors, predictions. 14]
A --> P[Variational inference approximates intractable posteriors. 15]
H --> Q[Gaussian process inference hard, made tractable. 16]
H --> R[Sparse approximations scale Gaussian processes. 17]
H --> S[Composing Gaussian processes challenging, bounds enable. 18]
S --> T[Deep Gaussian processes compose with uncertainty. 19]
T --> U[Deep Gaussian processes avoid overfitting. 20]
A --> V[Latent variable models represent high-D observations. 21]
V --> W[MacKay pioneered neural networks for unsupervised latents. 22]
V --> X[Gaussian process latents extract low-D structure. 23]
X --> Y[Layered Gaussian process latents learn hierarchies. 24]
Y --> Z[Company scaling layered Gaussian process models. 25]
Z --> AA[New approximations reduce numerical issues scaling. 26]
V --> AB[Goal: 'deep health' personalized medicine models. 27]
A --> AC[Resources available: schools, tutorials, software. 28]
A --> AD[Recent research: RNNs, variational autoencoders. 29]
A --> AE[Speaker inspired by MacKay, laments loss. 30]
class A,B,AE mackay;
class C,D,E,F,N,O,P,W machinelearning;
class G,H,I,J,K,L,M,Q,R,S,T,U gaussianprocesses;
class V,X,Y,Z,AA,AB latentmodels;
class AC resources;
class AD deeplearning;

Summary:

1.-David MacKay was an inspirational figure who passed away from cancer at age 49, leaving behind a young family.

2.-MacKay revolutionized machine learning and information theory. A symposium was held before his death to honor his broad influence.

3.-The speaker worked on oil rigs implementing neural networks before becoming a PhD student. A neural network is a function built from weighted sums of nonlinear basis functions.

4.-MacKay introduced priors over the weights of neural networks, turning a network into a distribution over a class of functions. Weight decay implements this idea: it corresponds to a Gaussian prior on the weights.

5.-With limited data in that era, Gaussian processes seemed to solve many machine learning problems that neural networks aimed to address.

6.-The explosion of digital data in areas such as vision, speech, and language allowed deep learning methods to advance rapidly and achieve impressive results.

7.-Gaussian processes take a different modeling approach: they place priors over functions directly. A covariance function maps each pair of inputs to the covariance between the corresponding function values.

8.-Gaussian processes and neural networks are connected: as the number of hidden units grows, neural nets converge to Gaussian processes under certain conditions.

9.-For small datasets, Gaussian processes often outperform other methods. They provide good uncertainty estimates for tasks like Bayesian optimization.

10.-Gaussian processes have been applied to model malaria spread in Uganda, inferring missing reports. Visualization is key for impact.

11.-Gaussian processes can infer unobserved protein levels in gene regulatory networks by placing priors on the dynamics as differential equations.

12.-Despite their power, MacKay noted that Gaussian processes are just sophisticated smoothing machines, questioning whether we "threw the baby out with the bathwater."

13.-Deep learning composes differentiable functions to learn representations. Propagating gradients through the composition is key to optimizing them.

14.-Bayesian inference involves specifying prior distributions, computing posterior distributions over parameters, and making predictions by marginalizing over the posterior.

15.-Variational inference approximates intractable posteriors with simpler distributions, turning integration problems into optimization problems. This yields a probabilistic way to train neural networks.

16.-Gaussian process inference is hard due to priors on infinite-dimensional functions. Variational approximations and augmentation make it tractable.

17.-Sparse approximations allow Gaussian processes to scale to large datasets. Increasing the number of variational parameters tightens a lower bound on the marginal likelihood.

18.-Composing Gaussian processes to make deep models is challenging due to intractability of the resulting integral. Variational bounds enable it.

19.-Deep Gaussian processes give a way to compose stochastic processes while maintaining uncertainty. Theory may help understand how deep learning works.

20.-On small datasets, deep Gaussian processes can avoid overfitting as well as shallow ones do, while offering more flexibility.

21.-Latent variable models represent high-dimensional observations through lower-dimensional unobserved variables. Motion capture data demonstrates this concept.

22.-MacKay pioneered using neural networks for unsupervised latent variable models through density networks, but limited data restricted their effectiveness at the time.

23.-Gaussian process latent variable models can extract meaningful low-dimensional structures and even infer the latent dimensionality needed from little data.

24.-Layered Gaussian process latent variable models applied to handwriting and motion capture aim to learn hierarchical, abstract representations.

25.-Scaling up layered Gaussian process models is a key challenge being addressed by forming a company to develop them further.

26.-New approximations from the company reduce numerical issues when scaling these models, showing promising early results over previous approaches.

27.-The ultimate goal is "deep health": integrating all aspects of an individual's health data into comprehensive models for personalized medicine.

28.-Educational resources are available to learn more about Gaussian processes, including a summer school, tutorial, and open-source software.

29.-Recent research extends Gaussian processes to recurrent neural network architectures and introduces variational autoencoders with deep Gaussian process priors.

30.-The speaker attributes his research direction and inspiration to MacKay's influence, lamenting the loss of MacKay's ongoing presence for his family.
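Point 4 relates weight decay to a prior over the weights. The sketch below illustrates that link for the simplest possible case, a linear model with Gaussian noise, which is an assumption made here for brevity and not the setting of the talk. The data, the variances, and the function names are all hypothetical. It checks that the weight-decay (ridge) solution is a stationary point of the negative log posterior under a Gaussian prior.

```python
import numpy as np

def neg_log_posterior_grad(w, X, y, noise_var, prior_var):
    """Gradient of the negative log posterior for y = X w + noise,
    with noise ~ N(0, noise_var) and prior w ~ N(0, prior_var * I)."""
    data_term = X.T @ (X @ w - y) / noise_var  # from the Gaussian likelihood
    prior_term = w / prior_var                 # from the Gaussian prior: the weight-decay term
    return data_term + prior_term

def ridge_solution(X, y, weight_decay):
    """Closed-form minimiser of 0.5*||X w - y||^2 + 0.5*weight_decay*||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + weight_decay * np.eye(d), X.T @ y)

# Hypothetical toy data, generated only for this illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

noise_var, prior_var = 0.01, 1.0
# The weight-decay coefficient is the ratio of noise variance to prior variance.
w_map = ridge_solution(X, y, weight_decay=noise_var / prior_var)
grad_at_map = neg_log_posterior_grad(w_map, X, y, noise_var, prior_var)
```

The gradient at `w_map` vanishes up to rounding error, so in this toy setting the weight-decay estimate and the maximum a posteriori estimate coincide.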
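Points 7 and 9 describe placing a prior over functions through a covariance function and obtaining uncertainty estimates from it. A minimal sketch of Gaussian process regression follows, assuming a zero mean, a squared-exponential covariance, and fixed hyperparameters. These choices, the toy data, and the function names are illustrative assumptions and are not taken from the talk.

```python
import numpy as np

def rbf_kernel(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential covariance between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, noise_var=1e-2, variance=1.0, lengthscale=1.0):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = rbf_kernel(X_train, X_train, variance, lengthscale) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, variance, lengthscale)
    L = np.linalg.cholesky(K)                                   # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))   # K^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - np.sum(v**2, axis=0)                       # k(x, x) equals `variance` for this kernel
    return mean, var

# Hypothetical toy data: four noisy-free samples of sin(x).
X_train = np.array([[-2.0], [-1.0], [0.0], [1.0]])
y_train = np.sin(X_train).ravel()
X_test = np.array([[0.0], [5.0]])   # one input at a training point, one far from the data
mean, var = gp_posterior(X_train, y_train, X_test)
```

The predictive variance is small at the test input that coincides with a training point and returns to roughly the prior variance far from the data, which is the behavior that makes these models useful for tasks such as Bayesian optimization.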
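Point 13 notes that deep learning composes differentiable functions and optimizes them by propagating gradients through the composition. The sketch below shows this for a hypothetical two-parameter composition, f(x) = w2 * tanh(w1 * x), chosen only for illustration, and compares the chain-rule gradient with a finite-difference estimate.

```python
import math

def forward(x, w1, w2):
    """Two-stage composition f(x) = w2 * tanh(w1 * x); also returns the hidden value."""
    h = math.tanh(w1 * x)
    return w2 * h, h

def gradients(x, w1, w2):
    """Gradients of f with respect to w1 and w2, obtained with the chain rule."""
    _, h = forward(x, w1, w2)
    df_dw2 = h                      # f is linear in w2
    df_dh = w2                      # gradient flowing back into the hidden value
    dh_dw1 = (1.0 - h**2) * x       # derivative of tanh(w1 * x) with respect to w1
    return df_dh * dh_dw1, df_dw2

def finite_difference_w1(x, w1, w2, eps=1e-6):
    """Central-difference estimate of df/dw1, used only as a check."""
    f_plus, _ = forward(x, w1 + eps, w2)
    f_minus, _ = forward(x, w1 - eps, w2)
    return (f_plus - f_minus) / (2.0 * eps)

x, w1, w2 = 0.7, 1.3, -0.4
grad_w1, grad_w2 = gradients(x, w1, w2)
numeric_w1 = finite_difference_w1(x, w1, w2)
```

The analytic and numerical gradients with respect to `w1` agree closely, which is the usual sanity check for a hand-written backward pass.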
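Points 15 and 17 describe variational inference as optimizing a lower bound on the marginal likelihood. The sketch below uses a deliberately simple conjugate model, z ~ N(0, 1) and x | z ~ N(z, 1), which is an assumption chosen here because both the exact answer and the bound have closed forms. It is not a model from the talk, and the function names are hypothetical.

```python
import math

def log_marginal(x):
    """Exact log p(x) for z ~ N(0, 1), x | z ~ N(z, 1), which gives x ~ N(0, 2)."""
    return -0.5 * math.log(2.0 * math.pi * 2.0) - x**2 / 4.0

def elbo(x, m, s2):
    """Evidence lower bound for the approximation q(z) = N(m, s2), in closed form."""
    expected_log_lik = -0.5 * math.log(2.0 * math.pi) - 0.5 * ((x - m) ** 2 + s2)
    kl_q_prior = 0.5 * (s2 + m**2 - 1.0 - math.log(s2))   # KL(q || prior)
    return expected_log_lik - kl_q_prior

x = 1.5
exact = log_marginal(x)
loose = elbo(x, m=0.0, s2=1.0)      # q set to the prior: a valid but loose bound
tight = elbo(x, m=x / 2.0, s2=0.5)  # q set to the exact posterior N(x/2, 1/2)
```

With `q` equal to the prior the bound sits below the exact log marginal likelihood, and with `q` equal to the true posterior the two match. Adjusting the parameters of `q` to raise the bound is the optimization that replaces the integral.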

Knowledge Vault built by David Vivancos 2024