Knowledge Vault 2/46 - ICLR 2014-2023
Blake Richards ICLR 2018 - Invited Talk - Deep Learning with Ensembles of Neocortical Microcircuits

Concept Graph & Summary using Claude 3 Opus | ChatGPT-4 | Gemini Advanced | Llama 3:

graph LR
  classDef machinelearning fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef neuralnetworks fill:#d4f9d4, font-weight:bold, font-size:14px;
  classDef ensemblemultiplexing fill:#d4d4f9, font-weight:bold, font-size:14px;
  classDef neuroscience fill:#f9f9d4, font-weight:bold, font-size:14px;
  classDef modeling fill:#f9d4f9, font-weight:bold, font-size:14px;
  A[Blake Richards ICLR 2018] --> B[ML interprets brain, neuroscience inspires ML 1]
  A --> C[Lab: deep learning in brain dendrites 2]
  C --> D[Ensemble multiplexing, credit assignment solution 3]
  C --> E[Neural network units: multiple state levels? 4]
  E --> F[Enable dynamic routing in neural networks 5]
  A --> G[Mammalian cortex: multi-layer convolutional network analog 6]
  A --> H[True deep learning: end-to-end training 7]
  A --> I[Brain's credit assignment solution unclear 8]
  A --> J[Pyramidal neurons: complex dendrite structure 9]
  J --> K[Apical dendrites: nonlinear potentials alter spiking 10]
  J --> L[Built multi-layer network simulating pyramidal neurons 11]
  L --> M[Fixed random feedback works: 'feedback alignment' 12]
  L --> N[Network learned representations shaping layers 13]
  J --> O[Apical dendrites drive burst firing 14]
  O --> P[Ensemble multiplexing: simultaneous signaling 15]
  P --> Q[Event rate: bottom-up, burst probability: top-down 16]
  P --> R[Interneurons inhibit apical dendrites: credit assignment 17]
  P --> S[Ensemble model incorporated microcircuit 18]
  S --> T[Burst probability enables credit assignment 19]
  S --> U[Recurrent inhibition prevents vanishing gradients 20]
  S --> V[Ensemble model performs well, more robust 21]
  A --> W[Neocortical microcircuits likely use ensemble multiplexing 22]
  A --> X[Neural network units: neuron groups 23]
  X --> Y[Neuron groups: shared connectivity, properties 24]
  X --> Z[Ensemble modeling: mesoscopic and microscopic states 25]
  Z --> AA[Multiple microstates correspond to same mesostate 26]
  Z --> AB[Mesoscopic weights: function of microscopic weights 27]
  AB --> AC[Microscopic weights fixed, mesoscopic weights change 28]
  AB --> AD[Mesoscopic weights: presynaptic spiking, postsynaptic receptivity, microscopic weights 29]
  A --> AE[Neuroscience-inspired approach: context-dependent computation, routing in ANNs 30]
  class B,C machinelearning;
  class D,E,F,P,Q,R,S,T,U,V,W ensemblemultiplexing;
  class G,H,I,J,K,L,M,N,O neuralnetworks;
  class X,Y,Z,AA,AB,AC,AD,AE modeling;

Summary:

1.-Machine learning helps neuroscientists interpret brain data, while neuroscience provides structural priors and inspiration for novel machine learning approaches.

2.-The speaker's lab examined how deep learning could be implemented in the brain using the segregated dendritic compartments of pyramidal neurons.

3.-This led to the concept of ensemble multiplexing and a potential solution for how the brain does credit assignment.

4.-The work raises the question of whether neural network units should have states at multiple levels (mesoscopic and microscopic).

5.-This could enable dynamic routing in neural networks.

6.-Visual processing in mammalian cortex proceeds in stages analogous to a multi-layer convolutional neural network.

7.-True deep learning requires end-to-end training to ensure changes help achieve the learning objective.

8.-The brain's solution to credit assignment is unclear since backpropagation seems biologically unrealistic.

9.-Neocortical pyramidal neurons have a complex tree-like structure with basal and apical dendrites that receive different inputs.

10.-Apical dendrites are electrotonically distant from the soma and can generate nonlinear plateau potentials that alter somatic spiking.

11.-A multi-layer neural network was built simulating pyramidal neurons with segregated dendrites for credit assignment.
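
A minimal sketch of what such a unit with segregated dendrites might look like, in the spirit of the model the talk describes (likely Guerguiev, Lillicrap & Richards, 2017); the class and variable names below are hypothetical illustrations, not the actual implementation:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SegregatedDendriteLayer:
    # Toy hidden layer: a basal compartment drives the feedforward
    # output, while a separate apical compartment integrates top-down
    # feedback that serves as the local teaching signal.
    def __init__(self, n_in, n_hidden, n_out):
        self.W_basal = rng.normal(0, 0.1, (n_in, n_hidden))    # feedforward weights
        self.B_apical = rng.normal(0, 0.1, (n_out, n_hidden))  # feedback weights

    def forward(self, x):
        self.rate = sigmoid(x @ self.W_basal)   # somatic firing from basal input
        return self.rate

    def feedback(self, top_down):
        # Apical "plateau" potential summarizing top-down activity.
        self.plateau = sigmoid(top_down @ self.B_apical)
        return self.plateau

layer = SegregatedDendriteLayer(784, 100, 10)
x = rng.normal(size=(1, 784))
print(layer.forward(x).shape)   # (1, 100)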

12.-Fixed random feedback weights work due to the "feedback alignment" effect described by Lillicrap et al.
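
Feedback alignment is easy to demonstrate: in the backward pass, replace the transpose of the forward weights with a fixed random matrix, and the forward weights gradually align with it well enough to learn. A self-contained numpy sketch on a toy regression task (all names illustrative, not from the talk):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = np.tanh(X @ rng.normal(size=(10, 1)))   # toy regression targets

W1 = rng.normal(0, 0.1, (10, 20))   # trained forward weights
W2 = rng.normal(0, 0.1, (20, 1))
B = rng.normal(0, 0.1, (1, 20))     # fixed random feedback, never updated

lr = 0.1
for step in range(2000):
    h = np.tanh(X @ W1)             # forward pass
    err = h @ W2 - y                # output error
    dh = (err @ B) * (1 - h ** 2)   # backprop would use err @ W2.T instead
    W2 -= lr * h.T @ err / len(X)
    W1 -= lr * X.T @ dh / len(X)

print("final MSE:", float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2)))

The error still shrinks because training drives W2 toward alignment with the transpose of B, so the random feedback path ends up carrying useful gradient information.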

13.-The network learned useful representations: early layers were shaped to help later layers categorize MNIST digits.

14.-Plateau potentials in apical dendrites are good at driving burst firing, a potentially distinct signal for top-down information.

15.-Ensemble multiplexing allows simultaneous top-down and bottom-up signaling by treating both bursts and single spikes as unitary events.

16.-Event rate tracks bottom-up somatic input while burst probability tracks top-down apical input.
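
Both channels can be read off the same activity stream: count single spikes and bursts alike as unitary "events", then take the fraction of events that are bursts. A toy decoding sketch (labels hypothetical):

import numpy as np

# One ensemble over 10 time bins: 0 = silent, 1 = single spike,
# 2 = burst (a burst counts as one event, not several spikes).
activity = np.array([0, 1, 2, 1, 0, 2, 2, 1, 0, 1])

events = np.count_nonzero(activity)        # spikes and bursts together
bursts = np.count_nonzero(activity == 2)

event_rate = events / len(activity)        # bottom-up channel (somatic drive)
burst_prob = bursts / events               # top-down channel (apical drive)
print(event_rate, burst_prob)              # 0.7 0.42857...

Because burst probability is a ratio, the top-down signal can change without disturbing the event rate, which is what lets both messages travel at once.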

17.-Somatostatin-positive interneurons receive input from local pyramidal cells and inhibit those cells' apical dendrites, forming a credit assignment microcircuit.

18.-An ensemble model incorporated this microcircuit, with apical voltage being a function of top-down burst rates and recurrent inhibition.

19.-Burst probability is the sigmoid of the apical voltage, enabling credit assignment without pausing computation.

20.-Recurrent inhibition is trained to keep apical voltage near zero, preventing vanishing gradients and enabling more layers.
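
Points 18-20 combine into a simple update rule: apical voltage is top-down excitation minus recurrent (somatostatin-interneuron-like) inhibition, burst probability is its sigmoid, and the inhibitory weights are trained so the voltage sits near zero, leaving only deviations (error signals) to move it. A hedged numpy sketch with invented parameter names, not the talk's exact equations:

import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

n_hidden, n_out = 20, 5
Y = rng.normal(0, 0.5, (n_out, n_hidden))   # top-down (feedback) weights
W_inh = np.zeros((n_hidden, n_hidden))      # recurrent inhibition, learned

event_rate = rng.uniform(0.1, 0.9, n_hidden)   # local bottom-up activity
for step in range(500):
    top_down = rng.uniform(0.1, 0.9, n_out)    # upstream burst rates
    v_apical = top_down @ Y - event_rate @ W_inh   # point 18
    burst_prob = sigmoid(v_apical)                 # point 19
    # Point 20: nudge inhibition to cancel the average top-down drive,
    # keeping v_apical near 0 (burst_prob near 0.5) so the sigmoid
    # stays in its sensitive range and gradients do not vanish.
    W_inh += 0.05 * np.outer(event_rate, v_apical)

print("mean |apical voltage|:", float(np.abs(v_apical).mean()))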

21.-The ensemble model performs as well as backpropagation on MNIST and is more robust to poor weight initializations.

22.-Neocortical microcircuits likely use ensemble multiplexing and inhibition for credit assignment and preventing vanishing gradients.

23.-Units in neural networks may be better modeled as groups of neurons rather than single neurons.

24.-Biological evidence supports neuron groups derived from the same progenitor cell having shared connectivity and functional properties.

25.-Modeling units as neuron ensembles requires differentiating between mesoscopic (ensemble-level) and microscopic (individual neuron) states.

26.-Multiple microstates (specific spiking/bursting neurons) can correspond to the same mesostate (overall event/burst rates).
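
The many-to-one mapping is easy to make concrete: which particular neurons fire is the microstate, while the ensemble-level counts are the mesostate. Two different microstates collapsing to one mesostate:

import numpy as np

# Two microstates of a 6-neuron ensemble (1 = spiking this window).
micro_a = np.array([1, 1, 0, 0, 1, 0])
micro_b = np.array([0, 1, 1, 1, 0, 0])

# Both yield the same mesostate: half the ensemble is active.
print(micro_a.mean(), micro_b.mean())   # 0.5 0.5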

27.-Mesoscopic weights between units would be a function of the microscopic weights between active neurons in the ensembles.

28.-Microscopic weights can remain fixed while mesoscopic weights change based on microstates, enabling dynamic information routing.

29.-Mesoscopic weights can be modeled as a product of presynaptic spiking, postsynaptic input receptivity, and the microscopic weights.
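
A speculative sketch of points 27-29 (the product-and-average form here is one reading of the summary, not the talk's exact definition), showing how fixed microscopic weights can still yield changing mesoscopic weights, i.e., dynamic routing as in point 28:

import numpy as np

rng = np.random.default_rng(3)
W_micro = rng.normal(0, 1.0, (6, 6))   # fixed microscopic weights

def meso_weight(pre_spiking, post_receptive, W=W_micro):
    # Effective ensemble-to-ensemble coupling: only synapses whose
    # presynaptic neuron spikes AND whose postsynaptic neuron is
    # receptive contribute to the mesoscopic weight.
    mask = np.outer(pre_spiking, post_receptive)
    return float((mask * W).sum() / max(mask.sum(), 1))

w1 = meso_weight(np.array([1, 1, 1, 0, 0, 0]), np.array([0, 0, 0, 1, 1, 1]))
w2 = meso_weight(np.array([0, 0, 0, 1, 1, 1]), np.array([1, 1, 1, 0, 0, 0]))
print(w1, w2)   # same W_micro, different effective coupling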

30.-While still being developed, this neuroscience-inspired approach could provide powerful context-dependent computation and routing in artificial neural networks.

Knowledge Vault built by David Vivancos 2024