Day 2 - Yves Frégnac and Michael Berry - Natural computation

 After some early morning swims and breakfast under sunny and calm morning skies (but with the threat of rain looming for the next two days), we started off with some more self-intros:

  • Florian "Fish feelings lovingly explained" but modified from 2018
  • Michael Schmuker "neuromorphic olfaction event based sensing XXX"
  • Georg Keller "XXX"
  • Yves Frégnac "Brains without synapses"
(The six words are seriously mis-transcribed above)

Florian Engert provocatively asserted that nearly all so-called "learning" is innate, specified by the genes.

Then Yves started. His job, he said, was to focus on the relevance of biological computation to artificial computing. In 1956 there was a famous meeting where Minsky asserted that "every aspect of thinking can be simulated by computers" and that a summer undergrad research project would be enough to conquer computer vision.

XXX claimed that simple neurons that sum up their inputs and threshold the resulting sum are sufficient to model the brain. Marr and Poggio pointed out the weakness of this approach: it cannot explain the graded computation in neurons, or the complex graded logical operations that can occur within single neurons through their nonlinear dendrites and shunting inputs at various places along the apical and distal dendritic branches. These observations started a lively series of comments from the audience about the utility of this view of biological neurons, whether AI is missing it, and if so, in what form.
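To make the contrast concrete, here is a minimal toy sketch (mine, not from the talk): a point neuron that only thresholds a weighted sum, next to a unit whose dendritic branches each apply their own saturating nonlinearity before the soma thresholds their combined output.

```python
import numpy as np

def point_neuron(x, w, theta=1.0):
    # Classic sum-and-threshold unit: fire (1.0) if the weighted sum exceeds theta.
    return float(np.dot(w, x) > theta)

def dendritic_neuron(branch_inputs, branch_weights, theta=1.0):
    # Toy two-stage unit: each dendritic branch computes a local weighted sum
    # passed through a saturating nonlinearity, and the soma thresholds the sum
    # of the branch outputs. Such a unit can compute graded, AND-of-ORs style
    # functions that a single sum-and-threshold unit cannot.
    branch_out = [np.tanh(np.dot(w, x)) for w, x in zip(branch_weights, branch_inputs)]
    return float(np.sum(branch_out) > theta)

# Example: a strong drive confined to one branch fires the point neuron but not
# the dendritic unit, which needs both branches active (an AND of branch ORs).
one_branch   = [np.array([1.0, 1.0]), np.array([0.0, 0.0])]
two_branches = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
w = [np.array([1.5, 1.5]), np.array([1.5, 1.5])]
print(point_neuron(np.concatenate(one_branch), np.concatenate(w), theta=1.5))   # 1.0
print(dendritic_neuron(one_branch, w, theta=1.5))                               # 0.0
print(dendritic_neuron(two_branches, w, theta=1.5))                             # 1.0
```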

Yves then provided the classic example of simple and complex visual receptive fields, with the simple cell more like a point neuron with a ReLU activation and the complex cell more like a max-pooling unit that ORs independent subunits. But he further asserted that their careful measurements with noise input have shown convincingly that even simple cells show complex-cell behavior, with nonlinear summation of subunits.
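As a rough reference for the distinction (textbook models in my own toy code, not Yves's data): a simple cell as a rectified linear filter of the image, and a complex cell as a max over the rectified outputs of several phase-shifted subunits.

```python
import numpy as np

def simple_cell(image, rf):
    # Simple cell as a point neuron: linear receptive field followed by a ReLU.
    return max(0.0, float(np.sum(image * rf)))

def complex_cell(image, subunit_rfs):
    # Complex cell as max pooling ("OR-ing") over phase-shifted simple-cell
    # subunits, which makes the response invariant to the exact grating phase.
    return max(simple_cell(image, rf) for rf in subunit_rfs)

# Example with 1-D "images": a windowed sinusoid filter and a phase-shifted copy.
x = np.linspace(0, 2 * np.pi, 64)
rf0, rf90 = np.sin(x) * np.hanning(64), np.cos(x) * np.hanning(64)
stimulus = np.sin(x + 1.3)                      # grating at an arbitrary phase
print("simple :", round(simple_cell(stimulus, rf0), 2))
print("complex:", round(complex_cell(stimulus, [rf0, -rf0, rf90, -rf90]), 2))
```

The energy-model variant sums the squares of a quadrature pair instead of taking the max; the point of the talk, as I heard it, was that real "simple" cells already show this kind of nonlinear subunit pooling.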

Yves went on to point out that LFP and BOLD measurements, which reflect glial activity, i.e. the power consumption of the brain, are strongly correlated with cell response properties like orientation selectivity. Such observations show strongly correlated activity in columns of neurons, but tell us nothing about the individual neurons or their computations and circuits.

He finished by pointing out that Shannon's view of additive channel noise is too simplistic to explain apparently noisy brain activity. For example, if neurons are stimulated with frozen noise input, they fire very reliably at the peaks of that input, as opposed to the unpredictable timing of spikes in response to an unnatural step input, as illustrated in this sketch. Therefore noise in the brain is largely just the appearance of how it computes, not necessarily corruption of the desired signal.
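A hedged toy simulation of that claim (my own sketch in the spirit of the Mainen-and-Sejnowski-style experiment, not something shown in the talk): a leaky integrate-and-fire neuron driven repeatedly by the same frozen-noise current locks its spikes to the input peaks, while the same neuron driven by a featureless constant step accumulates timing jitter from its private noise.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, tau, v_th = 1e-4, 0.02, 1.0            # 0.1 ms step, 20 ms membrane, unit threshold
n = int(1.0 / dt)                           # 1 s trials

# Frozen noise: the *same* fluctuating current on every trial (low-pass filtered
# over ~5 ms so it has distinct peaks), versus a featureless constant step.
white = rng.standard_normal(n)
frozen = 1.0 + 4.0 * np.convolve(white, np.ones(50) / np.sqrt(50), mode="same")
step = 2.0 * np.ones(n)

def lif_spike_times(i_ext, intrinsic_sd=1.0):
    # Leaky integrate-and-fire neuron with a little private noise on each trial.
    v, times = 0.0, []
    private = intrinsic_sd * rng.standard_normal(len(i_ext))
    for t in range(len(i_ext)):
        v += (dt / tau) * (-v + i_ext[t] + private[t])
        if v >= v_th:
            times.append(t * dt)
            v = 0.0
    return times

# Trial-to-trial jitter of a late spike: expected to be much smaller for the
# frozen noise (spikes lock to input peaks) than for the step (timing errors
# accumulate across successive spikes).
for label, i_ext in [("frozen noise", frozen), ("constant step", step)]:
    trials = [lif_spike_times(i_ext) for _ in range(20)]
    k = min(len(t) for t in trials) - 1
    jitter_ms = 1e3 * float(np.std([t[k] for t in trials]))
    print(label, "- jitter of spike", k, "across trials:", round(jitter_ms, 2), "ms")
```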






Michael Berry then took over to talk about layered computation in cortex and predictive coding. (It was pointed out that CO2 rose from 500 ppm to 1500 ppm in the room during this first session, so the doors should be opened whenever the lawnmower is not running.)

  • 1965 Barlow: center-surround receptive fields increase coding efficiency.
  • 1982 Srinivasan, Laughlin and Dubs publish the first paper on predictive coding. Srinivasan was also very active in applying this view of computation to insect brains.
  • 1999 Rao and Dana Ballard apply these ideas to cortex and show that, e.g., V2 could learn to predict V1 (see the toy sketch after this list).
  • There was later work from Karl Friston and others.
  • 2017 Heeger proposes another model with the predictive coding inverted: the prediction goes up, the error goes down.
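Since "predictive coding" gets used loosely, here is a minimal toy of the Rao-and-Ballard-style loop as I understand it (my own sketch, not their exact formulation): a higher area carries a latent code z, sends the prediction W·z down, and only the prediction error travels back up, updating both the code (fast) and the weights (slow).

```python
import numpy as np

rng = np.random.default_rng(1)

def predictive_coding_step(x, z, W, lr_z=0.1, lr_W=0.05):
    # One settle-and-learn step of a two-area hierarchy: the higher area
    # predicts the lower area's activity as W @ z, only the prediction
    # error e is fed back, and both the latent state z (fast) and the
    # generative weights W (slow) move to reduce that error.
    e = x - W @ z
    z = z + lr_z * (W.T @ e)
    W = W + lr_W * np.outer(e, z)
    return z, W, e

# Toy usage: 16-dimensional "V1" activity explained by a 4-dimensional "V2" code.
x = rng.standard_normal(16)
x /= np.linalg.norm(x)
z, W = np.zeros(4), 0.3 * rng.standard_normal((16, 4))
errs = []
for _ in range(500):
    z, W, e = predictive_coding_step(x, z, W)
    errs.append(float(np.linalg.norm(e)))
print("prediction error, first vs last step:", round(errs[0], 3), "->", round(errs[-1], 3))
```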
What's wrong with this classic version of predictive coding? The aim is to reduce redundancy, but:

  1. we need some redundancy because of noise;
  2. overlapping codes are observed, e.g. the many RGC types in the retina that present multiple views of the visual input to the LGN;
  3. unsupervised learning requires correlation, but simple predictive coding removes it.

In 2001 Barlow wrote a paper called "Redundancy reduction revisited" that made all these points.

Michael then presented his interesting model of cortical layering in this context. He sketched the circuit below.


In this circuit, L4 learns spatial clusters of features. L2/3 learns temporal sequences, clumping over time; the time scale is a few response times. L6 learns to predict firing in L4 using STDP: something makes an L6 unit fire, and if it happens to fire before L4, it strengthens its synapse and starts to predict the L4 spikes.
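A toy version of that STDP step as I understood it (my sketch; the notes don't pin down exactly which synapse is modified): pair-based STDP strengthens the connection when the L6 spike leads the L4 spike within a few tens of milliseconds and weakens it when it lags, so an L6 unit that happens to lead ends up predicting the L4 spikes.

```python
import numpy as np

def stdp_weight_change(pre_times, post_times, a_plus=0.01, a_minus=0.012, tau=0.020):
    # Pair-based STDP: every (pre, post) spike pair contributes
    #   +a_plus  * exp(-dt/tau) if the pre spike leads the post spike (dt > 0),
    #   -a_minus * exp(+dt/tau) if it lags the post spike (dt < 0).
    dw = 0.0
    for t_pre in pre_times:
        for t_post in post_times:
            dt = t_post - t_pre
            if dt > 0:
                dw += a_plus * np.exp(-dt / tau)
            elif dt < 0:
                dw -= a_minus * np.exp(dt / tau)
    return dw

# If the L6 unit tends to fire ~10 ms before the L4 unit, the synapse is
# potentiated; if it fires after, the synapse is depressed.
l6_spikes = [0.100, 0.300, 0.500]            # seconds
l4_spikes = [0.110, 0.310, 0.510]
print("L6 leading L4 ->", round(stdp_weight_change(l6_spikes, l4_spikes), 4))
print("L6 lagging L4 ->", round(stdp_weight_change([t + 0.02 for t in l4_spikes], l4_spikes), 4))
```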

Why do this?

This is exactly the opposite of usual predictive coding, which tries to suppress redundancy:

  1. Anticipatory firing is useful.
  2. Prediction encodes context, so it can resolve ambiguity.
  3. Prediction adds "certainty": it encodes the reliability of the input, so reliable inputs can be amplified in a resonant loop.
  4. It biases L4 towards predictable features. Before prediction, L4 just learns the statistics of the LGN input, but the predictive input can bias the features towards temporally predictable ones. There is a paper from Bialek and Tishby (2004) on this: observing a window of time T, they show that the predicted information Ipred grows as log T while the entropy H grows as T (see the note just below).
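For reference, my reconstruction of that scaling statement (hedged, since the citation was garbled in my notes): writing S(T) for the entropy of a stimulus or spike-train window of duration T,

```latex
S(T) \;\approx\; s_0\,T \;+\; I_{\mathrm{pred}}(T),
\qquad
I_{\mathrm{pred}}(T) \;\sim\; \log T \quad \text{(subextensive part)} .
```

That is, the bulk of the entropy grows linearly with the window, but the part that is predictive of the future grows only logarithmically, which is why a predictive code is so much cheaper than a full code.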
Mike concluded by describing a series of experiments in which they showed noise movie clips thousands of times to a mouse and compared responses to clips shown in frame order versus random order. The very interesting result was that the activity of neurons was suppressed for the trained order compared to the untrained order. The reduction is only about 10%, but the sparsification is very selective: only the neurons with lower activity are suppressed, leaving the strongly firing neurons.
The model that explains this behavior was sketched like this:

There was then a lively break for coffee and chatting, under increasingly gray skies.







Following the usual tasty lunch, all the workgroup leaders met and scheduled their first meetings.






