Day 4 - Florian Engert, Georg Keller, Valerio Mante - Fish O-turn escape system and mouse escape pilotage, sensory motor prediction in cortex, and cortical dynamics during and after decisions

 Day 4 of CCNW2022



After a rainy night and waking to wet, gray skies, we started off with Melika telling us about the important meetings in the afternoon on progress on the four themes that participants had expressed interest in.

More self intros

  • Emre Neftci - Aachen - 6 words "online in situ learning needs your help" B and C
  • Enea Ceolini - U Leiden - "digital behavior reveals cognitive behavior" B and C
  • Valerio Mante - INI - "computation through dynamics" B and C

Florian: Fish O-turn escape system and mouse escape pilotage

Florian started off by stating the aim of the morning: to consider innate goals and when behavior is goal-directed versus reflexive. First inborn goals, then learned goals.

To get discussion going, Florian proposed the "thermostat" as a basic system that all animals care about. It does not matter whether the system is artificial, natural, or evolved, but it needs to work well and be implemented robustly enough to deal with delays and with bang-bang control of heating or cooling. The discussion then degenerated a bit into arguments about the definition of goal-directed behavior...


The seemingly simple example of a thermostat immediately brings up interesting control problems: it must use bang-bang control (of the heater), the effect of its control arrives with a huge delay, and that effect depends on external influences like the outside temperature. Also, a simple thermostat, like a robot vacuum cleaner, can switch goals (daytime vs. nighttime temperature, or cleaning vs. recharging) and must plan ahead to meet them. The same is true for biological homeostatic mechanisms that use bang-bang (not proportional) control, such as gene activation, whose effects also have huge delays.
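As a toy illustration of why even a thermostat is not trivial, here is a minimal sketch of bang-bang control with hysteresis, a delayed plant, and switching between day and night set points. This is not from the talk; all the constants are made up.

```python
import random
from collections import deque

# Toy bang-bang thermostat with hysteresis and a delayed plant.
# All constants (set points, delay, gains) are invented for illustration.
def simulate(hours=24, dt=0.1):
    setpoint_day, setpoint_night = 21.0, 17.0   # two goals the controller switches between
    hysteresis = 0.5                            # deadband to avoid rapid on/off chattering
    delay_steps = int(1.0 / dt)                 # heater output reaches the room ~1 h late
    pipeline = deque([0.0] * delay_steps)       # heat "in flight" towards the room
    temp, heater_on = 15.0, False
    for step in range(int(hours / dt)):
        hour = (step * dt) % 24
        setpoint = setpoint_day if 7 <= hour < 23 else setpoint_night
        # Bang-bang rule: the heater is either fully on or fully off.
        if temp < setpoint - hysteresis:
            heater_on = True
        elif temp > setpoint + hysteresis:
            heater_on = False
        pipeline.append(3.0 if heater_on else 0.0)   # heat injected now...
        delayed_heat = pipeline.popleft()            # ...only arrives after the delay
        outside = 5.0 + random.uniform(-1.0, 1.0)    # external disturbance
        temp += dt * (delayed_heat - 0.2 * (temp - outside))
    return temp

print(f"temperature after a day: {simulate():.1f} C")
```

The delay means the controller's actions only pay off later, which is exactly why a better controller needs some model of how the room will respond.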

The aim of this long discussion, apparently about semantics, was to point out that feedback control is critical for understanding neural processing. Any non-trivial controller (i.e. not just a PID) needs to build a model of the world so that it can infer or predict the consequences of its actions.

The key properties discussed were:

  1. having a set point or goal
  2. being able to switch behaviors or goals

The following systems all share these properties:

  1. thermostat
  2. Roomba
  3. fish
Larval zebrafish are translucent baby fish in which current technology allows recording from nearly all the neurons in the brain. However, raw recordings are nearly useless without a framework for finding correlates of goals and for identifying the circuits that might implement them.
In particular, larval zebrafish have a super quick escape reflex: within 5 ms they can execute a 180-degree turn at up to 20k deg/s. One predator is the dragonfly larva, whose pincers can strike, also in 5 ms. Zebrafish can sense the vibration and escape. Florian drew the baby fish with its eyes, its hair cells, and the giant Mauthner cells, which are among the few surviving examples of "command neurons". If only ONE of them fires a single spike, the fish makes an "O-turn" and swims away, all within 5 ms. The inner-ear hair cells are connected by a single synapse to the Mauthner cell. Mauthner cells fire exactly one, but very complex, spike that looks like a climbing-fiber spike in the cerebellum.




The fish is covered all around its body with hair cells that it uses to feel pressure in the surrounding water.

Given a nondirectional sound (e.g. tapping on the dish with an electromagnet or just a finger), the hair cells on both sides fire, and both sides make excitatory synapses onto the Mauthner command cells. But the two Mauthner cells inhibit each other, so that both cannot fire at once. A nondirectional sound therefore randomly triggers a left or right O-turn and escape.
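A cartoon of this cross-inhibition circuit, purely my own sketch with invented numbers (the mapping from the winning cell to the turn direction is left abstract):

```python
import random

def escape_direction(left_input, right_input, noise=0.1):
    """Toy Mauthner-cell race: both cells receive hair-cell excitation,
    but the first to reach threshold fires a single spike and silences
    the other via cross-inhibition, giving an all-or-none choice."""
    left = left_input + random.gauss(0, noise)    # drive to one Mauthner cell
    right = right_input + random.gauss(0, noise)  # drive to the other
    threshold = 0.5
    if left < threshold and right < threshold:
        return None                               # too weak: no escape triggered
    return "left escape" if left > right else "right escape"

# A nondirectional tap drives both sides equally, so the choice is ~50/50.
choices = [escape_direction(1.0, 1.0) for _ in range(1000)]
print(choices.count("left escape") / len(choices))
```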

Crashing into an object at this high speed is bad, because it can damage the fish. And indeed, a hard barrier such as a rock changes things: the fish now only turns away from the barrier. In darkness the fish still crashes, so vision must somehow be used to avoid the object. How could this work?

One last thing they tried was to figure out whether the visual input to the Mauthner cells is inhibitory or excitatory. By ablating one of the Mauthner cells (there are actually three Mauthner homologs on each side), they could see that the 50/50 random choice shifted to 66/33 biased turns. So the visual input is excitatory to the cell on the opposite side, pre-biasing the escape away from the barrier.

The visual input encodes the size and distance of the barrier (somehow). Baby fish turn away from each other, while 14-21 day old fish start clumping together for protection; the switch happens around 14 days. The fish appear to compute the amount of conspecific clutter seen by each eye: young ones turn away from the eye that sees more, older ones turn towards it. They integrate the dark spots vertically and average horizontally. Size only works vertically, since fish are long but not tall, so they seem to measure the vertical extent of the dark stripes produced by conspecifics. Older, schooling fish turn towards the clutter, but only as long as the spots on one side do not get too big.
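One way to read "integrate vertically, average horizontally" as an algorithm is sketched below. This is purely my interpretation, with an invented binary retinal-image format and the 14-day switch taken from the description above.

```python
import numpy as np

def clutter(eye_image):
    """eye_image: 2D array, 1 = dark pixel (conspecific), 0 = background.
    Sum darkness vertically in each column, then average across columns,
    so the measure tracks the vertical extent of dark stripes."""
    vertical_extent = eye_image.sum(axis=0)   # integrate vertically (per column)
    return vertical_extent.mean()             # average horizontally

def turn_direction(left_eye, right_eye, age_days):
    more = "left" if clutter(left_eye) > clutter(right_eye) else "right"
    less = "right" if more == "left" else "left"
    # Young fish avoid conspecifics; after ~14 days they approach them.
    return less if age_days < 14 else more

# Toy images: the left eye sees two tall dark stripes, the right eye sees none.
left = np.zeros((20, 30)); left[5:15, 10] = 1; left[5:15, 20] = 1
right = np.zeros((20, 30))
print(turn_direction(left, right, age_days=7))    # young: turn away (right)
print(turn_direction(left, right, age_days=21))   # older: turn towards (left)
```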

This very interesting description of baby-fish behavior now switched to mice and Tiago Branco's story of an arena with holes in the floor, covered by lids inset by 1 cm that can be robotically opened or closed. One of the holes is open; the mouse can go in, explore, and get food. If you present a looming stimulus, the mouse runs for shelter. It must run to the open hole without seeing it, because the lid is inset, so it must keep track of where the open hole is while foraging and exploring.



Florian believes this is implemented as in the ant, with a path-integration system based on a similar ring attractor of neurons representing heading and integrating distance traveled. The stored vector acts like a rubber band: a scary looming stimulus (like a hawk predator) releases a motor program that pulls the mouse straight back to its hole.
The hiding behavior is triggered by one or more of about a hundred retinal ganglion cells (RGCs) in the lower half of each eye that are specialized to detect looming, e.g. from a raptor or other predator.
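A minimal sketch of the "rubber band" idea as plain path integration: keep a running home vector to the open hole while foraging, and run it off when the looming stimulus arrives. This is my own toy version; the real system presumably uses a ring-attractor heading signal rather than explicit Cartesian coordinates.

```python
import math

class PathIntegrator:
    """Accumulate displacement from the shelter while foraging, so the
    vector back to the shelter is always available without vision."""
    def __init__(self):
        self.x = 0.0   # displacement from the open hole (shelter), in cm
        self.y = 0.0

    def step(self, heading_deg, distance_cm):
        # Integrate heading (ring-attractor-like signal) and distance moved.
        self.x += distance_cm * math.cos(math.radians(heading_deg))
        self.y += distance_cm * math.sin(math.radians(heading_deg))

    def escape_vector(self):
        # On a looming stimulus, run straight back along minus the home vector.
        return (-self.x, -self.y), math.hypot(self.x, self.y)

mouse = PathIntegrator()
for heading, dist in [(0, 30), (90, 20), (210, 15)]:   # wandering around the arena
    mouse.step(heading, dist)
direction, dist = mouse.escape_vector()
print(f"run {dist:.0f} cm along {direction} to reach the hidden open hole")
```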

If four water spouts are added and only one of them is active, the mice can learn that a buzz means the water spout will be active for 5 seconds. Now the released behavior is much more leisurely: the mouse orients to the active spout when the buzz comes. So it is the same system, but used voluntarily. Markus Meister's lab has generated mice without cortex or hippocampus, and these mice still perform these behaviors.

Paul XXX from Berlin pointed out that you can take the mouse out of the arena for some time and then put it back, and it still remembers where the escape hole is, so this cannot be just simple path integration by a ring attractor. After all, mice have a complex place- and grid-cell system, not just a simpler ant-like path-integrating ring attractor.

There are two ways to store such information: in an attractor state, or in weight changes.

After the break, we were promised by Georg Keller an exploration of the ungodly mess that is cortex.

Georg: Sensory motor prediction in cortex

Georg promised to take the feedback notions of the early morning and use them to explain cortex. We can reproduce vocal sounds and visually observed movements. How can we imitate these spatiotemporal patterns when they are observed only once or a few times?

It turns out that learning the mapping from motor to sensory is much easier than finding the mapping from sensory to motor, because you can learn from the mismatch between what the world does and what you expect it to do in order to improve your model, as illustrated in the sketch to the right.




The Jordan and Rumelhart paper (1992) showed how these mappings can be learned using this efference-copy mismatch trick. The map from motor to sensory is called the "forward model", and the mapping from sensory to motor is the "inverse model". The copy of the motor output sent to sensory areas is the corollary discharge.
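To make the efference-copy trick concrete, here is a minimal sketch of learning a forward model from sensory prediction errors. It is inspired by, but not taken from, the Jordan and Rumelhart setup; the linear "world", sizes, and learning rate are all invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown "world": motor commands m produce sensory consequences s = W_true @ m.
W_true = rng.normal(size=(3, 2))

# Forward model: predicts sensory input from an efference copy of the motor command.
W_model = np.zeros((3, 2))
lr = 0.1

for _ in range(2000):
    m = rng.normal(size=2)              # motor command (efference copy available)
    s = W_true @ m                      # what the world actually does
    s_hat = W_model @ m                 # what the forward model expects
    error = s - s_hat                   # sensory prediction error (mismatch signal)
    W_model += lr * np.outer(error, m)  # delta rule driven by the mismatch

print("remaining model error:", np.abs(W_true - W_model).max())
```

The point is that the forward direction has an observable teacher (the actual sensory consequence), whereas the inverse direction does not; once the forward model is learned, sensory mismatches can be turned into motor errors.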

We can look in sensory cortical areas and see whether we find evidence for these mappings. The answer is yes, they are all over. Until recently, most people have assumed that XXX.

The original view goes back to Helmholtz's "sense of innervation", observed in paralyzed-eye-muscle experiments in which subjects report that the world moves when they try to move their eyes.

Now, getting to how cortex works... The idea is that we have a comparator and an integrator, as sketched to the right.



The comparator gives the error and the integrator sums it over time. Cortex uses L2/3 as the comparator: there are cells there that subtract top-down from bottom-up input. Because the signals are fully rectified, the subtraction that computes the prediction error requires inhibitory neurons to flip the sign, and there are analogs of the ON and OFF systems that represent the two signs of the signal. The observed net response of L2/3 to a drifting-grating visual input is massive inhibition, because the L2/3 neurons drive large populations of inhibitory neurons. To the animal, such input would normally come from a turn. But with a looming stimulus or naturally generated flow during walking there is excitation, so the story is not entirely clear...

They discovered that the PLKD1 axon-guidance molecule labels 50% of the cells in cortex and does not overlap with other markers. There are also cells that signal too little input and cells that signal too much input relative to the prediction (XXX and XXX), i.e. the negative and positive prediction errors.

The integrator is an L5 neuron that integrates the positive and negative prediction errors over time.

For this system to work, the L5 integrator needs to act like an attractor that can hold state.
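Here is a toy version of the comparator/integrator picture as I understood it (my own sketch, not Georg's model): rectified L2/3-like units compute positive and negative prediction errors, and a leaky L5-like unit integrates them over time.

```python
import numpy as np

def prediction_errors(bottom_up, top_down):
    """Rectified comparator units, analogous to ON/OFF channels: one population
    signals 'more input than predicted' (positive PE), the other 'less input
    than predicted' (negative PE)."""
    positive_pe = np.maximum(bottom_up - top_down, 0.0)
    negative_pe = np.maximum(top_down - bottom_up, 0.0)
    return positive_pe, negative_pe

def integrate(pe_pairs, leak=0.05):
    """Leaky L5-like integrator summing signed prediction errors over time."""
    state = 0.0
    for pos, neg in pe_pairs:
        state += (pos - neg) - leak * state
    return state

# Visual flow larger than predicted from self-motion -> net positive error accumulates.
flow = np.array([1.0, 1.2, 1.1, 0.9])
prediction = np.array([0.5, 0.5, 0.5, 0.5])
pes = [prediction_errors(b, t) for b, t in zip(flow, prediction)]
print("integrated mismatch:", integrate(pes))
```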

Georg concluded that this is their working model of how cortex learns these sensory and motor mappings during development.

Valerio: Cortical dynamics during and after decisions

Finally, Valerio Mante took over and spoke about goal-directed behavior in primates, where it is often associated with frontal cortex. There are interesting case studies where losing frontal cortex leaves patients living only in the short term.
One patient who lost frontal cortex would start shaving whenever he found himself in a bathroom and would undress whenever he entered a bedroom; in general, he could not plan appropriate actions.
In general, the PFC (prefrontal cortex, the front of the brain) supports working memory, abstraction, inhibitory control, and planning, and it provides flexibility by learning and incorporating context.

There is interesting data from Harry Harlow reported in the book on the PFC by Passingham and Wise.

There are "problems", each one ending after trial 5. In each problem, the objects 1 and 2 are new ones. But in each problem only one of the objects, e.g. A is rewarded. The first trial can say which one is rewarded.
  1. A B
  2. B A
  3. A B
  4. A B
  5. B A 

where only object A is rewarded, whichever side it appears on. The finding is that, once the animal has experienced many such problems, the accuracy of choice from trial 2 onward rapidly converges to 100% in humans, rats, cats, etc. There are very clear differences in how quickly species with different-sized PFCs learn to generalize over these problems.
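A strategy that produces exactly this pattern, chance on trial 1 and near-perfect from trial 2 onward, is win-stay/lose-shift applied to objects rather than positions. A toy simulation (my own, with made-up numbers of problems):

```python
import random

def run_problem(n_trials=5):
    rewarded = random.choice(["A", "B"])   # one of the two new objects is rewarded
    correct = []
    choice = random.choice(["A", "B"])     # trial 1: no information, so guess
    for trial in range(n_trials):
        won = (choice == rewarded)
        correct.append(won)
        # Win-stay / lose-shift: keep the object if rewarded, otherwise switch.
        choice = choice if won else ("B" if choice == "A" else "A")
    return correct

n_problems = 10000
accuracy = [0.0] * 5
for _ in range(n_problems):
    for t, won in enumerate(run_problem()):
        accuracy[t] += won / n_problems

print([round(a, 2) for a in accuracy])   # ~[0.5, 1.0, 1.0, 1.0, 1.0]
```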

What circuit could do this?












Machens et al. (a favorite of Valerio's) did an experiment with vibrations of higher or lower frequency applied to the finger sequentially in time, as illustrated here.



The monkey had to report whether the first frequency was higher than the second or vice versa, i.e. the A1 or A2 scenario.

The task requires three phases: a "loading" phase, a "memory" phase, and a "comparison" phase. The proposed network, illustrated on the right of the figure above, solves this.

During the loading phase the network forms an attractor that encodes the first frequency; this turns into a line attractor that holds it through the memory phase, and finally into a saddle point during the comparison phase, which pushes the state towards one of the two choices.

Nowadays, the same can be done with a simple vanilla RNN trained with BPTT, as illustrated below, which implements the various attractor dynamics in different parts of its state space.
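As a sketch of what "vanilla RNN trained with BPTT" means for this kind of task, here is a toy PyTorch version of the two-frequency comparison; the timing, sizes, and loss are my own choices, not those used in the work Valerio showed.

```python
import torch

# Toy two-frequency comparison task: f1 is presented, then a delay, then f2;
# the network must output +1 if f1 > f2 and -1 otherwise.
def make_batch(batch=64, T=30):
    f1 = torch.rand(batch) * 2 - 1
    f2 = torch.rand(batch) * 2 - 1
    x = torch.zeros(T, batch, 1)
    x[0:5, :, 0] = f1            # loading phase: f1 on the input
    x[25:30, :, 0] = f2          # comparison phase: f2 on the input
    y = torch.sign(f1 - f2)      # desired decision
    return x, y

rnn = torch.nn.RNN(input_size=1, hidden_size=64, nonlinearity='tanh')
readout = torch.nn.Linear(64, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

for step in range(2000):          # BPTT: gradients flow back through all time steps
    x, y = make_batch()
    h, _ = rnn(x)                 # h: (T, batch, hidden)
    out = readout(h[-1]).squeeze(-1)
    loss = torch.mean((out - y) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final loss:", loss.item())
```

During the silent delay (steps 5 to 25) the input is zero, so the trained network must hold f1 in its recurrent state, which is where line-attractor-like dynamics tend to appear.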



This experiment clearly shows that there is a representation of the plan for the action, and a separate memory, with a different encoding, of the choice just taken. Even though both relate to a saccade in the same direction, they have different activity patterns and different decoders, so the plan is NOT the same as the memory of it.

If we look at the trajectories in state space



and at single trials, we can see how individual trials differ from the average: there is a very interesting DIVERGENCE over time away from the average, representing a memory of the choice, or the decision, evolving towards a stored action. That is, it has the hallmarks of an attractor with saddle-point dynamics. The time constants are rather large, e.g. 500 ms.
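The divergence away from the average is what a saddle point produces: trajectories are pulled together along the stable direction and pushed apart along the unstable one. A toy two-dimensional example, using the roughly 500 ms time constant mentioned above for the unstable direction (everything else is invented):

```python
import numpy as np

# Linear saddle dynamics: dx/dt = A x, contracting along one axis,
# expanding along the other with a ~500 ms time constant.
tau_stable, tau_unstable = 0.1, 0.5      # seconds
A = np.array([[-1.0 / tau_stable, 0.0],
              [0.0, 1.0 / tau_unstable]])

def simulate(x0, T=1.0, dt=0.001):
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * (A @ x)             # Euler integration
    return x

# Two trials that start almost identically, but on opposite sides of the
# stable manifold, end up far apart along the unstable (choice) direction.
print(simulate([1.0, +0.01]))
print(simulate([1.0, -0.01]))
```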




What is the mechanism? There was some discussion about how this is not like a typical RNN but rather shows some form of dynamic stability. Not clear.

After this long and intense session on brain mechanisms of goal seeking, we broke for lunch and the afternoon sessions on workgroup progress and discussion.

The evening concluded with a lively round of Magic Castle "Magician's Poker", where all cheating is allowed as long as you are not caught by someone else with money in the pot.

























