Day 5 - Classification & Clustering

Chairs: Chiara and Germaine

Speakers: Thomas Nowotny, Simon Thorpe (single spike), Giacomo Valle, Enea

Self intros:

  • Giacomo Valle, neuroprosthetics, 6ws?, Fields A, D
  • Damien, memory technologies in neurosystems, 6ws? 
  • Neuromorphic hardware is very very cool. A, B
  • Fabian LeVarra, I am very happy to be here. 
  • Moritz, member of the scientific committee, "It's OK to sometimes miss spikes.", learning and network dynamics

What are classification and clustering? (Thomas Nowotny)


Shoot away

  • sorting
  • what is similar?
  • adding labels to your clusters?
  • feature extraction
  • dimensionality reduction
  • correlate and decorrelate
  • context understanding
  • use for control, unlocking a phone
  • learning representations
  • images are not the only data available
  • language?
  • methods: deep learning, Hebbian plasticity, tracking objects, metrics, prediction, competition, temporal stability, generative models


Sorting into a framework

  • make sense of some input
  • e.g. input: images; sense: labels (classification) or commonalities (clustering)
  • different techniques: for classification there is external information, for clustering the structure is in the input itself
  • in the middle there is an algorithm
  • inspiration from the brain or inspiration from machine learning? let's look at deep learning

Deep learning for classification: Deep learning describes networks with multiple layers. Inputs, weights, and unit activations are real numbers that are processed in a structured fashion (linear summation, non-linear activation, repeated per layer). The output units are then put in correspondence with the labels. Training adjusts the weights by moving along the gradient of some loss function (the difference between the current network output and the expected output). One way of implementing this gradient descent is backpropagation (the chain rule applied to the derivatives across layers). This yields an algorithm for classification.
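Not from the talk: a minimal numpy sketch of the loop just described, with linear summation, non-linear activation, a squared-error loss, and hand-coded backpropagation via the chain rule. Data, layer sizes, and the learning rate are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 samples with 4 features; the label is a simple learnable rule.
X = rng.normal(size=(100, 4))
labels = X[:, :3].argmax(axis=1)          # 3 classes
Y = np.eye(3)[labels]                     # one-hot targets

# One hidden layer of 8 units; inputs, weights and activations are real numbers.
W1 = rng.normal(scale=0.5, size=(4, 8))
W2 = rng.normal(scale=0.5, size=(8, 3))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(2000):
    # Forward pass: linear summation, non-linear activation, repeated per layer.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)

    # Gradient of the squared-error loss w.r.t. the output.
    err = out - Y

    # Backward pass: chain rule for the derivatives over the layers.
    d_out = err * out * (1 - out)          # through the output non-linearity
    d_h = (d_out @ W2.T) * h * (1 - h)     # through the hidden non-linearity

    # Move along the negative gradient of the loss.
    W2 -= lr * h.T @ d_out / len(X)
    W1 -= lr * X.T @ d_h / len(X)

print("training accuracy:", (out.argmax(1) == labels).mean())
```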

Such a real-valued deep network can also be turned into a spiking neural network. The differences are that the units become stateful (membrane voltage dynamics) and that the spike has to be handled (the threshold and reset appear non-differentiable). The spike can be handled in different ways: surrogate gradient descent (Friedemann Zenke), taking spike times instead of membrane potentials as the differentiable quantity (Julian and Laura), or formulating the loss function as an integral over time (EventProp, implicit function theorem).
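Not from the talk: a minimal PyTorch sketch of the surrogate-gradient trick, a hard threshold in the forward pass and a smooth surrogate derivative in the backward pass (fast-sigmoid style, in the spirit of Zenke's work). All constants and the single-step LIF update are illustrative.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()              # spike if membrane potential exceeds threshold

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # The step function's true derivative is zero almost everywhere, so we
        # substitute a fast-sigmoid-style surrogate to let gradients flow.
        beta = 10.0                          # illustrative steepness
        return grad_out / (1.0 + beta * v.abs()) ** 2

spike = SurrogateSpike.apply

# One leaky-integrate-and-fire time step (constants are illustrative):
v = torch.zeros(5, requires_grad=True)      # membrane potentials (stateful units)
inp = torch.randn(5)
v_new = 0.9 * v + inp                        # leaky membrane dynamics
s = spike(v_new - 1.0)                       # threshold at 1.0
v_reset = v_new - s                          # reset by subtraction after a spike
```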

Another difference is input encoding. Assume that, in nature, the input starts as a continuous real value and that time can be discretized into N steps. How can this be represented as spikes? (A small encoding sketch follows the list below.)

  • rate: dynamic range [0, N], smooth changes
  • timing: [0, N], smooth changes
  • binary: [0, 2^N − 1], unsmooth changes (could be solved via Gray codes)
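Not from the talk: a small sketch of the three encodings for a value x in [0, 1] over N time steps, plus the Gray-code fix; all function names are made up.

```python
import numpy as np

N = 8  # number of discrete time steps

def rate_code(x):
    """Spike count proportional to x: the first round(x*N) steps carry a spike."""
    train = np.zeros(N, dtype=int)
    train[: int(round(x * N))] = 1
    return train

def time_code(x):
    """Time to first spike: larger x spikes earlier."""
    train = np.zeros(N, dtype=int)
    train[int(round((1 - x) * (N - 1)))] = 1
    return train

def binary_code(x):
    """Quantise x into 2^N levels and emit the bits as the spike train."""
    level = int(round(x * (2**N - 1)))
    return np.array([(level >> i) & 1 for i in range(N)])

def gray_code(x):
    """Same, but Gray-coded: adjacent levels differ in exactly one spike."""
    level = int(round(x * (2**N - 1)))
    g = level ^ (level >> 1)                 # binary-reflected Gray code
    return np.array([(g >> i) & 1 for i in range(N)])

# A small change in x barely changes the rate code but scrambles the binary code:
for x in (0.50, 0.51):
    print(x, rate_code(x), binary_code(x), gray_code(x))
```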

Questions:

  • can one fit classification and clustering into the predictive-processing framework?
  • should one differentiate between the algorithm and the computational substrate?
  • why use backprop for spiking neural nets if traditional continuous deep networks work so well? what is the alternative at the moment? Hypothesis: since gradients calculated this way are (mostly) approximate, it will never be as good?


Spiking neural nets are not just a poor man's version of deep learning networks (Simon Thorpe)

Arguments:

  • biology has to solve the problem of fast transfer across different brain areas: the idea of a purely feed-forward network where each neuron emits only one spike
  • timing: subjects are shown two pictures (one with an animal, one without) and have to saccade to the picture with the animal; the eyes move within 100-120 ms, even before the EEG signature of the image appears; a newer version shows a complex scene with a face pasted in, and the eyes saccade to the face within 100 ms
  • the secret: V1 projects directly to L5 for a face template
  • 3x3 matrix with either vertical or horizontal orientations: rank-order coding (1998); problematic because the number of possible orders grows very fast with the number of channels, and the number of spikes has to be constrained to avoid activating other orders
  • if you have a high number of channels (neurons), the first spikes carry a lot of information (N-of-M coding; see the counting sketch after this list)
  • STDP for clustering cars in images; also works with binary N-of-M coding
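Not from the talk: a few lines quantifying the two coding bullets above. The number of full rank orders of M channels grows factorially, while an N-of-M code (which N channels spiked first, ignoring their order) stays manageable and still carries log2(M choose N) bits.

```python
from math import comb, factorial, log2

M = 100   # number of channels (neurons)
N = 10    # number of first spikes considered

print("full rank orders of M channels:", factorial(M))      # grows factorially
print("N-of-M subsets:", comb(M, N))                         # which N fired first, unordered
print("bits carried by an N-of-M code:", log2(comb(M, N)))
```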

Neurotech platform (Elisa)

https://neurotech.ai.eu is a platform for the neuromorphic community:

  • a lot of educational material, webinars, and tutorials
  • a vision/roadmap for neuromorphic computing
  • resources like datasets and references (recommended by players in the field)
  • adjacent fields/communities like memristors or computational neuroscience
  • scholarships for attending Telluride and CNNW
  • the platform is in development and very open to feedback

    

Neuroprosthetics (Giacomo Vale)

  • setting: the closed loop between the external world and sensory processing/motor execution, where one pathway is dysfunctional (amputation, blindness)
  • aim: replace the dysfunctional pathway via technology

Example: replacing a sensory system; encode the sensory input into electrical stimulation, which then evokes neural activity, e.g. activating peripheral nerves via an electrode. Available pulse parameters: amplitude, channel, phase, timing, frequency, width.
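Not from the talk: a minimal sketch of how these pulse parameters might be bundled and driven by a sensor reading. All names, fields, units, and constants are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StimPulse:
    """One stimulation pulse; all field names and units are hypothetical."""
    channel: int          # electrode contact, selects the spatial location
    amplitude_uA: float   # higher amplitude recruits more fibres
    width_us: float       # pulse width
    frequency_hz: float   # pulse repetition rate
    phase: str            # e.g. "cathodic-first"
    onset_ms: float       # timing relative to the sensor event

def pressure_to_pulse(pressure: float) -> StimPulse:
    """Map a normalised pressure reading [0, 1] to a pulse (made-up constants)."""
    return StimPulse(channel=3, amplitude_uA=100 + 400 * pressure,
                     width_us=200, frequency_hz=50,
                     phase="cathodic-first", onset_ms=0.0)
```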

How does nerve stimulation work? A nerve consists of a large number of bundles of nerve fibres (fascicles). These can be addressed either via an electrode wrapped around the nerve or via a penetrating electrode. Requirements (for somatosensory feedback):

  • short delays (< 100 ms); an interesting fact is that direct stimulation at the finger produces an earlier response than stimulation in the brain
  • the stimulation has to be intensity-modulated, i.e. if the patient presses harder with two fingers, the perceived pressure should increase; the possibilities are rate coding (fibres fire faster, e.g. via a higher pulse frequency) or population coding (more fibres are recruited via a higher pulse amplitude); such a mapping can be tested via psychophysical experiments, which find that perception depends linearly on amplitude, in contrast to a log-dependence on frequency (see the sketch below); open questions concern the variability and stability of this mapping
  • modality of the sensory input: the sensors are often pressure sensors; a cool idea is that this could give patients supra-natural resolution; still missing are proprioception (feeling one's own muscular activation) and temperature (because so far no temperature sensation could be elicited via electrical stimulation)
  • space: the stimulation has to represent exactly the location of the sensory input, e.g. the index finger (which can be selected via the channel)

So far this map has to be explored via measurement. 
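Not from the talk: a toy version of the psychophysical mapping described above, with made-up coefficients (a, b, f0 are hypothetical and would be fit per subject and electrode). Perceived intensity grows linearly with pulse amplitude but only logarithmically with pulse frequency.

```python
import numpy as np

def perceived_intensity(amplitude_uA, frequency_hz, a=0.01, b=1.5, f0=10.0):
    """Hypothetical psychometric map: linear in amplitude, logarithmic in frequency."""
    return a * amplitude_uA + b * np.log(frequency_hz / f0)

# Doubling the amplitude doubles its contribution, while doubling the
# frequency only adds a constant b*log(2):
print(perceived_intensity(200, 50), perceived_intensity(400, 50))
print(perceived_intensity(200, 50), perceived_intensity(200, 100))
```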

Quantifying [human] behaviour (Enea)

A very important part of life is the use of technology: "digital behaviour" encompasses many parts of life and a good part of our day.

How to quantify "digital behaviour": a minimal quantification is the set of timestamps at which the smartphone is touched; this event-based quantification spans very different time scales (typing, waiting for a reply, sleeping).

- how to analyse such event-based smartphone-touch data: for example, look at the inter-tap intervals and their distribution (a power law: a lot of short intervals and very few long ones); tapping speed is correlated with reaction times, which is useful because tapping speed can be monitored all the time, in contrast to dedicated reaction experiments (a minimal analysis sketch follows below)

- extract features that could be used to predict, for example, age or even brain states

- seeing a single user as a neuron that emits a spike for each tap: what about networks of these smartphone users?
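A minimal analysis sketch (not from the talk): compute the inter-tap interval distribution on logarithmic bins, where a power law shows up as a straight line in log-log space. The intervals here are a synthetic heavy-tailed stand-in, not real tap data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for inter-tap intervals (seconds): heavy-tailed, like the data.
itis = rng.pareto(1.5, size=10_000) + 0.05

# Histogram on logarithmic bins: a power law is a straight line in log-log space.
bins = np.logspace(np.log10(itis.min()), np.log10(itis.max()), 30)
density, edges = np.histogram(itis, bins=bins, density=True)

for lo, d in zip(edges[:-1], density):
    print(f">= {lo:8.3f} s : {d:.5f}")
```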

 

