We review and apply a computational theory based on the hypothesis that the feedforward path of the ventral stream in visual cortex's main function is the encoding of invariant representations of images. A key justification of the theory is provided by a result linking invariant representations to small sample complexity for image recognition - that is, invariant representations allow learning from very few labeled examples. The theory characterizes how an algorithm that can be implemented by a set of "simple" and "complex" cells - a "Hubel Wiesel module" - provides invariant and selective representations. The invariance can be learned in an unsupervised way from observed transformations. Our results show that an invariant representation implies several properties of the ventral stream organization, including the emergence of Gabor receptive filelds and specialized areas. The theory requires two stages of processing: the first, consisting of retinotopic visual areas such as V1, V2 and V4 with generic neuronal tuning, leads to representations that are invariant to translation and scaling; the second, consisting of modules in IT (Inferior Temporal cortex), with class- and object-specific tuning, provides a representation for recognition with approximate invariance to class specific transformations, such as pose (of a body, of a face) and expression. In summary, our theory is that the ventral stream's main function is to implement the unsupervised learning of "good" representations that reduce the sample complexity of the final supervised learning stage.

Representation Learning in Sensory Cortex: A Theory

Anselmi, F
;
2022-01-01

Abstract

We review and apply a computational theory based on the hypothesis that the feedforward path of the ventral stream in visual cortex's main function is the encoding of invariant representations of images. A key justification of the theory is provided by a result linking invariant representations to small sample complexity for image recognition - that is, invariant representations allow learning from very few labeled examples. The theory characterizes how an algorithm that can be implemented by a set of "simple" and "complex" cells - a "Hubel Wiesel module" - provides invariant and selective representations. The invariance can be learned in an unsupervised way from observed transformations. Our results show that an invariant representation implies several properties of the ventral stream organization, including the emergence of Gabor receptive filelds and specialized areas. The theory requires two stages of processing: the first, consisting of retinotopic visual areas such as V1, V2 and V4 with generic neuronal tuning, leads to representations that are invariant to translation and scaling; the second, consisting of modules in IT (Inferior Temporal cortex), with class- and object-specific tuning, provides a representation for recognition with approximate invariance to class specific transformations, such as pose (of a body, of a face) and expression. In summary, our theory is that the ventral stream's main function is to implement the unsupervised learning of "good" representations that reduce the sample complexity of the final supervised learning stage.
File in questo prodotto:
File Dimensione Formato  
Representation_Learning_in_Sensory_Cortex_A_Theory.pdf

accesso aperto

Tipologia: Documento in Versione Editoriale
Licenza: Creative commons
Dimensione 6.47 MB
Formato Adobe PDF
6.47 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11368/3035078
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact