A theory of the primary visual cortex (V1): Predictions, experimental tests, and implications for future research

L Zhaoping

University College London, London, United Kingdom

Since Hubel and Wiesel's venerable studies, more has been known about the physiology of V1 than about that of any other area of visual cortex. However, its function has been seen merely as extracting primitive image features to service more important functions of higher visual areas, such as object recognition. A decade ago, a different function of V1 was hypothesized: creating a bottom-up saliency map which exogenously guides an attentional processing spotlight to a tiny fraction of visual input (Li, 2002, Trends in Cognitive Sciences, 6(1):9-16). This theory holds that the bottom-up saliency of any visual location in a given scene is signaled by the highest V1 neural response to this location, regardless of the feature preferences of the neurons concerned. Intra-cortical interactions between neighboring V1 neurons serve to transform visual inputs into neural responses that signal the saliency. In particular, iso-feature suppression between neighboring V1 neurons tuned to similar visual features, such as orientation or color, reduces V1 responses to an iso-feature background, thereby highlighting the relatively unsuppressed response to a unique feature singleton. Superior colliculus, receiving inputs directly from V1, likely reads out the V1 saliency map to execute attentional selection. Several non-trivial predictions from this V1 theory have subsequently been confirmed. The most surprising one states that an ocular singleton (an item uniquely presented to one eye among items presented to the other eye) should capture attention (Zhaoping, 2008, Journal of Vision, 8/5/1). This attentional capture is stronger than that of a perceptually distinct orientation singleton. It is a hallmark of V1, since the eye of origin of visual input is barely encoded in cortical areas beyond V1, and indeed it is nearly impossible for observers to recognize an input based on its eye of origin.
Another distinctive prediction is quantitative, yet parameter-free (Zhaoping and Zhe, 2012, Journal of Vision, 12(9):1160). It concerns reaction times for finding a single bar with unique features (in color, orientation, and/or motion direction) in a field of other bars that are all the same. Reaction times are shorter when the unique target bar differs from the background bars in more features; the theory predicts exactly how much shorter. Behavioural data (collected by Koene and Zhaoping 2007, Journal of Vision, 7/7/6) confirm this prediction. The prediction depends on there being only a few neurons tuned to all three features, a restriction that holds for V1 but not for extra-striate areas. This suggests that the latter play little role in exogenous saliency, at least for feature singletons. Exogenous selection is faster and often more potent than endogenous selection, and together they admit only a tiny fraction of sensory information through an attentional bottleneck. V1's role in exogenous selection suggests that extra-striate areas might be better understood in terms of computations performed in light of the exogenous selection; these computations include endogenous selection and post-selectional visual inference. Furthermore, visual bottom-up saliency signals found in frontal and parietal cortical areas should be inherited from V1.
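
The two ingredients named above, iso-feature suppression and reading out saliency as the highest response at each location, can be illustrated with a toy sketch. This is not the V1 model from the cited papers: the linear suppression rule, the suppression strength `k`, the 3x3 neighbourhood, and the function names `saliency_map` and `most_salient` are all assumptions made for illustration only.

```python
def saliency_map(grid, k=0.8):
    """Toy saliency map (illustrative assumption, not the published model).

    grid: 2D list of feature tuples, e.g. (orientation, color).
    Each location has one hypothetical neuron per feature dimension.
    A neuron's response is reduced in proportion to the fraction of
    neighbouring items sharing its preferred feature value
    (iso-feature suppression). Saliency is the highest response at
    the location, regardless of which feature that neuron prefers.
    """
    rows, cols = len(grid), len(grid[0])
    n_features = len(grid[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Items in the surrounding 3x3 neighbourhood (assumed range).
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            responses = []
            for f in range(n_features):
                same = sum(1 for n in neighbours if n[f] == grid[r][c][f])
                frac = same / len(neighbours) if neighbours else 0.0
                responses.append(1.0 - k * frac)
            out[r][c] = max(responses)  # max rule across feature dimensions
    return out


def most_salient(sal):
    """Return the (row, col) of the highest saliency value."""
    return max(
        ((r, c) for r in range(len(sal)) for c in range(len(sal[0]))),
        key=lambda rc: sal[rc[0]][rc[1]],
    )


# Usage: one orientation singleton among otherwise identical red bars.
grid = [[(0, "red") for _ in range(5)] for _ in range(5)]
grid[2][2] = (90, "red")
sal = saliency_map(grid)
print(most_salient(sal))
```

In this sketch the color-tuned neuron at the singleton is suppressed like its neighbours, but the orientation-tuned neuron escapes suppression, so the max rule makes the singleton's location the most salient one.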
