Neural Dynamics of Attentive Object Recognition, Scene Understanding, and Decision Making – Stephen Grossberg (Boston University)

March 31, 2009 all-day

This talk describes three recent models of how the brain visually understands the world. The models use hierarchical and parallel processes within and across the What and Where cortical streams to accumulate information that cannot in principle be fully computed at a single processing stage. The models thereby raise basic questions about the functional brain units that are selected by the evolutionary process, and challenge all models that use non-local information to explain vision.

The ARTSCAN model (Fazl, Grossberg, & Mingolla, 2008, Cognitive Psychology) clarifies the following issues: What is an object? How does the brain learn to bind multiple views of an object into a view-invariant object category, during both unsupervised and supervised learning, while scanning its various parts with active eye movements? In particular, how does the brain avoid erroneously classifying views of different objects as belonging to a single object, and how does it direct the eyes to explore an object’s surface even before it has a concept of the object? How does the brain coordinate object and spatial attention during object learning and recognition? ARTSCAN proposes an answer to these questions by modeling interactions among cortical areas V1, V2, V3A, V4, ITp, ITa, PPC, LIP, and PFC.

The ARTSCENE model (Grossberg & Huang, 2008, Journal of Vision) also uses attentional shrouds. It clarifies the following issues: How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? ARTSCENE classifies natural scene photographs better than competing models by using multiple spatial scales to efficiently accumulate evidence for gist and texture.
The model can incrementally learn and rapidly predict scene identity from gist information alone (defining gist computationally along the way), and then accumulate learned evidence from scenic textures to refine this hypothesis.

The MODE model (Grossberg & Pilly, 2008, Vision Research) clarifies the following basic issue: How does the brain make decisions? The speed and accuracy of perceptual decisions covary with certainty in the input, and correlate with the rate of evidence accumulation in parietal and frontal cortical “decision neurons.” MODE models interactions within and between the Retina/LGN and cortical areas V1, MT, MST, and LIP, gated by the basal ganglia, to simulate dynamic properties of decision-making in response to the ambiguous visual motion stimuli used by Newsome, Shadlen, and colleagues in their neurophysiological experiments. The model shows how the brain can carry out probabilistic decisions without using Bayesian mechanisms.
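MODE itself is a detailed laminar cortical circuit model; its equations are not reproduced here. As a minimal illustration of the accumulation-to-threshold idea the abstract describes — decision neurons integrating noisy motion evidence until one alternative wins, with speed and accuracy covarying with input certainty — the following is a hedged sketch of a simple two-accumulator race model. All function names and parameter values are invented for illustration and are not taken from Grossberg & Pilly (2008).

```python
import random

def simulate_decision(coherence, threshold=30.0, noise=1.0,
                      max_steps=10000, rng=None):
    """Race between two noisy accumulators (a toy stand-in for LIP
    decision neurons). `coherence` (0..1) is the drift favoring
    alternative 0, analogous to motion-stimulus certainty.
    Returns (choice, reaction_time_in_steps)."""
    rng = rng or random.Random()
    a = [0.0, 0.0]
    for t in range(1, max_steps + 1):
        a[0] += coherence + rng.gauss(0.0, noise)  # evidence + noise
        a[1] += rng.gauss(0.0, noise)              # noise only
        if a[0] >= threshold:
            return 0, t
        if a[1] >= threshold:
            return 1, t
    # No bound crossed: report whichever accumulator is ahead.
    return (0 if a[0] >= a[1] else 1), max_steps

def stats(coherence, trials=500, seed=0):
    """Mean accuracy (alternative 0 is 'correct') and mean RT."""
    rng = random.Random(seed)
    n_correct, total_rt = 0, 0
    for _ in range(trials):
        choice, rt = simulate_decision(coherence, rng=rng)
        n_correct += (choice == 0)
        total_rt += rt
    return n_correct / trials, total_rt / trials
```

Running `stats` at high versus low coherence reproduces the qualitative pattern the abstract cites: more certain input yields faster and more accurate decisions, because the winning accumulator reaches threshold sooner and the noise-driven competitor rarely wins first.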

Center for Language and Speech Processing