Workshop on
Natural Environments Tasks and Intelligence


Mitra Hartmann (Friday, 9:40 am)

Vibrissal dynamics and the natural tactile exploratory behavior of the rat

Northwestern University

Rats rhythmically brush and tap their large mystacial vibrissae (whiskers) against objects to tactually explore their surroundings. Using mechanical signals from its vibrissae, a rat can determine an object’s size, shape, orientation, and texture. Our laboratory uses the rat vibrissal system as a model to understand how sensory and motor signals are integrated during tactile exploratory behavior. In this talk I will describe our laboratory’s recent advances in quantifying the complete mechanosensory input to the rat vibrissal array during natural exploratory behaviors and discuss implications of these results for neural processing. I will specifically focus on our laboratory’s efforts to develop a simulation environment that permits full dynamical simulations of vibrissal-object contact. We aim to integrate realistic vibrissal dynamics with behaviorally measured head and vibrissal kinematics to model the rat's sampling strategies for various objects in the environment. Ultimately, the simulation system will be used to predict the mechanics at each vibrissa base for a given exploratory sequence, and thus predict the input to the brain.
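
As a rough illustration of the quantity at stake, the sketch below is a hypothetical, quasi-static toy, not the laboratory's dynamic simulator: it computes the reaction moment at a whisker base from an assumed contact point and contact force via M = r x F. The geometry and the unit contact force are invented for the example.

    import numpy as np

    def base_moment(contact_point, contact_force):
        """Out-of-plane moment at the whisker base (taken as the origin) produced
        by a force applied at contact_point, i.e. the z-component of r x F."""
        rx, ry = contact_point
        fx, fy = contact_force
        return rx * fy - ry * fx

    # Example: whisker protracted to 70 deg, with a unit force applied normal to
    # the shaft (pushing back against protraction) at several contact distances.
    theta = np.deg2rad(70.0)
    shaft = np.array([np.cos(theta), np.sin(theta)])     # unit vector along the whisker
    normal = np.array([-np.sin(theta), np.cos(theta)])   # unit normal to the shaft
    for dist in (0.01, 0.02, 0.04):                      # contact 1, 2, 4 cm from the base
        print(dist, base_moment(dist * shaft, -1.0 * normal))
    # The base moment scales with the moment arm, i.e. with how far out the contact occurs.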


Jose Pena (Friday, 10:20 am)

The biased owl

Albert Einstein College of Medicine

The owl captures prey using sound localization. While owls accurately localize sources near the center of gaze, they systematically underestimate peripheral source directions. This behavior can be predicted by a model that emphasizes central directions. We have proposed that there is a bias in the neural coding of auditory space, which, at the expense of inducing errors in the periphery, achieves high behavioral accuracy at the ethologically relevant range. To study the neural coding underlying behavioral biases we next focused on the physical arrangement of spatial receptive fields. The owl’s inferior colliculus (IC) contains a map of auditory space where frontal space is overrepresented. We asked whether this arrangement could bias local processing. We measured spatiotemporal receptive fields of IC neurons. We found a population-wide bias in surround suppression such that suppression by sounds coming from frontal space was stronger. The bias can be explained by a model of lateral interaction in a map where frontal space is overrepresented. Thus, the uneven distribution of spatial tuning within the map could explain the topography of time-dependent tuning properties. These mechanisms may have significant implications for the analysis of natural scenes by sensory systems.
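
As a sketch of the mechanism proposed in the final sentences, the toy model below assumes Gaussian spatial tuning, distance-dependent lateral inhibition, and a unit density that falls off with azimuth; all parameter values are invented and this is not the authors' fitted model. It shows how overrepresenting frontal space can, by itself, make surround suppression from frontal sources stronger.

    import numpy as np

    rng = np.random.default_rng(0)

    # Preferred azimuths (deg): many more units near 0 deg (frontal) than in the periphery.
    prefs = rng.normal(loc=0.0, scale=30.0, size=2000)
    prefs = prefs[np.abs(prefs) <= 90.0]

    def tuning(stim_az, pref, width=15.0):
        """Gaussian spatial tuning: response of a unit with preferred azimuth `pref`."""
        return np.exp(-0.5 * ((stim_az - pref) / width) ** 2)

    def suppression_onto(probe_pref, suppressor_az, lateral_width=25.0):
        """Total lateral inhibition onto a probe unit from a suppressor sound: each
        active unit inhibits the probe with a weight that falls off with the
        difference in preferred azimuth."""
        drive = tuning(suppressor_az, prefs)                  # activity of all units
        weights = np.exp(-0.5 * ((prefs - probe_pref) / lateral_width) ** 2)
        return np.sum(drive * weights)

    probe = 30.0   # a probe neuron tuned 30 deg off the midline
    print("suppression from frontal sound (0 deg):   ", suppression_onto(probe, 0.0))
    print("suppression from peripheral sound (60 deg):", suppression_onto(probe, 60.0))
    # Both suppressor sounds are equally far from the probe's preferred azimuth, but
    # the frontal sound recruits many more units, so it suppresses the probe more.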


Sarah Woolley (Friday, 11:30 am)

Roles of species and experience in the auditory coding of communication vocalizations

Columbia University

Communication is a strong selective pressure on brain evolution because the exchange of information between individuals is crucial for fitness-related behaviors, such as mating. Given the importance of communication, the brains of signal senders and receivers are likely to be functionally coordinated. We study vocal behavior and auditory processing in multiple songbird species with the goal of understanding how species identity and early experience interact to shape the neural systems that subserve communication. Male finches learn to produce species-specific songs, and both sexes learn to recognize songs. Our studies use single-unit electrophysiology and measures of perception to examine how the coding properties of auditory neurons and behavior relate to the acoustics of species-specific vocalizations. We use cross-tutoring within and between species to manipulate auditory experience and vocal learning, and thereby to test the roles of species identity and experience in the development of auditory processing and perception of communication vocalizations. I will describe our experiments showing how species identity and early experience interact to match auditory function with vocal acoustics.


Helge Ritter (Friday, 1:30 pm)

From manual skills to cognitive interaction

Bielefeld University

Elucidating the structure of interactions has driven the success story of physics in the last century. How much can we gain from a similar focus on the analysis of cognitive interaction for a deeper understanding of cognition? We present an overview of the CITEC approach motivated by this question, which leverages methods from a multitude of fields ranging from robotics to neuroscience in order to map cognitive processes from a variety of angles and replicate them to different degrees of approximation in robot systems as a major benchmark. We then focus on manual skills as a major subtopic. Presenting examples and results from our current research, such as approaches to robust grasping and manipulation, the role of haptics, and the learning of grasp manifolds, we show how the “manual intelligence” exhibited in our skills arises from the integration of a substantial set of elements ranging from low-level control to high-level cognitive planning. We conclude by speculating on the evolution and role of “manual intelligence” and argue that it can serve as a Rosetta stone, leading to a closer understanding of cognition and cognitive interaction.


Dana Ballard (Friday, 2:10 pm)

A new look at human motor control

The University of Texas at Austin

Much research has tackled motor control with the tools of classical mathematical optimization theory widely used in robotic models. Although dynamical systems can be modeled with classical Newtonian equations, for mammalian systems with very high numbers of degrees of freedom, these equations prove prohibitively expensive to solve except in the case of small subsystems. Nonetheless, humans themselves are an existence proof that some kind of practical solution must exist, since they have exquisite motor coordination. At the same time, since their physical properties are so distinctive, one must be prepared that the human solutions might look very different from those dictated by the robotics-inspired classical approach. A major point of departure is in the musculoskeletal design. Anthropomorphic bipedal designs can successfully walk downhill in a completely passive mode, implying that much of the machinery needed for human movements has been incorporated in the mechanical design itself. Further, passive muscle synergies augment basic mechanical degrees of freedom. By using activation to direct sets of contractions, the muscles can guide the complete motor system through a series of posture changes that are designed both to respect the basic joint limitations of the mechanical system and to be efficient. The idea of movement control as a series of postural changes has long been espoused, but has so far resisted a tight mathematical formulation. Our research shows how to reduce elaborate posture changes into a compact code that has several advantages. First, it provides an overview of how the elaborate computations in abstract motor control could be parcellated into the brain’s primary subsystems. Second, its parametric description could be used to extend learned movements to similar movements with different goals. Third, the sensitivity of the parameters allows the differentiation of very subtle variations in movement.
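
The compact-code idea can be illustrated generically. The sketch below uses principal component analysis on synthetic joint-angle data; this is a stand-in for the general notion of a low-dimensional posture code, not the specific formulation developed in this work, and the dimensions, "synergies", and data are all invented.

    import numpy as np

    rng = np.random.default_rng(1)

    n_joints, n_postures = 20, 500
    # Synthetic postures generated from 3 underlying "synergies" plus noise.
    synergies = rng.standard_normal((3, n_joints))
    activations = rng.standard_normal((n_postures, 3))
    postures = activations @ synergies + 0.05 * rng.standard_normal((n_postures, n_joints))

    # PCA via the SVD of the mean-centered posture matrix.
    mean_posture = postures.mean(axis=0)
    U, S, Vt = np.linalg.svd(postures - mean_posture, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    print("variance explained by first 3 components:", explained[:3].sum())

    # The compact code for any posture is its projection onto the leading components.
    k = 3
    code = (postures - mean_posture) @ Vt[:k].T        # n_postures x k coefficients
    reconstruction = code @ Vt[:k] + mean_posture
    print("max reconstruction error:", np.max(np.abs(reconstruction - postures)))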


Robbie Jacobs (Friday, 3:20 pm)

Are people optimal at making sequences of actions?

University of Rochester

In many real-world tasks, people need to make sequences of actions to achieve their goals. Here we seek to understand these sequences as "optimal" adaptations to task demands. Project 1 examines how we coordinate eye and hand control during the course of visually-guided reaching movements. We report an experiment that manipulated the demands that a task placed on the motor and visual systems, and then examined in detail the resulting changes in visuomotor coordination. We develop an ideal actor model that predicts the optimal coordination of vision and motor control in our task. On the basis of the predictions of our model, we demonstrate that human performance in our experiment reflects an adaptive response to the varying costs imposed by our experimental manipulations. Project 2 reports the results of an experiment in which human subjects were trained to perform a perceptual matching task. Subjects were asked to manipulate comparison objects until they matched target objects using the fewest manipulations possible. We use two benchmarks to evaluate the quality of subjects' learning. One benchmark is based on optimal performance as calculated by a dynamic programming procedure. The other is based on an adaptive computational agent that uses a reinforcement-learning method known as Q-learning to learn to perform the task. Our analyses suggest that subjects learned to perform the perceptual matching task in a near-optimal manner (i.e., using a small number of manipulations) at the end of training. Subjects were able to achieve near-optimal performance because they learned, at least partially, the causal structure underlying the task. In addition, subjects' performances were broadly consistent with those of model-based reinforcement-learning agents that built and used internal models of how their actions influenced the external environment. Project 1 was conducted in collaboration with Chris Sims and Dave Knill. Project 2 was conducted in collaboration with Reiko Yakushijin.
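
For readers unfamiliar with the second benchmark, the sketch below is a minimal tabular Q-learning agent on a toy "move a state to a target in as few steps as possible" problem; the task, state space, and learning parameters are placeholders, not the actual perceptual matching experiment.

    import numpy as np

    rng = np.random.default_rng(0)

    n_states, target = 11, 5
    actions = (-1, 0, +1)                      # decrement, stay, increment
    Q = np.zeros((n_states, len(actions)))
    alpha, gamma, epsilon = 0.1, 0.95, 0.1

    for episode in range(2000):
        s = int(rng.integers(n_states))
        for _ in range(50):
            # epsilon-greedy action selection
            a = int(rng.integers(len(actions))) if rng.random() < epsilon else int(np.argmax(Q[s]))
            s_next = int(np.clip(s + actions[a], 0, n_states - 1))
            done = s_next == target
            r = 0.0 if done else -1.0          # each manipulation costs one step
            # Q-learning update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            td_target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (td_target - Q[s, a])
            s = s_next
            if done:
                break

    # The greedy policy should now move every state toward the target:
    # action index 2 (=+1) below the target, index 0 (=-1) above it.
    print([int(np.argmax(Q[s])) for s in range(n_states)])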


Mary Hayhoe (Friday, 4:00 pm)

Internal models for predictive saccades in natural interceptive tasks

The University of Texas at Austin

In the natural world, the brain must handle inherent delays in visual processing. This is particularly problematic in tasks such as intercepting rapidly moving objects. One compensatory strategy is to use prior experience together with current sensory data to predict future visual state, so that actions can be planned ahead of time. However, the role of such prediction in controlling interceptive movements is not well established. There is substantial evidence for prediction in the control of eye movements, although the basis for such prediction and its potential role in interception are unclear. We have developed a virtual racquetball environment and have examined predictive saccades while intercepting balls. We find that predictive saccades are a pervasive aspect of performance. Subjects accurately target a point on the ball's trajectory where it will pass after the bounce, and they compensate for variations in trajectory, velocity, and elasticity. This suggests that subjects use learned knowledge of ball dynamics to predict where the ball will be after the bounce, and when it will get there. The complexity of the prediction rules out simple models such as visual interpolation or extrapolation. Instead, elasticity, 3D velocity, and gravity are all likely to be taken into account. Since eye, head, arm, and body movements all co-occur, it seems likely that a common internal model of predicted visual state is shared by different effectors to coordinate interceptive movements.
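
As a concrete illustration of what such an internal model must compute, the sketch below predicts, under an idealized flat-floor bounce with a single coefficient of restitution and no air drag (all numbers invented, not taken from the experiments), where and when a ball will pass a chosen height after its bounce.

    import numpy as np

    g = 9.8            # gravity, m/s^2

    def predict_after_bounce(p0, v0, restitution, target_height):
        """Return (x, t) at which the ball reaches `target_height` on the way up
        after its first bounce on the floor (y = 0).  p0, v0 are (x, y) pairs."""
        x0, y0 = p0
        vx, vy = v0
        # Time to hit the floor: positive root of y0 + vy*t - 0.5*g*t^2 = 0.
        t_floor = (vy + np.sqrt(vy**2 + 2 * g * y0)) / g
        x_floor = x0 + vx * t_floor
        vy_after = restitution * (g * t_floor - vy)     # upward speed just after the bounce
        # Time after the bounce to reach the target height on the way up.
        disc = vy_after**2 - 2 * g * target_height
        t_up = (vy_after - np.sqrt(disc)) / g
        return x_floor + vx * t_up, t_floor + t_up

    print(predict_after_bounce(p0=(0.0, 1.2), v0=(6.0, -1.0), restitution=0.7, target_height=0.5))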


Vijay Balasubramanian (Saturday, 9:00 am)

Local statistics in natural scenes predict the saliency of synthetic textures

University of Pennsylvania

The visual system is challenged with extracting and representing behaviorally relevant information contained in natural inputs of great complexity. This task begins in the sensory periphery: retinal receptive fields and circuits are matched to the first and second-order statistical structure of natural inputs. This matching enables the retina to remove stimulus components that are predictable (and therefore uninformative), and primarily transmit what is unpredictable (and therefore informative). I will present evidence that this design principle also applies to more complex aspects of natural scenes, and to central visual processing. I will do this by classifying high-order statistics of natural scenes according to whether they are uninformative vs. informative. We find that the uninformative statistics are perceptually nonsalient, while the informative ones are highly salient, and correspond to previously identified perceptual mechanisms whose neural basis is likely central. Our results suggest that the principle of efficient coding not only accounts for filtering operations in the sensory periphery, but also shapes subsequent stages of sensory processing that are sensitive to high-order image statistics.
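
For concreteness, the sketch below computes one standard family of local high-order statistics: multi-point correlations within gliding 2x2 blocks of a binarized image. The particular parameterization and the synthetic "even" texture are illustrative assumptions, not the exact analysis reported in the talk.

    import numpy as np

    def block_statistics(img):
        """Gliding 2x2-block correlations of a binary (+1/-1) image."""
        a, b = img[:-1, :-1], img[:-1, 1:]       # top-left, top-right
        c, d = img[1:, :-1], img[1:, 1:]         # bottom-left, bottom-right
        return {
            "beta_horizontal": np.mean(a * b),        # second order
            "beta_vertical":   np.mean(a * c),
            "beta_diagonal":   np.mean(a * d),
            "theta":           np.mean(a * b * c),     # third order
            "alpha":           np.mean(a * b * c * d), # fourth order
        }

    rng = np.random.default_rng(0)
    noise = np.sign(rng.standard_normal((256, 256)))   # unstructured binary noise

    # An "even" texture: every 2x2 block has an even number of dark checks, so it is
    # essentially invisible to second-order statistics but has alpha = +1.
    rows = np.sign(rng.standard_normal((256, 1)))
    cols = np.sign(rng.standard_normal((1, 256)))
    even_texture = rows * cols

    print("noise:       ", block_statistics(noise))
    print("even texture:", block_statistics(even_texture))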


Karl Gegenfurtner (Saturday, 9:40 am)

Where we look determines what we see: Effects of fixation position on lightness perception

Giessen University

The variable resolution of the human visual system requires us to sample the world with eye movements. Here we show that where we look can have massive effects on perception. When observers matched the color of natural objects, they based their judgments on the brightest parts of the objects, and at the same time they tended to fixate points with above-average luminance. When we forced participants to fixate a specific point on the object using a gaze-contingent display setup, the matched lightness was higher when observers fixated bright regions. This indicates a causal link between the luminance of the fixated region and the lightness match for the whole object. Simulations with rendered physical lighting show that this fixation strategy is an efficient and simple heuristic for the visual system to arrive at accurate and invariant judgments of lightness.
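
A toy version of the proposed heuristic is sketched below: estimating an object's surface lightness from its brightest visible region rather than from its mean luminance. The rendering model (reflectance times spatially varying shading whose peak is roughly constant) and all numbers are assumptions made for illustration, not the rendered scenes used in the study.

    import numpy as np

    rng = np.random.default_rng(0)

    def object_luminances(reflectance, n_pixels=5000):
        """Luminance samples from an object: reflectance x illumination, where shading
        varies across the surface but the brightest parts are nearly fully lit."""
        illumination = rng.uniform(0.2, 1.0, size=n_pixels)
        return reflectance * illumination

    def mean_estimate(lum):
        return np.mean(lum)

    def brightest_region_estimate(lum, top_fraction=0.05):
        return np.mean(np.sort(lum)[-int(len(lum) * top_fraction):])

    for reflectance in (0.3, 0.6):
        lum = object_luminances(reflectance)
        print(reflectance, mean_estimate(lum), brightest_region_estimate(lum))
    # The brightest-region estimate tracks surface reflectance closely (peak
    # illumination is ~1), whereas the mean is pulled down by shading.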


Mike Landy (Saturday, 10:50 am)

Visual coding of local orientation

New York University

For more than two decades, both behavioral and neurophysiological studies of the perception of "second-order" or texture patterns have centered on a simple model of texture segregation. In this model, sometimes called the FRF (filter-rectify-filter) model, borders between abutting textures are (1) emphasized by a linear filter tuned preferentially to one of the constituent textures; (2) rectified so that the high variance in response to the preferred texture is transformed into a higher mean response; and (3) filtered again at a coarser scale, leading to a strong response to the texture-defined edge. I will review three recent results that require extensions to this basic model for, in particular, the encoding of orientation-defined texture. First, I will describe behavioral experiments in which observers discriminate the mean orientation of two textures. The results are consistent with the behavior of a Bayesian observer who uses an orientation prior similar to the distribution of orientations in a database of real-world images. Second, there is substantial evidence for normalization processes in early visual coding, in which the responses of a neuron are normalized by the pooled responses of neighboring neurons. I will describe an experiment that indicates the presence of normalization at the second stage, thus leading to a cascade model of cortical processing (FRNFRN, where "N" stands for normalization). Finally, I will describe recent experiments designed to measure the tuning properties of the second-stage filter using critical-band masking as well as simultaneous detection and identification paradigms, and their implications for models of texture-modulation detection.
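
The three FRF stages map directly onto a few lines of code. The sketch below (with invented filter parameters and a synthetic orientation-defined edge; it illustrates the generic model, not the specific fits discussed in the talk) filters with an oriented Gabor, squares the output, and then applies a coarse, edge-shaped second-stage filter whose response peaks at the texture boundary.

    import numpy as np
    from scipy.signal import fftconvolve
    from scipy.ndimage import gaussian_filter

    def gabor(size=21, wavelength=6.0, sigma=4.0, orientation_deg=0.0):
        """Oriented Gabor kernel used as the first-stage linear filter."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        theta = np.deg2rad(orientation_deg)
        xr = x * np.cos(theta) + y * np.sin(theta)
        return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

    # Orientation-defined texture edge: vertical stripes on the left half,
    # horizontal stripes on the right half, equal contrast and mean luminance.
    size = 128
    yy, xx = np.mgrid[0:size, 0:size]
    image = np.where(xx < size // 2,
                     np.sin(2 * np.pi * xx / 6.0),     # vertical carrier
                     np.sin(2 * np.pi * yy / 6.0))     # horizontal carrier

    stage1 = fftconvolve(image, gabor(orientation_deg=0.0), mode="same")   # (1) filter
    rectified = stage1 ** 2                                                # (2) rectify
    stage2 = gaussian_filter(rectified, sigma=12.0, order=(0, 1))          # (3) coarse, edge-shaped filter

    row = np.abs(stage2[size // 2])
    print("inside left texture :", row[size // 4])
    print("at texture boundary :", row[size // 2])      # largest response: the second-order edge
    print("inside right texture:", row[3 * size // 4])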


Jack Gallant (Saturday, 11:30 am)

Attention changes the cortical representation of object categories

University of California, Berkeley

Humans can recognize thousands of distinct object and action categories. However, previous studies have examined only a few specific categories, such as faces and places, so relatively little is known about how semantic information is generally organized across the human neocortex. To address this issue we used fMRI to record whole-brain activity evoked by natural movies. We then used regularized regression to model how each cortical voxel was tuned for hundreds of distinct object and action categories. Analysis of these voxel-wise semantic models shows that the human brain represents this rich collection of object and action categories in a relatively low-dimensional semantic space that is mapped systematically across visual and non-visual cortex. To investigate how these semantic representations are modulated by attention, we asked subjects to search for "humans" or to search for "vehicles". We then estimated semantic tuning for each voxel under these two conditions of category-based attention. Category-based attention causes many voxels to shift their tuning toward the attended object category. These tuning shifts cannot be described by simple changes in response baseline or gain. These data suggest that the cortical representations of object categories change dynamically in order to optimize processing of behaviorally-relevant stimuli during natural vision.
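
The modeling step lends itself to a compact sketch: ridge (L2-regularized) regression mapping per-time-point category labels to each voxel's response. Everything below is synthetic placeholder data with a generic closed-form solver; it illustrates the technique named in the abstract, not the actual feature spaces, regularization procedure, or data.

    import numpy as np

    rng = np.random.default_rng(0)

    n_timepoints, n_categories, n_voxels = 1200, 300, 50
    X = (rng.random((n_timepoints, n_categories)) < 0.05).astype(float)   # category labels per time point
    true_W = rng.standard_normal((n_categories, n_voxels)) * (rng.random((n_categories, n_voxels)) < 0.1)
    Y = X @ true_W + rng.standard_normal((n_timepoints, n_voxels))        # voxel responses + noise

    def ridge_fit(X, Y, lam):
        """Closed-form ridge regression: W = (X'X + lam*I)^(-1) X'Y."""
        n_features = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

    W_hat = ridge_fit(X, Y, lam=10.0)        # one semantic tuning vector per voxel (columns)
    pred = X @ W_hat
    print("prediction correlation, voxel 0:", np.corrcoef(pred[:, 0], Y[:, 0])[0, 1])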


John Reynolds (Sunday, 9:00 am)

Attention-dependent reductions in response variability: Underlying mechanisms

Salk Institute

For the past thirty years, the central focus of attention research has been modulation of mean firing rate. Recently, attention has also been found to reduce neuronal response variability (Mitchell, Sundberg & Reynolds, 2007; Mitchell, Sundberg & Reynolds, 2009; Cohen & Maunsell, 2009), and 80% of the benefit of attention is attributable to this newly discovered form of attentional modulation, with the remaining 20% attributable to changes in mean firing rate. Therefore, in order to understand the neural mechanisms of attention, one must understand (1) what gives rise to response variability and (2) how attention reduces this variability. We have therefore developed a conductance-based model to account for both the emergence of variability and its reduction by attention. This model makes two novel predictions: (1) that attention will reduce the tendency of neurons to fire action potentials in bursts and (2) that attention will reduce the amplitude of the neuronal action potential. We find that both of these novel forms of attentional modulation hold in macaque Area V4.


Jackie Gottlieb (Sunday, 9:40 am)

Principles of attentional control

Columbia University

Intensive neurophysiological research in non-human primates has elucidated the sensorimotor pathways that lead from a visual stimulus to a shift of gaze. These studies have identified early sensory and late motor mechanisms, as well as intermediate mechanisms that seem to encode eye movement decisions and provide top-down control. These studies bring us to the next major question: what is the basis of the top-down control? How does the brain assign relevance and recognize attention-worthy cues? Psychophysical and computational studies suggest that three important factors influence such decisions: the learned associations between stimuli, tasks and actions; the utility of a stimulus in conveying information; and the emotional quality of a cue, whether it brings “good” or “bad” news. Experiments in our laboratory suggest that neurons in the lateral intraparietal area (LIP) are sensitive to all three factors. I review these neuronal responses and their implications for attentional control.


Eileen Kowler (Sunday, 11:00 am)

Exploring the environment with saccadic eye movements and visual attention

Rutgers University

Visual scenes contain far too much information to be apprehended in a single glance. Limitations come from several factors, including the decline in visual resolution with distance from the fovea, the interference produced by crowding, and the inability to identify or encode multiple visual objects or features within the same brief glance. These limitations mean that effective vision depends on both saccadic eye movements and perceptual attention to select the objects, features or regions with the greatest momentary need for limited processing resources. Some approaches to saccades and attention have emphasized the strong links between the two processes in space and time, while others have emphasized their independence. This talk will present recent results from experiments that measure perceptual performance during saccadic sequences while varying memory load, perceptual load, target contrast, external noise and the cues available to guide saccades. The results show strong links between spatial attention and saccades that affect the quality of the perceptual representations, the accuracy of saccades and the contents of visual short-term memory. By contrast, feature-based attention operates largely independently of saccadic planning. The net result is that spatial attention can ensure accurate saccades and facilitate seamless transitions between glances, while attention based on features can contribute to longer-range selection of useful places to look. (Collaborators: Min Zhao, Barbara Dosher and Timothy Gersch).


Leslie C. Osborne (Sunday, 1:40 pm)

Connecting cortical sensory information to behavioral performance in smooth pursuit

University of Chicago

Performance in sensorimotor behaviors guides our understanding of many of the key computational functions of the brain: the representation of sensory information, the translation of sensory signals to commands for movement, and the production of behavior. Eye movement behaviors have become a valuable testing ground for theories of neural computation because the neural circuitry has been well characterized and eye movements can be tightly coupled to cortical activity. We use an eye movement behavior called smooth pursuit as a model for testing theories of sensory coding. Pursuit is a natural behavior in which we move our eyes along with a visual target in order to stabilize its image on the fovea, a region of the retina with high spatial acuity, so we can see it in detail. There is a performance cost for errors in pursuit: failure to match the eye to target movement will cause the retinal image to slip and impair visual acuity. The visual signals that guide pursuit arise from cortical area MT. Neurons in area MT provide information about a target's movement with the first few spikes of their responses on the timescale relevant to pursuit. I will present evidence that sensory errors in target motion estimation can be “read out” from pursuit movements, which makes pursuit an ideal testing ground for theories of sensory coding at the level of neural populations in MT.
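
A toy version of such a population read-out is sketched below: Poisson responses from direction-tuned, MT-like units are decoded with a vector average, and the trial-to-trial scatter of the decoded direction is the kind of sensory estimation error that pursuit could inherit. Tuning shapes, counts, and the decoder are illustrative assumptions, not the recorded populations or the analysis in the talk.

    import numpy as np

    rng = np.random.default_rng(0)

    prefs = np.deg2rad(np.arange(0, 360, 5))            # preferred directions of the population

    def population_response(direction_deg, peak_rate=30.0, width_deg=40.0):
        """Poisson spike counts from units with circular Gaussian-like tuning."""
        d = np.deg2rad(direction_deg)
        rates = peak_rate * np.exp((np.cos(d - prefs) - 1) / np.deg2rad(width_deg) ** 2)
        return rng.poisson(rates)

    def vector_average(counts):
        """Decode direction as the angle of the spike-count-weighted vector sum."""
        return np.rad2deg(np.arctan2(np.sum(counts * np.sin(prefs)),
                                     np.sum(counts * np.cos(prefs))))

    target_direction = 30.0
    estimates = np.array([vector_average(population_response(target_direction)) for _ in range(500)])
    print("mean direction estimate:", estimates.mean(),
          "estimate SD (deg):", (estimates - target_direction).std())
    # If pursuit inherits these sensory estimates, the trial-to-trial scatter of initial
    # pursuit direction should mirror this sensory estimation error.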


Alex Huk (Sunday, 2:20 pm)

Geometry and ecology of 3D motion perception

The University of Texas at Austin

The binocular perception of 3D motion through depth has been shown to be mediated by two distinct mechanisms: one that follows changes in conventional binocular disparity over time, and one that is based on an inter-ocular comparison of monocular velocities. Although these disparity-based and velocity-based cues are geometrically equivalent in most natural cases, they are isolatable in the lab using specialized stimuli. In this talk, I will review our prior work that isolates and compares the two binocular cues for 3D motion, and then I will extrapolate these results to more ecologically-valid environments. Our initial findings motivate the hypothesis that the disparity-based cue is relied on for slow-moving objects that are foveated, whereas the velocity-based cue is used for a broader range of speeds and eccentricities. I will then describe the results of a geometric analysis which compares our laboratory stimuli used to isolate the velocity-based cue with the actual retinal projections incurred by a realistic moving object. This analysis characterizes additional sources of information available in naturalistic scenes, but also reveals that the inter-ocular velocity difference per se is invariant across simplified laboratory stimuli and realistic natural geometry. Together, these analyses should pave the way for generalizing the well-studied motion and disparity streams to natural scenes and tasks without loss of rigor.
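
The geometric equivalence of the two cues can be checked numerically. In the sketch below (with an assumed 6.2 cm interocular distance and an arbitrary approaching trajectory), the change of binocular disparity over time and the inter-ocular velocity difference are computed separately and agree to numerical precision.

    import numpy as np

    iod = 0.062                      # interocular distance, meters (assumed)
    t = np.linspace(0.0, 1.0, 1001)
    x = 0.05 * t                     # lateral position of the point (m)
    z = 1.0 - 0.4 * t                # distance from the eyes (m), approaching

    # Horizontal angular direction of the point in each eye (azimuth, radians).
    alpha_left = np.arctan2(x + iod / 2, z)
    alpha_right = np.arctan2(x - iod / 2, z)

    disparity = alpha_left - alpha_right
    change_of_disparity = np.gradient(disparity, t)                  # d(disparity)/dt
    iovd = np.gradient(alpha_left, t) - np.gradient(alpha_right, t)  # inter-ocular velocity difference

    print("max |CD - IOVD| (rad/s):", np.max(np.abs(change_of_disparity - iovd)))   # ~0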