Spatiotemporal distortions of visual perception at the time of saccades, J Neurosci, 29 (42), 13147-13157.

Both space and time are grossly distorted during saccades. Here we show that the two distortions are strongly linked, and that both could be a consequence of the transient remapping mechanisms that affect visual neurons perisaccadically. We measured perisaccadic spatial and temporal distortions simultaneously by asking subjects to report both the perceived spatial location of a perisaccadic vertical bar (relative to a remembered ruler) and its perceived timing (relative to two sounds straddling the bar). During fixation, and well before or after saccades, bars were localized veridically in both space and time. In different epochs of the perisaccadic interval, temporal perception was subject to different biases. Around saccadic onset, bars were temporally mislocalized 50-100 ms later than their actual presentation and spatially mislocalized toward the saccadic target. Importantly, the magnitude of the temporal distortions co-varied with the spatial localization bias, and the two phenomena had similar dynamics. Within a brief period about 50 ms before saccadic onset, stimuli were perceived with shorter latencies than at other delays relative to saccadic onset, suggesting that the perceived passage of time transiently inverted its direction. Based on this result, we could predict the inversion of perceived temporal order for two briefly flashed visual stimuli. We developed a model that simulates the transient perisaccadic change of neuronal receptive fields and predicts the reported temporal distortions well. The key aspects of the model are the dynamics of the “remapped” activity and the use of decoder operators that are optimal during fixation but are not updated perisaccadically.
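The decoder idea in the final sentence can be illustrated with a toy population code: receptive fields shift perisaccadically, but the readout keeps the position labels that were optimal during fixation, so a stimulus flashed at fixation is decoded toward the saccade target. This is only a minimal sketch under assumed Gaussian tuning, an assumed saccade target at +20 deg, and an assumed 10-deg remapping shift; it is not the authors' implementation.

```python
import numpy as np

# Preferred positions (deg) of a bank of Gaussian-tuned neurons
labels = np.linspace(-40.0, 40.0, 81)

def pop_response(stim, rf_centers, sigma=5.0):
    """Gaussian population response to a bar at `stim` (assumed tuning width)."""
    return np.exp(-(stim - rf_centers) ** 2 / (2 * sigma ** 2))

def decode(r):
    """Centroid decoder that always uses the FIXATION labels."""
    return np.sum(r * labels) / np.sum(r)

stim = 0.0
# Fixation: receptive fields sit at their canonical positions -> veridical decoding
r_fix = pop_response(stim, labels)

# Perisaccadic remapping toward a target at +20 deg: each receptive field shifts
# by 10 deg (responding to what will soon fall inside it), but the decoder still
# uses the fixation labels -> decoded position is biased toward the target
r_peri = pop_response(stim, labels - 10.0)

print(decode(r_fix))   # ~0 deg (veridical)
print(decode(r_peri))  # ~+10 deg, shifted toward the saccade target
```

The point of the sketch is the mismatch: the bias arises entirely from decoding remapped activity with a non-updated readout, not from any change in the stimulus.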

Temporal mechanisms of multimodal binding, Proc Biol Sci, 276 (1663), 1761-1769.

The simultaneity of signals from different senses, such as vision and audition, is a useful cue for determining whether those signals arose from one environmental source or from more than one. To understand better the sensory mechanisms for assessing simultaneity, we measured the discrimination thresholds for time intervals marked by auditory, visual or auditory-visual stimuli, as a function of the base interval. For all conditions, both unimodal and cross-modal, the thresholds followed a characteristic ‘dipper function’, in which the lowest thresholds occurred when discriminating against a non-zero interval. The base interval yielding the lowest threshold was roughly equal to the threshold for discriminating asynchronous from synchronous presentations. Those lowest thresholds occurred at approximately 5, 15 and 75 ms for auditory, visual and auditory-visual stimuli, respectively. Thus, the mechanisms mediating performance with cross-modal stimuli are considerably slower than the mechanisms mediating performance within a particular sense. We developed a simple model with temporal filters of different time constants and showed that the model produces discrimination functions similar to the ones we observed in humans. Both for processing within a single sense, and for processing across senses, temporal perception is affected by the properties of temporal filters, the outputs of which are used to estimate time offsets, correlations between signals, and more.
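The abstract's model is built from temporal filters; as a generic illustration of how a dipper-shaped threshold curve can arise, the sketch below uses a different, swapped-in mechanism: a Legge-Foley-style accelerating-then-compressive transducer with made-up parameters. An increment on a small non-zero pedestal falls on the steep part of the transducer, so thresholds dip below the zero-pedestal value before rising again in the compressive region.

```python
def response(t, p=2.4, q=1.6, z=1.0):
    """Accelerating-then-compressive transducer (hypothetical parameters,
    dimensionless interval units)."""
    return t ** p / (t ** q + z)

def increment_threshold(base, criterion=0.1):
    """Smallest increment dt with response(base + dt) - response(base) >=
    criterion, found by bisection (response is monotone increasing)."""
    target = response(base) + criterion
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if response(base + mid) < target:
            lo = mid
        else:
            hi = mid
    return hi

# Thresholds dip below the base=0 value at small pedestals, then rise again
for base in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
    print(base, round(increment_threshold(base), 3))
```

This reproduces the dipper shape only; locating the dip at the sense-specific time constants, as in the abstract, requires the filter stage that this sketch omits.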

Auditory dominance over vision in the perception of interval duration, Exp Brain Res, 198 (1), 49-57.

The “ventriloquist effect” refers to the fact that vision usually dominates hearing in spatial localization, and this has been shown to be consistent with optimal integration of visual and auditory signals (Alais and Burr in Curr Biol 14(3):257-262, 2004). For temporal localization, however, auditory stimuli often “capture” visual stimuli, in what has become known as “temporal ventriloquism”. We examined this quantitatively using a bisection task, confirming that sound does tend to dominate the perceived timing of audio-visual stimuli. The dominance was predicted qualitatively by considering the better temporal localization of audition, but the quantitative fit was less than perfect, with more weight being given to audition than predicted from thresholds. As predicted by optimal cue combination, the temporal localization of audio-visual stimuli was better than for either sense alone.
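The optimal cue-combination prediction tested here has a simple closed form: each modality is weighted by its inverse variance, and the combined estimate has lower variance than either cue alone. A minimal sketch with made-up unimodal thresholds and perceived times (the specific values are illustrative, not the paper's data):

```python
import numpy as np

# Hypothetical unimodal temporal-localization thresholds (ms)
sigma_a, sigma_v = 20.0, 60.0

# Inverse-variance (maximum-likelihood) weights; audition gets the larger weight
w_a = sigma_v ** 2 / (sigma_a ** 2 + sigma_v ** 2)
w_v = 1.0 - w_a

# Combined estimate from illustrative unimodal perceived times (ms)
t_a, t_v = 100.0, 140.0
t_av = w_a * t_a + w_v * t_v

# Predicted bimodal threshold: better than either unimodal threshold
sigma_av = np.sqrt(sigma_a ** 2 * sigma_v ** 2 / (sigma_a ** 2 + sigma_v ** 2))
print(w_a, t_av, sigma_av)  # 0.9, 104.0, ~18.97
```

With these numbers audition dominates (weight 0.9), pulling the bimodal percept toward the auditory time, and the predicted bimodal threshold beats both unimodal ones; the paper's finding is that observed auditory weights exceeded even this prediction.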

Meaningful auditory information enhances perception of visual biological motion, J Vis, 9 (4), 25 21-27.

Robust perception requires efficient integration of information from our various senses. Much recent electrophysiology points to neural areas responsive to multisensory stimulation, particularly audiovisual stimulation. However, psychophysical evidence for functional integration of audiovisual motion has been ambiguous. In this study we measured perception of an audiovisual form of biological motion, tap dancing. The results show that the auditory tap information interacts with the visual motion information, but only when the two are in synchrony, demonstrating a functional combination of audiovisual information in a natural task. The advantage of multimodal combination exceeded the optimal maximum-likelihood prediction.

Pooling and segmenting motion signals, Vision Res, 49 (10), 1065-1072.

Humans are extremely sensitive to visual motion, largely because local motion signals can be integrated over a large spatial region. On the other hand, summation is often not advantageous, for example when segmenting a moving stimulus against a stationary or oppositely moving background. In this study we show that the spatial extent of motion integration is not compulsory, but is subject to voluntary attentional control. Measurements of motion coherence sensitivity with summation and search paradigms showed that human observers can combine motion signals from cued regions or patches in an optimal manner, even when the regions are quite distinct and remote from each other. Further measurements of contrast sensitivity reinforce previous studies showing that motion integration is preceded by a local analysis akin to contrast thresholding (or intrinsic uncertainty). The results were well modelled by two standard signal-detection-theory models.
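Optimal combination across cued regions can be stated in standard signal-detection terms: with independent noise across patches, sensitivities add in quadrature, so d' grows as the square root of the number of patches and coherence thresholds fall as its inverse. A minimal sketch (the patch count and single-patch d' are illustrative, not values from the study):

```python
import numpy as np

def d_combined(d_per_patch, n_patches):
    """Ideal-observer sensitivity for N independent, equally informative patches:
    d' values add in quadrature, giving sqrt(N) growth."""
    return np.sqrt(n_patches) * d_per_patch

d_single = 1.0
for n in (1, 2, 4):
    # Coherence threshold improves by the same sqrt(N) factor
    print(n, d_combined(d_single, n))
```

Quadrature summation is the benchmark against which "optimal" combining from cued patches is judged; falling short of sqrt(N) would indicate lossy pooling, and the abstract reports observers matched the optimal prediction.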