Castellotti, S., Montagnini, A., & Del Viva, M. M.
Early visual saliency based on isolated optimal features
Frontiers in Neuroscience, 15, 645743.
Under fast viewing conditions, the visual system extracts salient and simpliﬁed representations of complex visual scenes. Saccadic eye movements optimize such visual analysis through the dynamic sampling of the most informative and salient regions in the scene. However, a general deﬁnition of saliency, as well as its role for natural active vision, is still a matter for discussion. Following the general idea that visual saliency may be based on the amount of local information, a recent constrained maximumentropy model of early vision, applied to natural images, extracts a set of local optimal information-carriers, as candidate salient features. These optimal features proved to be more informative than others in fast vision, when embedded in simpliﬁed sketches of natural images. In the present study, for the ﬁrst time, these features were presented in isolation, to investigate whether they can be visually more salient than other nonoptimal features, even in the absence of any meaningful global arrangement (contour, line, etc.). In four psychophysics experiments, fast discriminability of a compound of optimal features (target) in comparison with a similar compound of non-optimal features (distractor) was measured as a function of their number and contrast. Results showed that the saliency predictions from the constrained maximum-entropy model are well veriﬁed in the data, even when the optimal features are presented in smaller numbers or at lower contrast. In the eye movements experiment, the target and the distractor compounds were presented in the periphery at different angles. Participants were asked to perform a simple choice-saccade task. Results showed that saccades can select informative optimal features spatially interleaved with non-optimal features even at the shortest latencies. Saccades’ choice accuracy and landing position precision improved with SNR. In conclusion, the optimal features predicted by the reference model, turn out to be more salient than others, despite the lack of any clues coming from a global meaningful structure, suggesting that they get preferential treatment during fast image analysis. Also, peripheral fast visual processing of these informative local features is able to guide gaze orientation. We speculate that active vision is efﬁciently adapted to maximize information in natural visual scenes.