Software

Antonio J. Rodríguez-Sánchez, John K. Tsotsos, The roles of endstopped and curvature tuned computations in a hierarchical representation of 2D shape. PLoS ONE 7 (8), pp. 1–13, 2012.

Matlab program
Software Licence:   GNU LESSER GENERAL PUBLIC LICENSE V3.0

Link to paper

Abstract

That shape is important for perception has been known for almost a thousand years (thanks to Alhazen in 1083) and has been a subject of study ever since by scientists and philosophers (such as Descartes, Helmholtz or the Gestalt psychologists). Shapes are important object descriptors. If there were any remote doubt regarding the importance of shape, recent experiments have shown that intermediate areas of primate visual cortex such as V2, V4 and TEO are involved in analyzing shape features such as corners and curvatures. The primate brain appears to perform a wide variety of complex tasks by means of simple operations. These operations are applied across several layers of neurons, representing increasingly complex, abstract intermediate processing stages. Recently, new models have attempted to emulate the human visual system. However, the role of intermediate representations in the visual cortex and their importance have not been adequately studied in computational modeling.

This paper proposes a model of shape-selective neurons whose shape-selectivity is achieved through intermediate layers of visual representation not previously fully explored. We hypothesize that hypercomplex - also known as endstopped - neurons play a critical role in achieving shape selectivity and show how shape-selective neurons may be modeled by integrating endstopping and curvature computations. This model - a representational and computational system for the detection of 2-dimensional object silhouettes that we term 2DSIL - provides a highly accurate fit with neural data and replicates responses from neurons in area V4 with an average of 83% accuracy. We successfully test a biologically plausible hypothesis on how to connect early representations based on Gabor or Difference of Gaussian filters and later representations closer to object categories, without the need for a learning phase as in most recent models.
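
The core idea, building endstopped responses on top of Gabor filter outputs, can be illustrated with a short, self-contained sketch. This is not the released Matlab program: the filter parameters, the flank offset and the simple centre-minus-flanks suppression are assumptions made purely for illustration.

```python
# Illustrative sketch (not the authors' 2DSIL code): an endstopped response is
# built by suppressing a complex-cell-like response with two flanking responses
# displaced along the cell's preferred orientation.
import numpy as np
from scipy.ndimage import shift
from scipy.signal import convolve2d

def gabor(size, wavelength, theta, sigma, phase):
    """Gabor patch whose preferred bar orientation is theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # along the preferred bar
    yr = -x * np.sin(theta) + y * np.cos(theta)   # across the preferred bar
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * yr / wavelength + phase)

def complex_cell(image, theta, size=21, wavelength=8.0, sigma=4.0):
    """Phase-invariant (complex-cell-like) response: quadrature-pair energy."""
    even = convolve2d(image, gabor(size, wavelength, theta, sigma, 0.0), mode="same")
    odd = convolve2d(image, gabor(size, wavelength, theta, sigma, np.pi / 2), mode="same")
    return np.sqrt(even ** 2 + odd ** 2)

def endstopped_cell(image, theta, flank_offset=10.0, w=0.5):
    """Centre response suppressed by two flanks displaced along the bar."""
    c = complex_cell(image, theta)
    dy, dx = flank_offset * np.sin(theta), flank_offset * np.cos(theta)
    flank1 = shift(c, (dy, dx), order=1, mode="nearest")
    flank2 = shift(c, (-dy, -dx), order=1, mode="nearest")
    return np.maximum(c - w * (flank1 + flank2), 0.0)   # half-wave rectification

if __name__ == "__main__":
    # A horizontal bar: the endstopped response peaks near the bar terminations
    # rather than along the uniform interior of the contour.
    img = np.zeros((128, 128))
    img[62:66, 10:64] = 1.0
    resp = endstopped_cell(img, theta=0.0)
    row, col = np.unravel_index(np.argmax(resp), resp.shape)
    print("strongest endstopped response at row", row, "column", col)
```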

Thomas Hoyoux, Antonio Rodríguez-Sánchez, Justus Piater, Can Computer Vision Problems Benefit from Structured Hierarchical Classification? Machine Vision and Applications 27, pp. 1299–1312, 2016.

Program
Software Licence:   GNU LESSER GENERAL PUBLIC LICENSE V3.0

Link to paper

Abstract

Research in the field of supervised classification has mostly focused on the standard, so-called “flat” classification approach, where the problem classes live in a trivial, one-level semantic space. There is however an increasing interest in the hierarchical classification approach, where a performance gain is expected by incorporating prior taxonomic knowledge about the classes into the learning process. Intuitively, the hierarchical approach should be beneficial in general for the classification of visual content, as suggested by the fact that humans seem to organize objects into hierarchies based on visually perceived similarities. In this paper, we provide an analysis that aims to determine the conditions under which the hierarchical approach can consistently give better performance than the flat approach for the classification of visual content. In particular, we (1) show how hierarchical methods can fail to outperform flat methods when applied to real vision-based classification problems, and (2) investigate the underlying reasons for the lack of improvement, by applying the same methods to synthetic datasets in a simulation. Our conclusion is that the use of high-level hierarchical feature representations is crucial for obtaining a performance gain with the hierarchical approach, and that poorly chosen prior taxonomies hinder this gain even when proper high-level features are used.
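
The distinction between the two approaches can be made concrete with a small sketch. This is not the paper's experimental pipeline: the two-level taxonomy, the synthetic Gaussian data and the logistic-regression base classifiers are assumptions chosen purely for illustration.

```python
# Flat versus top-down hierarchical classification on toy data: the flat model
# predicts leaf classes directly; the hierarchical scheme first predicts the
# superclass, then applies a local classifier restricted to that superclass.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Two superclasses, each with two leaf classes; leaves within a superclass
# share a common offset, which is what a good taxonomy should capture.
superclass_of = {0: "A", 1: "A", 2: "B", 3: "B"}
means = {0: [0, 0], 1: [2, 0], 2: [8, 8], 3: [10, 8]}
X = np.vstack([rng.normal(means[c], 1.5, size=(200, 2)) for c in range(4)])
y = np.repeat(np.arange(4), 200)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

# Flat approach: one multi-class classifier over the four leaf classes.
flat = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
flat_acc = flat.score(Xte, yte)

# Hierarchical approach: a root classifier over superclasses, plus one local
# leaf classifier per superclass.
ytr_super = np.array([superclass_of[c] for c in ytr])
root = LogisticRegression(max_iter=1000).fit(Xtr, ytr_super)
local = {
    s: LogisticRegression(max_iter=1000).fit(Xtr[ytr_super == s], ytr[ytr_super == s])
    for s in ("A", "B")
}
pred_super = root.predict(Xte)
pred_leaf = np.array(
    [local[s].predict(x.reshape(1, -1))[0] for s, x in zip(pred_super, Xte)]
)
hier_acc = np.mean(pred_leaf == yte)

print(f"flat accuracy: {flat_acc:.3f}, hierarchical accuracy: {hier_acc:.3f}")
```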

Hanchen Xiong, Antonio Rodríguez-Sánchez, Sandor Szedmak, Justus Piater, Diversity priors for learning early visual features. Frontiers in Computational Neuroscience 9 (104), 2015.

Program
Software Licence:   GNU LESSER GENERAL PUBLIC LICENSE V3.0

Link to paper

Abstract

This paper investigates how utilizing diversity priors can discover early visual features that resemble their biological counterparts. The study is mainly motivated by the sparsity and selectivity of activations of visual neurons in area V1. Most previous work on computational modeling emphasizes selectivity or sparsity independently. However, we argue that selectivity and sparsity are just two epiphenomena of the diversity of receptive fields, which has rarely been exploited in learning. In this paper, to verify our hypothesis, restricted Boltzmann machines (RBMs) are employed to learn early visual features by modeling the statistics of natural images. Considering RBMs as neural networks, the receptive fields of neurons are formed by the connection weights between hidden and visible nodes. Due to the conditional independence in RBMs, there is no mechanism to coordinate the activations of individual neurons or the whole population. A diversity prior is introduced in this paper for training RBMs. We find that the diversity prior can indeed ensure both sparsity and selectivity of neuron activations simultaneously. The learned receptive fields yield a high degree of biological similarity in comparison to physiological data. Also, the corresponding visual features display good generative capability in image reconstruction.
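
A minimal sketch of the training idea follows; it is not the released program. A Bernoulli-Bernoulli RBM is trained with one-step contrastive divergence, and a simple pairwise-decorrelation penalty on the rows of W stands in for the diversity prior, whose exact form in the paper differs. The random binary patches are placeholders for whitened natural image patches.

```python
# RBM trained with CD-1 plus a stand-in diversity regulariser that penalises
# squared pairwise inner products between receptive fields (rows of W).
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr, lam = 64, 16, 0.05, 0.01   # 8x8 patches, 16 features

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W = 0.01 * rng.standard_normal((n_hidden, n_visible))
b_h = np.zeros(n_hidden)
b_v = np.zeros(n_visible)

data = (rng.random((500, n_visible)) > 0.5).astype(float)  # placeholder patches

for epoch in range(20):
    for v0 in data:
        # --- CD-1: positive phase, one Gibbs step, negative phase ---
        p_h0 = sigmoid(W @ v0 + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        p_v1 = sigmoid(W.T @ h0 + b_v)
        v1 = (rng.random(n_visible) < p_v1).astype(float)
        p_h1 = sigmoid(W @ v1 + b_h)

        grad_W = np.outer(p_h0, v0) - np.outer(p_h1, v1)

        # --- diversity penalty: gradient of sum_{i<j} (w_i . w_j)^2 pushes
        # receptive fields apart so no two neurons learn the same feature ---
        gram = W @ W.T
        np.fill_diagonal(gram, 0.0)
        grad_W -= lam * (gram @ W)

        W += lr * grad_W
        b_h += lr * (p_h0 - p_h1)
        b_v += lr * (v0 - p_v1)

print("learned receptive fields:", W.shape)  # each row is one 8x8 feature
```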