The advantages of coarse-to-fine processing – a Computer Vision Approach

A Brilhault (1), R Guyonneau (2), S Thorpe (1)

(1) CerCo, Université Toulouse 3 - CNRS, France
(2) R&D, Spikenet Technology, France

Contact: simon.thorpe@cerco.ups-tlse.fr

The idea that the visual system can work more efficiently if it uses a "coarse-to-fine" processing strategy is a popular one. Here we used extensive testing of object and scene recognition with a biologically inspired computer vision system developed by SpikeNet Technology SARL (http://www.spikenet-technology.com) to demonstrate that these advantages are very real. The standard SpikeNet recognition process typically uses image patches roughly 30 pixels across. This gives good selectivity combined with reasonably high robustness to image transformations such as rotation, size changes, and other 2D and 3D transformations. Using smaller patch sizes allows the system to detect a given target over a wider range of transformations, but with less selectivity: 30-pixel models tolerate rotations of about 10°, whereas 18-pixel models tolerate up to 20°. However, if we combine an initial processing phase using a relatively coarse image patch (e.g. 18 pixels across) with a second processing phase using image patches 50% wider, recognition is not only more robust but also much more efficient. The combined approach uses fewer neurons than the standard approach would need, and the software implementation runs 4-5 times faster on standard computer hardware than the original algorithm.
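
The two-phase idea can be illustrated with a minimal sketch. It is not the SpikeNet algorithm, which is based on spiking neurons and is not described in detail here; plain normalized cross-correlation is swapped in as the matching score. All names (`ncc`, `downsample`, `coarse_to_fine_search`), the downsampling factor, and both thresholds are assumptions chosen for illustration, and the patch sizes do not reproduce the 18-pixel and 50%-wider patches mentioned above. A permissive coarse pass scans every position at low resolution, and a stricter fine pass is run only around the surviving candidates.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def downsample(img, factor):
    """Block-average a 2D array by an integer factor (edges are cropped)."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))


def coarse_to_fine_search(image, template, factor=2,
                          coarse_thresh=0.5, fine_thresh=0.9):
    """Return (best_match, fine_evals); best_match is (y, x, score) or None.

    Illustrative only: thresholds and factor are hypothetical values.
    """
    small_img = downsample(image, factor)
    small_tpl = downsample(template, factor)
    sh, sw = small_tpl.shape
    th, tw = template.shape
    H, W = image.shape

    # Phase 1: cheap, permissive scan of every position at low resolution.
    candidates = []
    for y in range(small_img.shape[0] - sh + 1):
        for x in range(small_img.shape[1] - sw + 1):
            if ncc(small_img[y:y + sh, x:x + sw], small_tpl) >= coarse_thresh:
                candidates.append((y * factor, x * factor))

    # Phase 2: strict full-resolution check, only near coarse candidates.
    best = None
    checked = set()
    for cy, cx in candidates:
        for y in range(max(0, cy - factor + 1), min(H - th, cy + factor - 1) + 1):
            for x in range(max(0, cx - factor + 1), min(W - tw, cx + factor - 1) + 1):
                if (y, x) in checked:
                    continue
                checked.add((y, x))
                score = ncc(image[y:y + th, x:x + tw], template)
                if score >= fine_thresh and (best is None or score > best[2]):
                    best = (y, x, score)
    return best, len(checked)


# Usage: locate a 32x32 patch cut from a 96x96 random image.
rng = np.random.default_rng(0)
image = rng.random((96, 96))
template = image[40:72, 24:56].copy()
best, fine_evals = coarse_to_fine_search(image, template)
exhaustive_evals = (96 - 32 + 1) ** 2  # full-resolution positions in a brute-force scan
print(best, fine_evals, exhaustive_evals)
```

In this toy setting the expensive full-resolution comparison is run at only a handful of positions instead of at every position in the image, which is the kind of saving the coarse-to-fine strategy aims for. The speed-up measured on this sketch says nothing about the 4-5 times figure reported for the actual system.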
