Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition

Dominik Scherer, Andreas Müller, and Sven Behnke

University of Bonn, Institute of Computer Science VI, Autonomous Intelligent Systems Group, Römerstr. 164, 53117 Bonn, Germany
scherer@ais.uni-bonn.de
amueller@ais.uni-bonn.de
behnke@cs.uni-bonn.de
http://www.ais.uni-bonn.de

Abstract. A common practice for obtaining invariant features in object recognition models is to aggregate multiple low-level features over a small neighborhood. However, the differences between those models make it hard to compare the properties of different aggregation functions. Our aim is to gain insight into different functions by directly comparing them on a fixed architecture for several common object recognition tasks. Empirical results show that a maximum pooling operation significantly outperforms subsampling operations. Despite their shift-invariant properties, overlapping pooling windows offer no significant improvement over non-overlapping pooling windows. By applying this knowledge, we achieve state-of-the-art error rates of 4.57% on the NORB normalized-uniform dataset and 5.6% on the NORB jittered-cluttered dataset.
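
The aggregation functions compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names pool2d and mean, the 4x4 feature map, and the 2x2 window size are all assumptions chosen for illustration, and subsampling is approximated here as a plain window average, which may omit details of the formulation evaluated in the paper. A stride equal to the window size gives non-overlapping windows; a smaller stride gives overlapping ones.

```python
def pool2d(image, size, stride, reduce_fn):
    """Apply reduce_fn to every size x size window, stepping by `stride`.

    Hypothetical helper for illustration; `image` is a list of equal-length rows.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            # Flatten the current window into a list of values.
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce_fn(window))
        out.append(out_row)
    return out


def mean(values):
    """Plain average, used as a stand-in for a subsampling operation."""
    return sum(values) / len(values)


# Made-up sparse feature map: one strong activation per 2x2 block.
feature_map = [
    [1, 0, 0, 0],
    [0, 0, 0, 2],
    [0, 3, 0, 0],
    [0, 0, 4, 0],
]

# Non-overlapping windows: stride equals the window size.
max_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=max)
avg_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=mean)

# Overlapping windows: stride smaller than the window size.
overlap_max = pool2d(feature_map, size=2, stride=1, reduce_fn=max)

print(max_pooled)   # [[1, 2], [3, 4]]
print(avg_pooled)   # [[0.25, 0.5], [0.75, 1.0]]
print(overlap_max)  # [[1, 0, 2], [3, 3, 2], [3, 4, 4]]
```

In this toy input, the maximum keeps each block's strongest activation at full strength, while the average dilutes it by the zeros around it. The overlapping variant produces a larger output (3x3 instead of 2x2) for the same input.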

LNCS 6354, p. 92 ff.

© Springer-Verlag Berlin Heidelberg 2010