Computer Vision: Global Average Pooling

Apart from its last few layers, a typical image classifier is essentially a feature extractor.

A feature extractor is made of many convolution layers with activation functions, interleaved with occasional spatial downsampling operations such as max pooling. It produces multiple feature maps that flow down the network.
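
To make the idea concrete, here is a minimal NumPy sketch of a single feature-extractor stage: one convolution, a ReLU activation, and a 2x2 max pooling step, followed by the global average pooling operation named in the title. It is only an illustration under stated assumptions: the function names, the 8x8 input, the 3x3 kernels, and the random weights are all made up for this example, and a real network would stack many such stages with learned weights.

```python
import numpy as np

def conv2d(image, kernels):
    """'Valid' cross-correlation of an (H, W, C_in) image
    with (K, K, C_in, C_out) kernels; returns (H-K+1, W-K+1, C_out)."""
    h, w, _ = image.shape
    k, _, _, c_out = kernels.shape
    out = np.zeros((h - k + 1, w - k + 1, c_out))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = image[i:i + k, j:j + k, :]
            # Contract the patch against every kernel at once.
            out[i, j, :] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out

def relu(x):
    """Element-wise activation: negative responses are set to zero."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over the two spatial axes."""
    h, w, c = x.shape
    h2, w2 = h // size, w // size
    x = x[:h2 * size, :w2 * size, :]  # drop rows/columns that do not fill a window
    return x.reshape(h2, size, w2, size, c).max(axis=(1, 3))

def global_average_pool(x):
    """Collapse each (H, W) feature map to a single number: its mean."""
    return x.mean(axis=(0, 1))

# Hypothetical input and weights, chosen only to show the shapes involved.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                # 8x8 RGB image
kernels = rng.standard_normal((3, 3, 3, 4))  # four 3x3 filters

feature_maps = max_pool2d(relu(conv2d(image, kernels)))
pooled = global_average_pool(feature_maps)

print(feature_maps.shape)  # (3, 3, 4): four spatially compressed feature maps
print(pooled.shape)        # (4,): one value per feature map
```

Each of the four filters yields its own feature map; max pooling halves the spatial resolution, and global average pooling reduces every map to one number regardless of its spatial size.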

During training, the feature extractor learns to represent the important features of the objects in the image in different feature maps. While the first few layers are limited to simple features like edges and simple shapes, the…
