English abstract
As part of, and as a consequence of, the 4th industrial revolution, development in
many sectors has been moving towards automation. Increasingly, applications and
systems require less human intervention and are capable of making unassisted
decisions. Such applications and systems have been labeled intelligent. This
“intelligence” is the result of their ability to “learn”, using appropriate machine
learning (ML) techniques. Convolutional Neural Networks (CNNs) are applications
of these machine learning techniques.
Pooling is a major component of Convolutional Neural Networks (CNNs), having
considerable influence on the learning process of classification models, and of CNNs
in general. Pooling reduces the spatial dimensions of feature maps
and, consequently, the number of parameters to be learnt by the CNN. This reduces
the computational cost and helps mitigate overfitting. Many pooling techniques have been
proposed in the literature, such as Max Pooling (MaxPool), Average Pooling
(AvgPool), Stochastic Pooling, and Spectral Pooling. Each of these techniques has
its advantages and limitations.
Depending on the type of data and the structure of the CNN, some pooling methods
are better suited to certain situations than their counterparts. Two standard,
popular pooling methods are MaxPool and AvgPool. MaxPool is very efficient in
extracting the most
important features within a receptive field and is robust to minor translations and
rotation of the input image, while AvgPool tends to preserve the background
information and is robust to outliers. However, these methods present some
limitations. MaxPool, when choosing the maximum value, disregards the other
non-maximum values, while AvgPool gives every value in the input equal
importance. This encouraged the development of more pooling methods to satisfy
specific CNN needs.
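The contrast between the two standard methods can be illustrated with a minimal
NumPy sketch. It assumes non-overlapping windows with stride equal to the window
size; the helper name pool2d and the sample feature map are hypothetical
illustrations and are not taken from the thesis or its proposed methods.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2D pooling over size x size windows (stride == size)."""
    h, w = x.shape
    # Trim so both dimensions divide evenly by the window size.
    x = x[: h - h % size, : w - w % size]
    # View the map as (rows, size, cols, size) blocks, then reduce each block.
    blocks = x.reshape(x.shape[0] // size, size, x.shape[1] // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # keeps only the largest value per window
    if mode == "avg":
        return blocks.mean(axis=(1, 3))  # weights every value in the window equally
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical 4x4 feature map used only for illustration.
feature_map = np.array([
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 9.0, 0.0, 4.0],
    [5.0, 5.0, 1.0, 1.0],
    [5.0, 5.0, 1.0, 1.0],
])

max_out = pool2d(feature_map, mode="max")
avg_out = pool2d(feature_map, mode="avg")
print(max_out)
print(avg_out)
```

In the top-left window, MaxPool keeps 9 and discards 1, 2, and 3, whereas AvgPool
returns 3.75, letting the three smaller values pull the result down.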
This thesis proposes the addition of a parameter to the pooling process through a
pooling kernel. The proposed pooling methods are adaptive versions of the two
standard methods, MaxPool and AvgPool. The thesis investigates how this addition
affects the pooling process and the subsequent CNN model performance.
The experiments conducted in this thesis show that adding parameters to the
pooling process produces better results than the standard MaxPool and AvgPool
methods.