English abstract
This thesis examines unsupervised learning algorithms, specifically the clustering methods
K-means, Fuzzy C-means, and Hierarchical clustering, and applies them to classify wells in
the oil and gas sector. The study first highlights the importance of accurate well
classification for operational efficiency and the benefits of applying machine learning
(ML) to that task. It
assesses how unsupervised models can be used to transform operations in the energy
industry. The investigation aims to determine the comparative benefits of each clustering
algorithm in the energy sector. Key factors considered for each model include reservoir
characteristics, well settings, and operational variables. These attributes determine how
effectively each algorithm classifies wells. The thesis also highlights the significance of
feature selection in unsupervised learning, in particular its role in improving clustering
precision and model efficacy.
The methodology follows a multi-step process from data collection through model
evaluation. Each algorithm is examined based on its performance in classifying the well
data. The analysis validates the clustering models with several evaluation metrics and
weighs the advantages and limitations of each model. The investigation finds that K-means
is simple and effective, while
Hierarchical clustering allows one to extract nuanced insights into the structure of a
dataset. Fuzzy C-means enables soft clustering yet struggles to define clearly separated
clusters.
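The contrast between hard assignment (K-means) and soft clustering (Fuzzy C-means) can be
illustrated with a minimal sketch. It is not the implementation used in the thesis: the
function names, the two-feature well representation, the example cluster centers, and the
fuzzifier value m = 2 are all assumptions made for illustration. The membership rule is
the standard Fuzzy C-means formula for fixed centers.

```python
import math

def hard_assign(point, centers):
    """K-means style assignment: index of the single nearest center."""
    dists = [math.dist(point, c) for c in centers]
    return dists.index(min(dists))

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy C-means memberships of one point for fixed centers.

    u_j = d_j^(-p) / sum_k d_k^(-p), with p = 2 / (m - 1) and m > 1.
    """
    dists = [math.dist(point, c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centers: share full membership.
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    p = 2.0 / (m - 1.0)
    inv = [d ** -p for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

# Hypothetical wells described by two scaled features (illustrative values only).
centers = [(0.0, 0.0), (10.0, 0.0)]
well = (4.0, 0.0)

label = hard_assign(well, centers)              # one cluster index
memberships = fuzzy_memberships(well, centers)  # one degree per cluster, summing to 1
print(label, [round(u, 3) for u in memberships])
```

A well lying between two centers receives a single label under hard assignment but
graded memberships under the fuzzy rule, which is why soft clustering yields less sharply
separated groups.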
The study establishes that all three models have transformational potential for the
industry and that unsupervised learning can effectively address the sector's
classification needs. Future investigations should delve deeper into algorithm
enhancement, integration with supervised learning, and alternative clustering models. The
current findings can be subjected to further validation studies and applied to real-world
datasets. Doing so would increase operational efficiency and enhance decision-making in
the oil and gas industry.