Document

Evaluating Correlation-based network models combined with machine learning techniques for discovery in SLE dataset.

Source
Master's thesis
Country
Oman
City
Muscat
Publisher
Sultan Qaboos University
Gregorian
2023
Language
English
Thesis Type
Master's thesis
English abstract
Systemic lupus erythematosus (SLE) is an autoimmune disease that results in widespread inflammation and tissue damage across various organs, such as the joints, skin, brain, lungs, kidneys, and blood vessels. Though there is no known cure for lupus, medical interventions and lifestyle changes can help manage the disease. In recent years, the availability of patient data and electronic medical records has provided unprecedented opportunities to detect correlations among disease states, such as the link between high blood pressure and heart disease. These data and records can be used to generate new research hypotheses for future studies. The objective of our study is to facilitate early SLE diagnosis and reduce the associated time, cost, and effort by building a Correlation Networks Model (CNM) that can assist physicians in determining whether a patient has SLE. We apply the model to a local dataset of 1155 patients, collected from the Rheumatology clinic at Sultan Qaboos University Hospital (SQUH) between 2006 and 2020. The dataset comprises information on patient demographics, clinical history, and laboratory tests.
Aims: (a) to identify severity clusters in Omani patients diagnosed with SLE, (b) to detect features that are associated with disease severity, and (c) to investigate correlations among patients classified as lupus patients in Oman.
Methods: Our approach involves gathering a wide range of data from SQUH, including demographic, clinical, and laboratory data. Once collected, the data pass through several stages, starting with data cleaning and feature extraction. We then perform exploratory data analysis, identifying the data types and examining the data distribution, to gain a comprehensive understanding of the dataset. Two clustering methods, the Markov Cluster Algorithm and K-means clustering, are used to cluster the dataset. Finally, we evaluate the clustering results using a correlation model.
Results: The exploratory data analysis shows that females constitute 88% of SLE patients, while males account for 12%. The regions of Muscat and Al Batinah have the highest numbers of SLE patients, with 31% and 24% respectively. The regions of Al Wasta and Musandam have the fewest SLE patients. The clustering analysis grouped patients into clusters, including mild and severe clusters. Patients in the severe cluster have a higher prevalence of Anti-dsDNA, rheumatoid arthritis, and thyroid-stimulating hormone (TSH).
Category
Theses and Dissertations
