English abstract
This work addresses breast cancer, one of the main causes of mortality in women. Early detection remains crucial for improving survival among breast cancer patients. Two main topics are investigated in this project. The first is the automatic detection of breast cancer using complete blood count (CBC) data. The second is the prediction of survival among breast cancer patients. Both detection and prediction proceed in two stages: feature selection and classification. Several feature selection techniques are used, each belonging to either the filter approach or the wrapper approach. In the classification stage, a number of machine learning algorithms are used to classify the data into two classes: cancer and cancer-free for detection, and survive and die for survival prediction. Finally, performance is assessed using evaluation metrics such as accuracy, sensitivity, specificity, and area under the curve (AUC). For breast cancer detection, CBC results taken at the first visit to the clinic perform better, with 78.8% accuracy and an AUC of 0.89. For survival prediction, CBC results taken at the diagnosis date perform better, with 90.5% accuracy and an AUC of 0.93. The algorithms are evaluated using CBC data collected and labelled by experts from Sultan Qaboos University Hospital.
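As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, sensitivity, specificity, and AUC for a two-class problem. The function names `binary_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only; they are not taken from the study's data or code.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return math.nan
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data): 1 = cancer, 0 = cancer-free.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = binary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.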