English abstract
The continuous growth in the size and content of web pages has increased the complexity of handling the information in a website. As a result of this growth, some popular pages end up buried deep in the website hierarchy, so that users must pass through several other pages before reaching the target page. Consequently, users find it difficult to access the desired information simply and quickly, and developers have to reorganize their websites manually to bring highly demanded pages within reach. Web mining can help address users' difficulties and assist developers in restructuring their websites. Many research efforts have sought to extract useful information from a given website using its content, structure, and user access patterns. This research proposes and evaluates a recommendation system that mines the user access log file to suggest efficient website restructuring. The raw user access log data was prepared using data preprocessing techniques (data cleaning, user and session identification, and path completion) so that the proposed recommendation system could analyze it and extract user access patterns for a given website. The proposed recommendation system is based on the existing recommendation system FTPW, the Frequency & Time based Page Weight algorithm. FTPW assigns a quantitative weight to each page in the user access log by computing three parameters: frequency, time spent on the page, and page rank value. The frequency and the time spent on a page are computed from analysis of the user access log, while the page rank value is measured with the standard page rank algorithm. That algorithm analyzes the hyperlink structure of a website and treats a page as important or popular if it has more incoming links. It yields static rank values, under which a popular page tends to remain popular.
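The preprocessing steps named above (data cleaning, user identification, session identification) and the two log-derived parameters (frequency and time spent on a page) can be illustrated with a minimal Python sketch. It is not the implementation used in this research. The record fields (`ip`, `agent`, `time`, `url`, `status`), the list of static file suffixes, the use of IP address plus user agent to identify a user, and the 30-minute session timeout are all assumptions made for illustration. Path completion is omitted.

```python
from datetime import datetime, timedelta

# Assumed values for illustration; the abstract does not specify them.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
SESSION_TIMEOUT = timedelta(minutes=30)

def clean(records):
    """Data cleaning: keep successful page requests, drop static resources."""
    return [r for r in records
            if r["status"] == 200
            and not r["url"].lower().endswith(STATIC_SUFFIXES)]

def sessionize(records):
    """Group hits per user (assumed: IP + user agent), split on long gaps."""
    by_user = {}
    for r in sorted(records, key=lambda r: r["time"]):
        by_user.setdefault((r["ip"], r["agent"]), []).append(r)
    sessions = []
    for hits in by_user.values():
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit["time"] - prev["time"] > SESSION_TIMEOUT:
                sessions.append(current)  # inactivity gap closes the session
                current = []
            current.append(hit)
        sessions.append(current)
    return sessions

def page_stats(sessions):
    """Return (visit frequency, total seconds spent) per URL.

    Time on a page is estimated as the gap to the next hit in the same
    session, so the last page of each session contributes no time.
    """
    freq, time_spent = {}, {}
    for session in sessions:
        for i, hit in enumerate(session):
            url = hit["url"]
            freq[url] = freq.get(url, 0) + 1
            if i + 1 < len(session):
                gap = (session[i + 1]["time"] - hit["time"]).total_seconds()
                time_spent[url] = time_spent.get(url, 0.0) + gap
    return freq, time_spent
```

A typical call chain would be `page_stats(sessionize(clean(records)))`, where `records` is a list of dictionaries parsed from the access log.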
The proposed recommendation system modifies the page rank algorithm used by FTPW by using the number of visits to a page instead of its number of incoming links. The experimental results and evaluation show that, when visits are used in place of incoming links, a page with many incoming links is not always important; the importance of a page varies according to users' behavior. Thus, the suggested page rank algorithm gives dynamic results, unlike the static results of the standard page rank algorithm. The proposed recommendation system assigns a quantitative weight to a page using the time spent on the page and the value of the suggested page rank algorithm. It also considers the depth (level) of the page in the website hierarchy as a parameter. Therefore, pages that gain a high weight value and are located at a greater depth (level 3 or deeper) can be recommended for the website restructuring process. The proposed recommendation system will improve the accessibility and reachability of highly demanded pages through shortcut links to the recommended pages.
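One possible reading of the visit-based ranking and the depth-aware recommendation step is sketched below in Python. The abstract does not give the exact formulas, so everything here is an assumption made for illustration. A page's rank is shared among its targets in proportion to the observed number of visits along each link, rather than equally per link. The page weight is taken as rank multiplied by average time on the page. The damping factor of 0.85, the iteration count, and the function and parameter names are hypothetical.

```python
from collections import defaultdict

def visit_weighted_rank(transitions, damping=0.85, iterations=50):
    """Rank pages from visit counts.

    transitions maps (source, target) -> number of observed visits.
    Assumption: rank flows along a link in proportion to its visit count.
    """
    pages = sorted({p for edge in transitions for p in edge})
    n = len(pages)
    out_total = defaultdict(int)
    for (src, _dst), count in transitions.items():
        out_total[src] += count
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing visits is spread evenly.
        dangling = sum(rank[p] for p in pages if out_total[p] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for (src, dst), count in transitions.items():
            new_rank[dst] += damping * rank[src] * count / out_total[src]
        rank = new_rank
    return rank

def recommend(rank, avg_time, depth, min_depth=3, top_k=2):
    """Return the highest-weighted pages at depth >= min_depth.

    Assumption: weight = rank * average time on page (in seconds).
    """
    weight = {p: rank[p] * avg_time.get(p, 0.0) for p in rank}
    deep = [p for p in rank if depth.get(p, 0) >= min_depth]
    return sorted(deep, key=lambda p: weight[p], reverse=True)[:top_k]
```

Pages returned by `recommend` would be the candidates for shortcut links from a shallower level of the hierarchy. Because the ranks depend on the visit counts in `transitions`, rerunning the sketch on a newer log can change the recommendation, which matches the dynamic behavior described above.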