Research On Web Usage Mining Methodology Based On ACO

Posted on: 2010-09-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H F Ling
Full Text: PDF
GTID: 1118360302468471
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of the Internet, resources are shared and information is exchanged on a global scale. However, the information overload and information loss caused by the exponential growth of information on the Internet increasingly limit users' ability to use information resources efficiently. Personalization services offered by web sites can improve users' satisfaction with their visits. The critical issue facing web personalized recommendation is gaining a deep understanding of the behavioral models of large numbers of anonymous users. Because conventional personalization methods have difficulty dealing with anonymous users, web usage mining is an effective way to address this issue. As an important part of web data mining, Web Usage Mining (WUM) is the process of mining users' navigation patterns by applying data mining techniques to web log files. By understanding users' navigational behavior, WUM can provide navigation recommendation services. Ant Colony Optimization (ACO), a branch of swarm intelligence inspired by the foraging behavior of real ants, is an intelligent algorithm that mimics the collective behavior of ants. Owing to its advantages in solving complex optimization problems, ACO has been applied in a number of areas. It is therefore important, in both theory and practice, to apply ACO to WUM in order to discover users' navigation patterns and offer navigation recommendation services.

This dissertation first studies the convergence of ACO and the preprocessing stage of WUM. It then applies ACO to mining users' navigation patterns and to clustering web users. The main work and innovative research results are as follows:

(1) On the basis of a convergence analysis of the Graph-based Ant System, we improve the basic ant algorithm and study the convergence of the improved algorithm.
We improve the Ant Cycle model of the basic ant algorithm in three respects. First, only the best ant may deposit pheromone: after the t-th iteration, the pheromone on every arc of E evaporates, and only the arcs that make up the best solution found in the first t iterations are reinforced. This encourages ants to search in the neighborhood of the best path found so far and makes the exploration of the solution space more directed. Second, the residual pheromone amount is bounded: to avoid premature convergence to a local optimum, we set a lower bound on the pheromone remaining on each arc. Third, the pheromone evaporation coefficient changes adaptively. In the early stage of the algorithm, this change increases randomness, which helps the search find better solutions; in the later stage, it decreases randomness and increases convergence speed, so that the algorithm gradually converges to the global optimum. The convergence of the improved algorithm is then proved: under only two basic hypotheses, the algorithm converges to the optimum with probability close to 1. Experimental results show that, compared with the basic ant colony algorithm, the improved algorithm is effective, with stronger global search ability and faster convergence.

(2) On the basis of an analysis of the existing Web usage data preprocessing process, we study session identification, a critical step in preprocessing, and propose a session identification approach based on an adaptive time threshold. The conventional time-oriented method is limited because it identifies sessions using only a fixed time threshold.
This dissertation applies a dynamic time threshold to session identification: it analyzes each user's average page access time and combines it with the fixed time threshold to obtain a dynamic time threshold, thereby personalizing the session time. Experimental results show that user sessions identified with this approach describe users' real navigational behavior more accurately, benefit pattern discovery, and improve WUM-based navigation recommendation.

(3) On the basis of the similarity between ant foraging behavior and user navigation behavior, we treat web users as artificial ants, use the pheromone of the ant algorithm to reflect users' interest, and propose an ant navigation model for mining users' interest navigation patterns. First, we consider the influence of factors such as page visit frequency, page visit sequence, site structure, and page visit time on mining users' navigation patterns. Second, we discuss the impact of early visitors and existing visitors on navigation patterns. We then propose a user navigation model based on ACO to discover users' preferred navigation patterns. Experimental results show that navigation recommendation based on ACO is more accurate than recommendation based on conventional algorithms, which indicates that ACO-based navigation paths reflect users' navigation preferences more accurately.

(4) We propose a hybrid methodology that combines ACO with K-means for web user clustering. First, four clustering models based on ant colony behavior are introduced. We then apply ACO to Web usage clustering based on ant colony foraging behavior. ACO is known to be insensitive to initialization and able to converge to the global optimum under certain conditions, but it converges slowly. Compared with ACO, K-means converges faster but may converge to a local optimum.
Additionally, because the initial clustering is generated randomly, the results of K-means depend on the initialization. Accordingly, this dissertation proposes a hybrid algorithm that combines ACO with K-means for the problem of web user clustering. By employing both the global search of ACO and the local search of K-means, the method, as experimental results show, greatly improves the accuracy of user navigation recommendation compared with K-means alone.

The above research contributes to the theoretical development of ACO and WUM, and offers an important practical approach to improving the effectiveness of customer services such as user navigation recommendation.
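
The adaptive time threshold described in item (2) can be illustrated with a minimal Python sketch. The abstract does not state how the per-user average page access time is combined with the fixed threshold, so the rule used here (the smaller of the fixed threshold and a multiple of the user's average gap between requests) is only an assumption for illustration, not the dissertation's formula. The names `adaptive_threshold`, `identify_sessions`, `FIXED_TIMEOUT`, and `factor`, as well as the 30-minute value, are likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumed fixed threshold (30 minutes, in seconds); not taken from the abstract.
FIXED_TIMEOUT = 30 * 60


def adaptive_threshold(gaps, fixed=FIXED_TIMEOUT, factor=3.0):
    """Per-user threshold under an ASSUMED combination rule.

    Gaps longer than the fixed threshold are ignored when averaging,
    and the result is capped at the fixed threshold.
    """
    typical = [g for g in gaps if g <= fixed]
    if not typical:
        return fixed  # nothing to learn from: fall back to the fixed value
    return min(fixed, factor * mean(typical))


def identify_sessions(log, fixed=FIXED_TIMEOUT, factor=3.0):
    """Split each user's requests into sessions.

    log: iterable of (user_id, timestamp_in_seconds, url) tuples.
    Returns {user_id: [[url, ...], ...]}.
    """
    by_user = defaultdict(list)
    for user, ts, url in log:
        by_user[user].append((ts, url))

    sessions = {}
    for user, hits in by_user.items():
        hits.sort()  # order the user's requests by timestamp
        gaps = [b[0] - a[0] for a, b in zip(hits, hits[1:])]
        limit = adaptive_threshold(gaps, fixed, factor)
        user_sessions = [[hits[0][1]]]
        for gap, (_, url) in zip(gaps, hits[1:]):
            if gap > limit:
                user_sessions.append([url])    # pause too long: new session
            else:
                user_sessions[-1].append(url)  # same session continues
        sessions[user] = user_sessions
    return sessions


log = [
    ("u1", 0, "/a"), ("u1", 20, "/b"), ("u1", 50, "/c"),
    ("u1", 600, "/d"), ("u1", 620, "/e"),
    ("u2", 0, "/a"), ("u2", 300, "/b"), ("u2", 700, "/c"),
]
print(identify_sessions(log))
```

In this toy log, a fixed 30-minute threshold alone would place all of u1's requests in one session. Under the assumed rule, the 550-second pause of the fast-browsing user u1 starts a new session, while the 400-second pause of the slower user u2 does not.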
Keywords/Search Tags: Web navigation recommendation, Web Usage Mining, Ant Colony Optimization, Convergence, Session identification, Users' interest navigation patterns, Web user clustering