| With the widespread use of the Internet for electronic commerce, information sharing, and online services, the web has become an effective platform on which people communicate and process information. As the volume of information on the network keeps growing, understanding user behavior and discovering personalized information effectively has become a challenge for web designers. Web mining techniques have emerged in response, and web log mining is an important part of this research field. This paper studies web log preprocessing and web log mining methods. First, it optimizes the traditional user identification method used in data preprocessing. Next, it proposes a transaction identification method that combines three aspects of user behavior: browsing time, browsing rate, and page browsing behavior. Finally, because most traditional clustering methods perform poorly on complex data, waste effort on invalid searches, converge slowly, and easily fall into local extrema, this paper proposes a new combined ant colony algorithm designed for the structure of user access features, which introduces information entropy into the LF algorithm. 
It then uses the results of the k-means method as the initial cluster centers of the ant clustering algorithm, which reduces the number of parameters, effectively addresses the slow early-stage convergence of the ant colony algorithm, and improves its accuracy. Experiments show that the proposed method can identify sessions in which users spend a long time but visit only a few pages, and that it also removes redundant URL page data according to users' degree of interest, providing a higher-quality data source for web log mining. By comparison, the improved ant clustering algorithm, which introduces information entropy, converges quickly and is efficient and accurate on complex data, so the system can effectively identify users with similar page-viewing behavior and provide methods and strategies for personalized recommendation and dynamic website structure. |
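
To illustrate how information entropy can guide an ant's drop decision in an LF-style clustering pass, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not specify. It assumes each user is summarized by a single categorical label (a hypothetical "dominant page category"), and that an ant drops the user it carries only when doing so does not increase the Shannon entropy of the local neighborhood. The function names and the `tolerance` parameter are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of a list of categorical labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum p * log2(1/p) over the observed label frequencies.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_change_if_added(neighborhood, item):
    """Entropy of the neighborhood after adding `item`, minus entropy before."""
    return shannon_entropy(neighborhood + [item]) - shannon_entropy(neighborhood)


def should_drop(neighborhood, item, tolerance=0.0):
    """Hypothetical drop rule: drop only if local entropy rises by at most `tolerance`."""
    return entropy_change_if_added(neighborhood, item) <= tolerance


# Example: a neighborhood of users who mostly browse "news" pages.
neighborhood = ["news", "news", "news"]
print(should_drop(neighborhood, "news"))   # similar user: entropy unchanged
print(should_drop(neighborhood, "sport"))  # dissimilar user: entropy rises
```

A rule of this kind needs no similarity-scaling coefficient, which fits the abstract's remark that the improved algorithm reduces the number of parameters. A real implementation would compute entropy over full access-feature vectors rather than a single label.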