The World Wide Web (WWW) continues to grow at an astounding rate, both in the sheer volume of traffic and in the size and complexity of Web sites. Tasks such as Web site design, Web server design, and even simply navigating a Web site have grown more complex along with it. First, this paper introduces Web mining technology on the basis of a typical data preprocessing workflow and implements a heuristic session identification algorithm based on the referrer field of logged requests. Second, it proposes an improvement to the clustering algorithm and compares the result with other algorithms; the improved algorithm is better suited to cluster analysis of large-scale databases with sparsely distributed data. Finally, it designs a prototype Web mining system, gives a brief comparative analysis of models, and applies association rules and the clustering algorithm to analyze the site structure and user visits.
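As a rough illustration of what referrer-based heuristic session identification can look like, the sketch below assigns each logged request to a session of the same user when its referrer is a page already visited in that session, and opens a new session otherwise. It is not the paper's algorithm: the `Request` fields, the user key, the 30-minute inactivity limit, and the function name `identify_sessions` are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    user: str      # assumed user key, e.g. IP address plus user agent
    time: float    # request time in seconds
    url: str       # requested page
    referrer: str  # referring page, "-" when the log has none


@dataclass
class Session:
    user: str
    requests: list = field(default_factory=list)


TIMEOUT = 30 * 60  # assumed inactivity limit (30 minutes), not from the paper


def identify_sessions(requests, timeout=TIMEOUT):
    """Group requests into sessions using a referrer heuristic."""
    sessions = []      # all sessions, in order of creation
    by_user = {}       # user -> that user's sessions
    for req in sorted(requests, key=lambda r: r.time):
        target = None
        # Try the user's most recent sessions first.
        for s in reversed(by_user.get(req.user, [])):
            if req.time - s.requests[-1].time > timeout:
                continue  # session has been idle too long
            if req.referrer in {r.url for r in s.requests}:
                target = s  # referrer was visited in this session
                break
        if target is None:
            # No session explains this request: start a new one.
            target = Session(req.user)
            sessions.append(target)
            by_user.setdefault(req.user, []).append(target)
        target.requests.append(req)
    return sessions


# Hypothetical log records, for illustration only.
log = [
    Request("u1", 0, "/a", "-"),
    Request("u1", 10, "/b", "/a"),
    Request("u1", 20, "/x", "-"),
    Request("u1", 30, "/c", "/b"),
    Request("u2", 15, "/a", "-"),
]
for s in identify_sessions(log):
    print(s.user, [r.url for r in s.requests])
```

In this sketch the request for `/x` has no referrer inside the first session, so it opens a second session for the same user, while `/c` is still attached to the first session because its referrer `/b` was visited there.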