| Web usage mining focuses on web usage data, which record visitors' access information for a web site. By analyzing the web logs, we can learn the browsing behavior and visiting habits of these visitors. The results can be used to cache web pages in advance, to reorganize pages, and to optimize the structure of the web site. This dissertation first transforms the web usage data from raw text into a structured format. Then, based on the data of an example web site, a clickstream warehouse model is built and the data are imported into the database. We analyze the data with the tools provided by SQL Server 2005 Analysis Services and, based on the conclusions of the analysis, offer the web administrator suggestions on the site structure. The main contents of this dissertation are as follows: 1. It summarizes the relevant knowledge and technology of data mining and web usage mining, and explains the significance, the current state of research, and the open problems of web usage mining; 2. It discusses the three phases of web usage mining: data preprocessing, pattern discovery, and pattern analysis. The application fields and research directions of web usage mining are also analyzed; 3. It presents effective algorithms for data cleaning, user identification, and session identification, and optimizes the session identification algorithm. After preprocessing, the web usage data are suitable for the data warehouse and for data mining; 4. The preprocessed data are analyzed with Deep Log Analyzer to obtain statistics and charts, and the necessity of building a clickstream warehouse is then explained. 
Based on the standard clickstream warehouse model presented by Mark Sweiger, a customized logical model is proposed, followed by the physical model. Finally, the web site data are loaded into the corresponding tables, which serve as the data source for data mining; 5. The algorithms and the model described above are applied to the data collected from the example web site. The data are then analyzed with the sequence clustering model in SQL Server 2005. Conclusions are drawn from the data views in the analyzer and applied to the optimization of the web site. |
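
The preprocessing phase named in item 3 (data cleaning, user identification, session identification) can be illustrated with a minimal sketch. It is not the dissertation's optimized algorithm, which the abstract does not specify. It assumes three things that do not come from the source: users are identified by the pair of IP address and user agent, a session ends after 30 minutes of inactivity, and requests for embedded resources or with a non-200 status are discarded. The record fields and the names `clean`, `sessionize`, `SESSION_TIMEOUT`, and `IGNORED_SUFFIXES` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity gap ends a session (a common heuristic,
# not a value taken from the dissertation).
SESSION_TIMEOUT = timedelta(minutes=30)

# Assumption: requests for these embedded resources are noise for usage mining.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def clean(records):
    """Data cleaning: keep successful page requests only."""
    return [
        r for r in records
        if r["status"] == 200 and not r["url"].lower().endswith(IGNORED_SUFFIXES)
    ]


def sessionize(records, timeout=SESSION_TIMEOUT):
    """User and session identification.

    Users are keyed by (ip, agent); each user's requests are sorted by time
    and split wherever the gap between consecutive requests exceeds `timeout`.
    Returns a list of (user_key, [records]) tuples.
    """
    by_user = defaultdict(list)
    for r in records:
        by_user[(r["ip"], r["agent"])].append(r)

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda r: r["time"])
        current = [hits[0]]
        for prev, cur in zip(hits, hits[1:]):
            if cur["time"] - prev["time"] > timeout:
                sessions.append((user, current))
                current = []
            current.append(cur)
        sessions.append((user, current))
    return sessions


# Invented sample records, for illustration only.
log = [
    {"ip": "10.0.0.1", "agent": "A", "time": datetime(2007, 1, 1, 9, 0, 0), "url": "/index.html", "status": 200},
    {"ip": "10.0.0.1", "agent": "A", "time": datetime(2007, 1, 1, 9, 0, 1), "url": "/logo.gif", "status": 200},
    {"ip": "10.0.0.1", "agent": "A", "time": datetime(2007, 1, 1, 9, 5, 0), "url": "/news.html", "status": 200},
    {"ip": "10.0.0.1", "agent": "A", "time": datetime(2007, 1, 1, 10, 0, 0), "url": "/index.html", "status": 200},
    {"ip": "10.0.0.2", "agent": "B", "time": datetime(2007, 1, 1, 9, 2, 0), "url": "/missing.html", "status": 404},
    {"ip": "10.0.0.2", "agent": "B", "time": datetime(2007, 1, 1, 9, 3, 0), "url": "/index.html", "status": 200},
]

sessions = sessionize(clean(log))
```

In this sample the image request and the failed request are removed during cleaning, and the first user's 55-minute gap splits the visit into two sessions. Each resulting session is the kind of ordered page sequence that could then be loaded into a warehouse table and passed to a sequence clustering model.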