With the growth of the Internet, more and more companies have established web sites to conduct e-business, and the operation of these sites has accumulated a large volume of clickstream data. Because these data differ in definition and format, they have so far been used mainly to analyze the condition of web servers and have not yet yielded strategic information for users. How can strategic information be extracted from these data to support sound, well-founded decisions, and how can clickstream data be translated into such information effectively? These problems urgently need to be solved. A Clickstream Data Warehouse, on which web sites can run data analysis and data mining over their clickstream data, is a solution to them. This research first introduces the basic concepts of the Data Warehouse, in particular its architecture, data organization structure and data model. On that basis, it examines e-business, the Clickstream Data Warehouse and the relationship between them. The author analyzes in depth the web logs that serve as the data source of the Clickstream Data Warehouse, compares the three web log formats CLF, ECLF and ExLF, and finally chooses ExLF. The thesis also discusses how an enterprise can extend its web log files to retain more useful clickstream data. After comparing the Corporate Information Factory (CIF) and Multidimensional Architecture (MD) structures, the author decides on the architecture of the Data Warehouse. To support analysis at multiple granularities and multiple levels, the author analyzes and designs the logical data model of the Data Warehouse, and then carries out the detailed logical and physical design of the corresponding dimension tables and fact tables. The ETL process of the data warehouse is then introduced.
Because the ETL process is complex and web logs have special characteristics, this study describes in detail how web log data are converted and loaded into the data warehouse, and finally uses SSIS to carry out the conversion from data source to data warehouse. It then discusses OLAP data processing methods and the implementation and analysis of the basic OLAP operations, and introduces SSAS and the ProClarity software. Finally, SSAS is used to create multidimensional data sets and perform the related analytical operations, and ProClarity is used to access the multidimensional data sets and present the analysis results to users visually. This study successfully integrates clickstream data into a data warehouse and obtains some strategic information from these data. The resulting system can serve as a template for other e-business web sites that want to build a clickstream data warehouse, and is therefore of some significance.