Design of the Data Preprocessing Sub-system for a Web Usage Mining System

Posted on: 2009-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: B G Yang
Full Text: PDF
GTID: 2178360278465864
Subject: Software engineering
Abstract/Summary:
With the high-speed development of the internet and the rapid growth of the economy in China, the network economy accounts for a growing share of the country's economy. More and more investment flows into internet companies, stimulating the birth of new companies and the faster growth of existing ones, and the competition among these companies is becoming increasingly intense. How to strengthen a company's competitiveness, attract more customers and retain existing customers is a challenge for its marketing strategy. As a solution, customer-centered precision marketing and database marketing have moved up the agenda.

To run precision marketing and database marketing effectively, the thesis introduces a web usage mining system. One of the keys to constructing a web usage mining system is collecting and preprocessing the behaviour data of all customers. This forms the main part of the thesis: the design and implementation of the data preprocessing sub-system. Data preprocessing provides the data foundation for the pattern analysis and knowledge discovery that follow it. The quality of the data produced by preprocessing directly affects the quality of the knowledge obtained from data mining. Mining knowledge from a broken data source is useless and can even be harmful: garbage in, garbage out. As long as the accuracy, integrity, timeliness and validity of the data are guaranteed, the subsequent analysis can be effective. The thesis therefore focuses on describing the design and implementation of the data preprocessing sub-system for a web usage mining system.

The thesis is composed of two parts. First, after providing some background on data warehousing and data mining, it introduces the concept, branches and recent development of web mining. Then it addresses the design and implementation of the data preprocessing sub-system in detail. In particular, for the design and for each key implementation step, including data collection, the ETL process and the design of the data warehouse, the specific design philosophy and implementation methods are presented.

Innovations in the project include:
1. a complete design of a data preprocessing sub-system for web usage mining;
2. a completely new design of a data collection sub-system;
3. a complete design and implementation of the ETL process;
4. a hierarchical design of the clickstream data warehouse and its metadata management.
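
As a rough illustration of the kind of cleaning an ETL step for web usage data might perform, the sketch below parses web server log lines, discards failed and static-resource requests, and groups the remaining page views into per-visitor sessions. It is not taken from the thesis: the Common Log Format input, the static-file suffix list, the 30-minute session timeout, the use of the client host as the visitor key, and all function names are assumptions chosen for the example.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Common Log Format,
#   host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Assumed list of suffixes that mark embedded resources rather than page views.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")

# Assumed inactivity gap after which a new session starts.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Extract: turn one raw log line into a record, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "path": match.group("path"),
        "status": int(match.group("status")),
    }


def clean(records):
    """Transform: keep successful page requests, drop noise."""
    return [
        r for r in records
        if r is not None
        and 200 <= r["status"] < 300
        and not r["path"].lower().endswith(STATIC_SUFFIXES)
    ]


def sessionize(records):
    """Group records by host into sessions split on the inactivity timeout."""
    sessions = []    # every session, in order of its first request
    current = {}     # host -> the session list currently open for that host
    last_time = {}   # host -> timestamp of that host's previous request
    for r in sorted(records, key=lambda rec: rec["time"]):
        host = r["host"]
        if host not in current or r["time"] - last_time[host] > SESSION_TIMEOUT:
            current[host] = []
            sessions.append(current[host])
        current[host].append(r)
        last_time[host] = r["time"]
    return sessions
```

In a real pipeline the resulting sessions would then be loaded into the clickstream data warehouse. Visitor identification in particular is usually more involved than a host lookup, for example using cookies or login identifiers.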
Keywords/Search Tags: Data Mining, Web Usage Mining, Data Warehouse, Web Log, Clickstream