
Research And Application Of Web Data Mining Based On Click-stream

Posted on: 2012-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Xu
Full Text: PDF
GTID: 2218330338471006
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, the network has become a window on the world. The WWW (World Wide Web) provides people with a wealth of information and also retains a large amount of information about how users access it. How to extract valuable information and knowledge from this wealth of data is the subject of our study: Web data mining.

Web data mining is the application of data mining in the Web environment. It analyzes the content of documents, the usage of available resources, and the relationships among resources to find patterns and rules that are effective, novel, potentially valuable, and ultimately understandable. According to the Web objects studied, Web data mining is divided into Web content mining, Web structure mining, and Web log mining.

Web log mining is an important research topic in Web data mining; its object of study is Web logs. The mining results can be used to provide personalized services, optimize websites, improve system performance, classify users for e-commerce sites, and provide decision support for management.

A click-stream is the trail of clicks that a visitor leaves in the Web server's log files while browsing the network. It faithfully records a variety of customer behavior: for example, every site and page the user visited, how long the user stayed on each site or page, and the sequence in which pages were browsed. The click-stream collected during a session can be used to track the links the user has visited, including the source site, the path through the site, and the target site eventually reached. It provides valuable information for analyzing and studying users' interests.

As competition for students grows increasingly tense, how colleges can better promote their own network platforms and attract the attention of the majority of candidates has become a focus of current college admissions work.
Decision-makers responsible for enrollment should know which regions generate the most visits and which information on the website receives the most clicks. The web designer should know where the website's performance bottlenecks and security holes are, what kind of information attracts more users, and so on. Therefore, how to obtain information about potential users quickly and accurately from the mass of click-stream data has become a focus of Web log mining technology.

A click-stream data warehouse is an event-based type of data warehouse whose main data source is the click-stream data of a website. The purpose of building it is to analyze potentially useful information about the access behavior of Web users in order to provide decision support for website operators. This requires collecting, organizing, and converting the click-stream data and establishing a variety of dimensions on the Web click information in combination with data mining techniques.

SQL Server 2005 is Microsoft's next-generation data management and business intelligence platform. For business intelligence, SQL Server 2005 provides three major services: SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), and SQL Server Reporting Services (SSRS).

ETL is the process of extracting, transforming, and loading data from operational systems. Its purpose is to take data that is scattered across an enterprise's heterogeneous data sources, or that follows inconsistent standards, and extract, clean, transform, and load it into the data warehouse.

This thesis, "Research and Application of Web Log Mining Based on Click-stream", studies the theory and methods of Web log mining. The object of study is the Web log data of the Anhui Vocational College of Defense Technology website system; after the data was preprocessed, a click-stream data warehouse was created from the actual data of the admissions website. Its purpose is to obtain information about potential users by analyzing the Web logs.
The study also provides decision support to help college leaders promote their college efficiently through the website, analyzes where students come from, and offers suggestions to the administrator regarding the structure of the website.
(1) The click-stream data sources are preprocessed using a .NET language, including log filtering, user identification, and session identification, to provide reliable data preparation for the click-stream data warehouse. For user identification, the thesis weighs the advantages and disadvantages of various methods based on Agent, Session, and IP address.
(2) Data mining has many methods and tools. SSIS is a new component of SQL Server 2005 that provides the functionality and performance required by enterprise-class data integration applications, along with the advantage of a visual debugger. In this thesis, data extraction, transformation, and loading are performed with SSIS (an ETL tool), and the results are applied to the Anhui Vocational College of Defense Technology's admissions website and to the technical analysis that supports its applications. In processing the basic dimensions, the "youdao" domain analysis interface is used to resolve IP addresses to the regional dimension of the map.
(3) Cubes are created with Analysis Services 2005 and deployed in an Analysis Services 2005 database. Finally, the BI front-end display is completed.
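The preprocessing steps named in item (1), log filtering, user identification, and session identification, can be illustrated with a minimal sketch. This is not the thesis's .NET implementation. It assumes Apache "combined" log lines, identifies a user by the pair (IP address, user agent), and closes a session after 30 minutes of inactivity. The log format, the static-file suffix list, the timeout, and all function names are assumptions made here for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Apache "combined" log lines (the thesis does not state its log format).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
STATIC_SUFFIXES = (".css", ".js", ".gif", ".jpg", ".png", ".ico")  # assumed list
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed heuristic, not taken from the thesis


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return rec


def keep(rec):
    """Log filtering: keep successful GET page requests, drop static resources."""
    path = rec["url"].split("?", 1)[0].lower()
    return (rec["method"] == "GET"
            and rec["status"].startswith("2")
            and not path.endswith(STATIC_SUFFIXES))


def sessionize(lines):
    """Group filtered requests into sessions per (IP, user agent) pair."""
    records = [r for r in map(parse_line, lines) if r is not None and keep(r)]
    records.sort(key=lambda r: r["ts"])
    sessions = []       # each session: user, ordered pages, start and end time
    latest = {}         # user -> index of that user's most recent session
    for rec in records:
        user = (rec["ip"], rec["agent"])  # user identification
        idx = latest.get(user)
        # Session identification: start a new session after a long gap.
        if idx is None or rec["ts"] - sessions[idx]["end"] > SESSION_TIMEOUT:
            sessions.append({"user": user, "pages": [],
                             "start": rec["ts"], "end": rec["ts"]})
            idx = len(sessions) - 1
            latest[user] = idx
        sessions[idx]["pages"].append(rec["url"])
        sessions[idx]["end"] = rec["ts"]
    return sessions


# Made-up sample lines for illustration only.
SAMPLE = [
    '192.0.2.1 - - [18/Apr/2012:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [18/Apr/2012:10:00:01 +0800] "GET /style.css HTTP/1.1" 200 90 "/index.html" "Mozilla/5.0"',
    '192.0.2.1 - - [18/Apr/2012:10:05:00 +0800] "GET /admissions.html HTTP/1.1" 200 700 "/index.html" "Mozilla/5.0"',
    '192.0.2.1 - - [18/Apr/2012:11:00:00 +0800] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [18/Apr/2012:10:06:00 +0800] "GET /index.html HTTP/1.1" 200 512 "-" "Opera/9.80"',
]

for s in sessionize(SAMPLE):
    print(s["user"], s["pages"])
```

On the made-up sample, the stylesheet request is filtered out, the two user agents behind the same IP address are treated as different users, and the 55-minute gap splits the first user's requests into two sessions, giving three sessions in total. Keying on the IP address alone would merge users behind a shared proxy, which is one reason to weigh Agent- and Session-based identification as well.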
Keywords/Search Tags: web log mining, click-stream, click-stream data warehouse, Business Intelligence