
Web Log Mining And Its Application Based On Sequential Pattern

Posted on: 2015-05-28, Degree: Master, Type: Thesis
Country: China, Candidate: H Q Wang, Full Text: PDF
GTID: 2298330452460278, Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and information technology and the rapid expansion of online information resources, people are unable to select and assimilate complex information effectively and end up drowning in a sea of information, a phenomenon known as information overload. At present, people use search engines to retrieve information resources on the Web, but the results are not intelligent or user-friendly enough and do not take the user's interests and hobbies into account. This is the phenomenon of being "data rich but knowledge poor".

Web mining is the application of data mining to the Web, and acquiring knowledge from information resources on the Internet has become a focus of computer science. Web log mining, also called Web usage mining, is one of the most important branches of Web mining. Applying sequential pattern mining to Web logs makes it easy to extract users' access patterns from the Web server log file. This helps improve Web design, provides decision support for site management, and gives users a better experience.

This paper expounds the processes of Web data mining, data mining, sequential pattern mining, and Web log mining. Because the original Web logs contain a lot of noisy data that would degrade the quality of the mining results, the logs are preprocessed first: tools such as Apache Log Viewer and Microsoft Visual Studio 2005, together with data preprocessing functions, are used to clean the logs and identify sessions. This provides the data source for building the data mining models. The paper then uses Microsoft SQL Server Analysis Services (SSAS), a business intelligence data mining tool, as the experimental platform and the Microsoft Sequence Clustering algorithm as the mining algorithm to mine the preprocessed data. It presents a front-end display of the results, showing the pages users access most frequently and the user access paths derived from the sequential patterns.

Finally, the paper analyses the mining results, proposes four suggestions for improving the website, and applies them to the construction of a hospital's website. Practice shows that the website's average traffic and page views rose and that the user experience improved noticeably.
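The preprocessing and path-counting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which relied on Apache Log Viewer, Visual Studio 2005, and the Microsoft Sequence Clustering algorithm in SSAS. The sketch assumes logs in Common Log Format, treats image, style, and script requests as noise, identifies users by host address, and splits sessions on a 30-minute gap. All of these choices, as well as the sample page names, are assumptions made for illustration. Simple counts of pages and consecutive page pairs stand in for the sequence clustering step.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "METHOD path PROTO" status size
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: these suffixes mark embedded resources, not page views.
NOISE_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(lines):
    """Yield (host, time, path) for successful GET requests to pages."""
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # malformed line
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # non-GET or failed request
        path = m["path"].split("?", 1)[0]
        if path.lower().endswith(NOISE_SUFFIXES):
            continue  # embedded resource
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        yield m["host"], when, path


def sessions(records):
    """Group cleaned records into per-host page sequences split by timeout."""
    by_host = {}
    for host, when, path in sorted(records):
        by_host.setdefault(host, []).append((when, path))
    result = []
    for visits in by_host.values():
        current = [visits[0][1]]
        for (prev_time, _), (when, path) in zip(visits, visits[1:]):
            if when - prev_time > SESSION_TIMEOUT:
                result.append(current)
                current = []
            current.append(path)
        result.append(current)
    return result


def frequent_pages(sess):
    """Count how often each page appears across all sessions."""
    return Counter(page for s in sess for page in s)


def frequent_transitions(sess):
    """Count consecutive page pairs, a simple form of access path."""
    return Counter(pair for s in sess for pair in zip(s, s[1:]))


# Hypothetical sample log lines, invented for this example.
SAMPLE = [
    '10.0.0.1 - - [28/May/2015:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [28/May/2015:10:00:01 +0800] "GET /logo.png HTTP/1.1" 200 512',
    '10.0.0.1 - - [28/May/2015:10:01:00 +0800] "GET /doctors.html HTTP/1.1" 200 2048',
    '10.0.0.1 - - [28/May/2015:11:30:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [28/May/2015:10:05:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [28/May/2015:10:06:00 +0800] "GET /doctors.html HTTP/1.1" 200 2048',
    '10.0.0.2 - - [28/May/2015:10:07:00 +0800] "GET /missing.html HTTP/1.1" 404 0',
]

found_sessions = sessions(clean(SAMPLE))
page_counts = frequent_pages(found_sessions)
path_counts = frequent_transitions(found_sessions)
```

Calling `page_counts.most_common()` and `path_counts.most_common()` ranks pages and page-to-page transitions by frequency, which roughly corresponds to the "most frequently accessed pages" and "user access path" outputs mentioned in the abstract. Identifying users by host alone is a simplification, since proxies and shared addresses can merge distinct visitors.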
Keywords/Search Tags: data mining, web log mining, sequential pattern mining, SSAS, BI