Web Logs Based Data Mining

Posted on: 2004-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: H H Wu
Full Text: PDF
GTID: 2168360095955406
Subject: Computer technology and applications
Abstract/Summary:
Data mining is both a new information technology that has emerged in recent years alongside advances in databases and artificial intelligence, and a problem that the development and use of computer science and technology, especially computer networks, urgently needs solved. Data mining is the process of extracting patterns from data so that people can ultimately recognize the real value of the data, namely the information and knowledge it contains. It enables enterprises to discover the regularities implicit in their data and provides a reliable basis for decision-making. The Web, however, holds a large amount of semi-structured data, whereas data mining depends on well-structured data. Even when relevant data can be obtained from the Web, mining and analyzing it is quite difficult, and Web-oriented mining is more complicated than mining a single data warehouse. A traditional database has a definite data model and describes its data according to that model; it also has well-defined, well-understood query languages. With the wide adoption of the WWW and the Internet, mining of data from differently structured sources has appeared, such as text mining, time-series data mining, and mining of e-commerce system data. As database technology has developed, multimedia database mining, spatial database mining, and similar areas have also attracted considerable attention.

The Internet has developed rapidly; in particular, the Web has spread worldwide and offers an incomparably rich supply of information. Through Web mining, we can extract the knowledge we need from Web pages: by analyzing the content users access in aggregate, together with their visiting behavior and frequency, we can obtain general knowledge of user behavior and patterns and use it to improve our Web services. More importantly, understanding and analyzing user characteristics can help develop e-commerce activities.

Web mining differs greatly from traditional data mining. Traditional data mining mainly targets structured data and seldom has to deal with heterogeneous, unstructured information. Web mining therefore poses great challenges, which has made it a new topic in data mining and aroused great interest.

The Web holds massive amounts of data, and how to use this data has become a research focus of database technology. Data mining finds the regularities implicit in large amounts of data. Solving data quality problems in applications, making full use of useful data, and discarding useless data are the most important applications of data mining technology.

The variety of information on the Web determines the variety of Web mining tasks. Web content mining emphasizes page classification and clustering, and one of its main directions is text mining. Web structure mining aims to reveal the useful patterns contained in structural information. Hyperlinks reflect citation relationships between documents, and the number of times a page is cited indicates its importance. A page's URL also reflects the page's type and its place in the directory structure.

Based on a discussion of the problems of Web log data mining, this paper proposes a novel adaptive model called PCWS. The model takes full advantage of existing algorithms and can adapt to different user groups to make it easier for them to visit Web pages. Finally, using the proposed model, the data preprocessing procedures are described in detail, along with user identification, session identification, sequential pattern recognition, and path mining, and some experimental results are presented.
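
The preprocessing steps named in the abstract (user recognition, session identification, and path mining) can be illustrated with a minimal sketch. This is not the PCWS model, whose details the abstract does not give. It assumes that log records have already been parsed into (ip, agent, timestamp, url) tuples, that a user can be approximated by the (ip, agent) pair, and that a 30-minute gap ends a session, which is a common heuristic rather than a value from the thesis. The function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity gap ends a session (a common heuristic,
# not a value taken from the thesis).
SESSION_TIMEOUT = timedelta(minutes=30)


def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (ip, agent, timestamp, url) records into per-user sessions.

    Users are approximated by the (ip, agent) pair. Returns a list of
    (user, urls) tuples, where urls is the visit order within one session.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        by_user[(ip, agent)].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order each user's hits by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:  # gap too long: close the session
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous URL paths of `length` seen in >= min_support sessions."""
    counts = Counter()
    for _, urls in sessions:
        # Use a set so each path is counted at most once per session.
        paths = {tuple(urls[i:i + length])
                 for i in range(len(urls) - length + 1)}
        counts.update(paths)
    return {path: n for path, n in counts.items() if n >= min_support}


def _t(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Hypothetical sample records, for illustration only.
SAMPLE = [
    ("10.0.0.1", "Mozilla", _t("2004-08-12 10:00"), "/index.html"),
    ("10.0.0.1", "Mozilla", _t("2004-08-12 10:05"), "/products.html"),
    ("10.0.0.1", "Mozilla", _t("2004-08-12 11:30"), "/index.html"),
    ("10.0.0.1", "Mozilla", _t("2004-08-12 11:32"), "/products.html"),
    ("10.0.0.2", "Opera", _t("2004-08-12 10:01"), "/index.html"),
    ("10.0.0.2", "Opera", _t("2004-08-12 10:02"), "/contact.html"),
]

if __name__ == "__main__":
    sessions = identify_sessions(SAMPLE)
    for user, urls in sessions:
        print(user, urls)
    print(frequent_paths(sessions))
```

On the sample data, the first user's hits split into two sessions because of the gap of more than 30 minutes, so the path from /index.html to /products.html is supported by two sessions. Counting each path once per session keeps a single long visit from inflating the support of a path.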
Keywords/Search Tags: data mining, web logs mining, sequential pattern recognition, adaptive web site