Web Site Logical Structure Generating System Based On Web Mining

Posted on: 2007-12-05  Degree: Master  Type: Thesis
Country: China  Candidate: J L Zheng  Full Text: PDF
GTID: 2178360185474719  Subject: Computer software and theory
Abstract/Summary:
As individual websites grow larger and their hyperlink structures become more complicated, traditional web analysis mechanisms, which are based on single web pages or on pure hyperlink structure analysis, can no longer function satisfactorily. To address the problems of knowledge expression, implementation, and acquisition in web analysis, a new web model must reflect the logical relationships not only of single web pages but also of the whole website. Based on a logical structure model of a website, we propose three web mining algorithms and implement a complete system that automatically generates a website's logical structure. Comparative experiments show that the website logical structure mining algorithms achieve higher precision, stability, and flexibility. The main work of this paper is as follows:
(1) A logical website structure model is proposed, based on website logical domains and their entrypaths.
(2) To mine a website's logical structure, three algorithms with different theoretical backgrounds are proposed: a website logical domain mining algorithm based on web page clustering, a website logical domain mining algorithm based on the logical domain core, and a logical domain entrypath mining algorithm based on heuristic rules.
(3) Based on the three algorithms, a system is implemented to automatically generate a website's logical structure. Given the URL of a website's entry page, the system automatically fetches a specified number of web pages online. It then generates a directed graph of the fetched web pages and an information library storing each web page's coordinate information. Based on these two data structures, the system launches the web mining algorithms and finally generates the logical structure of the whole website.
(4) With the system's usability in mind, much of the research work focused on the algorithms' performance. The time complexity of the most time-consuming algorithm, the website logical domain mining algorithm based on the logical domain core, was reduced from O(n^3) to k*O(n^2), where k is a constant.
(5) In the experiment section, by comparative experiments between the two website logical domain mining algorithms and with the web logical domain mining algorithm...
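The fetching step in (3), which starts from an entry page, retrieves a bounded number of pages, and builds a directed graph of them, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the breadth-first visiting order, the function name `build_site_graph`, the `fetch_links` callback, and the `TOY_SITE` data are all hypothetical, and the abstract does not say how pages are selected or how the graph is stored.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def build_site_graph(
    entry_url: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_pages: int,
) -> Dict[str, List[str]]:
    """Crawl breadth-first from entry_url, fetching at most max_pages pages.

    fetch_links is a hypothetical callback that returns the URLs a page
    links to. The result is a directed graph as an adjacency list that
    contains only edges between pages that were actually fetched.
    """
    if max_pages <= 0:
        return {}

    admitted = {entry_url}           # pages scheduled for fetching
    queue = deque([entry_url])
    raw_links: Dict[str, List[str]] = {}

    while queue:
        url = queue.popleft()
        # Drop duplicate links while keeping their first-seen order.
        links = list(dict.fromkeys(fetch_links(url)))
        raw_links[url] = links
        for target in links:
            if target not in admitted and len(admitted) < max_pages:
                admitted.add(target)
                queue.append(target)

    # Keep only edges whose target is also a fetched page.
    return {
        url: [t for t in links if t in admitted]
        for url, links in raw_links.items()
    }


# Hypothetical in-memory site used in place of real HTTP requests.
TOY_SITE = {
    "/": ["/a", "/b"],
    "/a": ["/", "/c"],
    "/b": ["/c"],
    "/c": [],
}

graph = build_site_graph("/", lambda u: TOY_SITE.get(u, []), max_pages=3)
```

With `max_pages=3` only the entry page and its two direct children are fetched, so links to `/c` are dropped from the graph. A real crawler would replace the callback with HTTP fetching and link extraction restricted to the site's own host.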
Keywords/Search Tags: website structure, webpage cluster, website logical domain, entrypath