
Research On Semantic Web Usage Mining

Posted on: 2014-03-28    Degree: Master    Type: Thesis
Country: China    Candidate: M M Wang    Full Text: PDF
GTID: 2268330422463454    Subject: Computer software and theory
Abstract/Summary:
With the rapid development and wide application of web technology, the number of web pages on the Internet is growing exponentially. How to combine semantic knowledge with mining methods to analyze users’ behavior has therefore become an important research direction. This work uses web logs together with the content of web pages, and covers three topics: the calculation of semantic distance, the semantic mining algorithm, and the semantic similarity of the resulting sequences.

For the computation of semantic distance, we extend the existing method with the probability that a web page occurs in the analyzed log. The semantic weight of a web page is then influenced by three factors: the depth of the page in the ontology tree, the number of its child pages, and its own occurrence probability. The semantic distance is calculated from these semantic weights, and the experiments show that the improved method gives more reasonable results.

For the semantic mining algorithm, we build on the idea proposed by Mabroukeh of using semantic distance in mining, and we also take the time correlation of the log into account. We add the semantic distance calculated by the above method to the AprioriAll algorithm as improved by Wu Haiyan. In the linking step of the algorithm, pages can be linked only if the semantic distance between them is less than the maximum distance. The experiments show that the improved algorithm finds more semantically related sequences, with a smaller computation scale and a shorter execution time.

For the semantic similarity analysis of the resulting sequences, a new definition of the page ontology is given on the basis of ontology and ontology mapping theory. In this definition, web pages are organized by page preamble and keywords. The semantic similarity value is obtained from the preamble, the keywords and other ontology factors, and we use this value to validate the semantic relevance of the resulting sequences.
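The distance-constrained linking step described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes a simplified sequence join (two frequent sequences are joined when one without its first page equals the other without its last page), and it assumes that the pages compared are the last page of the first sequence and the page being appended. The names `join_with_distance`, `distance` and `max_distance` are hypothetical, and the semantic distance function is passed in rather than derived from semantic weights, since the abstract does not give the formula.

```python
from typing import Callable, List, Tuple

PageSequence = Tuple[str, ...]


def join_with_distance(
    frequent: List[PageSequence],
    distance: Callable[[str, str], float],
    max_distance: float,
) -> List[PageSequence]:
    """Build (k+1)-page candidates from frequent k-page sequences.

    A join is kept only if the semantic distance between the last page
    of the left sequence and the appended page is below max_distance
    (assumption: this is the pair of pages that the constraint checks).
    """
    candidates: List[PageSequence] = []
    for left in frequent:
        for right in frequent:
            # Simplified join condition: the sequences must overlap on
            # everything except left's first page and right's last page.
            if left[1:] != right[:-1]:
                continue
            # Semantic pruning: do not link pages that are too far apart.
            if distance(left[-1], right[-1]) >= max_distance:
                continue
            candidates.append(left + (right[-1],))
    return candidates
```

Because pruning happens before support counting, fewer candidates need to be checked against the log, which is consistent with the reduced computation scale reported in the abstract.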
Keywords/Search Tags: network page, page ontology, sequential pattern mining algorithm, semantic distance, semantic weight