Graph based click-stream mining for categorizing browsing activity in the World Wide Web
Posted on: 2005-06-17 | Degree: M.S. | Type: Thesis | University: The University of Texas at Arlington | Candidate: Maniam, Abhilash | Full Text: PDF | GTID: 2458390008977827 | Subject: Computer Science

Abstract/Summary:
This thesis addresses the following question: Is there an inherent structure in the way users browse the Web? In machine learning terms: using the click-streams of browsing activities, can we learn a concept that describes one Web browsing activity, such as buying a digital camera, and that differs from the concept describing another activity, such as reading technology news? SUBDUE, a graph-based data mining tool, is used for learning such discriminating concepts from Netscape 7.1 browser click-streams. We have developed a component for Netscape 7.1 which logs a user's click-stream without hampering their browsing experience. These click-stream log files are converted into directed graphs which represent the browsing activity of the user. Since a click-stream log file is semi-structured in nature, a graph is an appropriate choice for representing it. We discuss various ways of constructing the click-stream graph and methods for adding contextual information to the graph to aid SUBDUE's learning algorithm. We present results generated from synthetic click-streams and browser click-streams.

Our results demonstrate that SUBDUE is capable of learning recursive rules (which describe a structural pattern) for classifying client-side click-streams. When classifying client-side click-stream log files as "search" click-streams or "random browser" click-streams, the structural pattern is more accurate than a decision tree classifier.

Keywords/Search Tags: Click-stream, Browsing activity, Graph
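The conversion of a click-stream log into a directed graph, as described in the abstract, might look roughly like the sketch below. It is an illustration under stated assumptions, not the thesis's actual construction: the log is assumed to be an ordered list of (referrer, url) pairs, vertices are labeled by host name only, every edge carries the single label "link", and the function names `build_click_graph` and `to_text` are hypothetical. The abstract does not give the input format SUBDUE expects, so the vertex/edge text layout produced here is also an assumption.

```python
from collections import OrderedDict
from urllib.parse import urlparse


def build_click_graph(clicks):
    """Turn an ordered list of (referrer, url) clicks into a directed graph.

    Returns (vertices, edges): vertices maps a page label to an integer id,
    edges is a list of (source_id, target_id, label) triples.
    """
    vertices = OrderedDict()
    edges = []

    def vertex_id(url):
        # Assumed simplification: label each vertex by host name only.
        label = urlparse(url).netloc or url
        if label not in vertices:
            vertices[label] = len(vertices) + 1
        return vertices[label]

    for referrer, url in clicks:
        target = vertex_id(url)
        if referrer:  # typed URLs and bookmarks carry no referrer
            edges.append((vertex_id(referrer), target, "link"))
    return vertices, edges


def to_text(vertices, edges):
    """Serialize the graph as 'v id label' and 'd src dst label' lines.

    This layout is an assumed, illustrative format for a graph miner's input.
    """
    lines = [f"v {vid} {label}" for label, vid in vertices.items()]
    lines += [f"d {src} {dst} {label}" for src, dst, label in edges]
    return "\n".join(lines)


# Hypothetical click-stream: a search followed by visits to two result sites.
clicks = [
    (None, "http://search.example/?q=camera"),
    ("http://search.example/?q=camera", "http://shop.example/cameras"),
    ("http://shop.example/cameras", "http://shop.example/cameras/123"),
    ("http://search.example/?q=camera", "http://reviews.example/camera-123"),
]
vertices, edges = build_click_graph(clicks)
print(to_text(vertices, edges))
```

Labeling vertices by host keeps the graph small and makes repeated visits to one site show up as self-loops. Richer labels (page type, time spent, link text) would be one way to add the contextual information the abstract mentions.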