Analysis of User Browsing Behavior Based on a State Transition Model

Posted on: 2010-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Han
Full Text: PDF
GTID: 2189360272998404
Subject: Management Science and Engineering
Abstract/Summary:
With the spread of the Internet and the growth of e-commerce, e-commerce systems offer users more and more choices, but their structure has also become more complex. Users often get lost among the large number of goods in the information space and cannot find what they need, while businesses, unable to see users' specific needs clearly, can only recommend relatively popular products blindly. The results are often counterproductive: users lose interest in the web site and leave. This "double-blind" phenomenon has become a key constraint on the development of electronic commerce. Web data mining emerged in response to this situation; combined with e-commerce, it addresses the problem of "dialogue" between e-commerce sites and their users.
This thesis proposes a state transition model based on Web usage mining. The model is mainly used to analyze the browsing behavior of users of e-commerce sites, and the results of the analysis are used to make timely improvements to the structural design of the site, with the aim of meeting users' needs as far as possible.
Drawing on the existing literature, the thesis describes the theory of data mining and its characteristics. There are three types of Web data mining: Web content mining, Web structure mining, and Web usage mining. The thesis applies Web usage mining to the analysis of e-commerce sites in the form of Web log mining, that is, discovering e-customer behavior from Web log files and related data. Web data mining can reveal users' access interests, access frequency, and access times; the site can then adjust its dynamic page structure, improve its service, personalize the user interface, and target its e-commerce offering to better meet visitors' needs. On this basis, a state transition model based on Web log mining is proposed.
The main feature of the model is that, based on an analysis of the purpose and functions of the site, it divides an e-commerce site into a number of states with low coupling and high cohesion. Each state is a collection of pages with similar functions or content, and Web log mining is used to discover patterns in users' transitions between states.
The state transition model places high demands on the accuracy of the Web log files, so data pre-processing is particularly important. Pre-processing for the model consists mainly of the following steps: data cleaning, unique user identification, session identification, and path completion. Data cleaning removes records in the Web server logs that are irrelevant to Web log mining, reducing the volume and dimensionality of the data. Unique user identification distinguishes individual users in the cleaned log. Session identification splits each user's browsing activity by time: if the interval between two of a user's page requests exceeds a set timeout value, the user is treated as having visited the site more than once. Because of local caching and proxy servers, the recorded access path may be incomplete, so path completion is used to fill in the missing pages; this step requires a good understanding of the site's topology. Once the path has been completed, access times must also be assigned to the added pages; a simple approach is to use the midpoint between the access times of the two neighbouring recorded requests.
The state transition model describes the whole browsing process, from the moment a user enters the site, through browsing its pages, until the user leaves. E-customer behavior is analyzed by means of two matrices: the transition probability matrix and the mean holding time matrix.
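The session identification and time-assignment steps of the pre-processing described above can be sketched roughly as follows. This is only an illustration, not the thesis's implementation: the function names, the `(timestamp, page)` input format, and the 30-minute timeout are all assumptions, since the abstract does not state the timeout value that was used.

```python
from datetime import datetime, timedelta

# Hypothetical timeout; the abstract does not give the actual value.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(requests, timeout=SESSION_TIMEOUT):
    """Split one user's (timestamp, page) requests into sessions.

    A new session starts whenever the gap between two consecutive
    requests is longer than `timeout`.
    """
    sessions = []
    current = []
    previous_time = None
    for timestamp, page in sorted(requests):
        if previous_time is not None and timestamp - previous_time > timeout:
            sessions.append(current)
            current = []
        current.append((timestamp, page))
        previous_time = timestamp
    if current:
        sessions.append(current)
    return sessions


def midpoint(before, after):
    """Access time assigned to a page added by path completion:
    halfway between the two neighbouring recorded requests."""
    return before + (after - before) / 2


# Example: two requests five minutes apart, then one 85 minutes later.
log = [
    (datetime(2010, 1, 25, 9, 0), "/home"),
    (datetime(2010, 1, 25, 9, 5), "/list"),
    (datetime(2010, 1, 25, 10, 30), "/home"),
]
sessions = split_sessions(log)
```

With these inputs the third request falls outside the timeout, so it is treated as the start of a second visit. Path completion itself is not shown, because it depends on the link topology of the particular site.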
The transition probability matrix describes how users move between the states of the site during a visit, including both transitions between different states and self-transitions within a state. For the purposes of the analysis, the matrix also includes two dummy states, entry and exit, which represent the user entering and leaving the site. The two dummy states must obey the following rules:
(1) No transition can be made to the entry state from any state other than the exit state;
(2) No transition can be made from the entry state to the exit state;
(3) No transition can be made from the exit state to any state other than the entry state;
(4) A transition can be made from the exit state to itself;
(5) If a user enters the web site again, this is a transition from the exit state to the entry state;
(6) If a user stays on one page for longer than the timeout value we set, the user is considered to have made a transition to the exit state.
Because two dummy states are added, the transition probability matrix is an (n+2) × (n+2) matrix, and its element Pij is the probability that a user moves from state i to state j. Pij should have the following attributes: [formulas not included in the abstract text]
The mean holding time matrix does not consider transitions from the two dummy states (entry and exit) to other states. When a user enters the web site, the transition from the entry state to another state happens immediately, so the time the user spends in the entry state is taken to be 0.
After the user exits the web site, if there is no return visit, the user can be regarded as remaining in the exit state permanently, and this time is not counted. The element Tij of the mean holding time matrix is the mean holding time associated with the transition from state i to state j, that is, the average time spent in state j after a transition from state i to state j.
Computing and analyzing the elements of these two matrices in order to discover the interests and behavior patterns of e-customers is the main research content of this thesis.
Finally, the thesis proposes some further applications based on the state transition model, such as state-based page recommendation, improvement of the site structure, and applying a state transition model within an individual state.
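As a rough illustration of how the two matrices described in the abstract might be estimated from sessionized data, consider the sketch below. It rests on several assumptions that are not taken from the thesis: the function name `estimate_matrices`, the input format (each session is a list of `(state, dwell_seconds)` pairs with the dwell time already known), the usual normalization in which each row of P sums to 1, and the choice to count one exit-to-exit transition for a user's final session. The matrices are stored as sparse dictionaries, with missing entries read as zero, instead of a dense (n+2) × (n+2) array.

```python
from collections import defaultdict

ENTRY, EXIT = "entry", "exit"  # the two dummy states


def estimate_matrices(users):
    """Estimate transition probabilities P and mean holding times T.

    `users` holds one item per user: that user's sessions in time order.
    Each session is a non-empty list of (state, dwell_seconds) pairs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    dwell_sum = defaultdict(float)
    dwell_n = defaultdict(int)

    for sessions in users:
        for index, session in enumerate(sessions):
            previous = ENTRY
            for state, dwell in session:
                counts[previous][state] += 1
                if previous != ENTRY:
                    # Holding times are only kept for transitions
                    # that start in a real (non-dummy) state.
                    dwell_sum[(previous, state)] += dwell
                    dwell_n[(previous, state)] += 1
                previous = state
            counts[previous][EXIT] += 1
            # Rule (5): a return visit is exit -> entry.
            # Rule (4), as assumed here: otherwise exit -> exit.
            is_last = index == len(sessions) - 1
            counts[EXIT][EXIT if is_last else ENTRY] += 1

    P = {
        i: {j: c / sum(row.values()) for j, c in row.items()}
        for i, row in counts.items()
    }
    T = {pair: dwell_sum[pair] / dwell_n[pair] for pair in dwell_n}
    return P, T


# Example: one user with two visits, one user with a single visit.
users = [
    [[("home", 10), ("list", 30), ("list", 20), ("cart", 60)], [("home", 5)]],
    [[("home", 20), ("cart", 40)]],
]
P, T = estimate_matrices(users)
```

In this example `P["entry"]` contains only `home`, and `T[("list", "cart")]` is the average time spent in `cart` after arriving from `list`. Rule (6), the timeout transition to the exit state, is assumed to have been applied earlier, when the log was split into sessions.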
Keywords/Search Tags: Web log mining, state, state transition, user browsing behavior