The Session Identification Method in Web Usage Mining

Posted on: 2013-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: D Yu
Full Text: PDF
GTID: 2248330377960833
Subject: Management Science and Engineering
Abstract/Summary:
In recent years, with the rapid development of the internet and especially of electronic commerce, the information on the web has grown explosively, and the network has gradually begun to change people's habits and working methods. This also provides businesses with new markets and new market strategies. Web usage mining mainly analyzes web server logs to study the browsing behavior of web users and to discover useful information and hidden user patterns. These findings can then guide e-commerce activities such as web design, personalized service, and business decisions, improving their effectiveness and relevance. Session identification is not only a difficult problem in web usage mining but also the foundation and key step of user access behavior analysis; its quality has a decisive influence on the accuracy of web usage mining. In view of the limitations of current session identification methods, this thesis studies session identification in depth from two perspectives: clustering and the characteristics of user access behavior.

On the one hand, to address the shortcomings of earlier session division methods based on heuristic algorithms, the thesis builds an optimization model and divides sessions from a clustering perspective. An improved K-means clustering algorithm is used to cluster sessions: the initial cluster centers and the value of K are determined statistically, and the algorithm is then further improved, using the time-sequence property of the log records, to raise its accuracy.

On the other hand, the thesis adopts a merge-and-split identification algorithm based on the characteristics of user access behavior. Sessions are first merged following the basic idea of hierarchical clustering, and a second identification pass then applies the necessary splits, thereby increasing the accuracy of session identification.

The research on session identification algorithms in this thesis has practical significance. First, applying clustering algorithms to the field of session identification offers a useful reference and, to some extent, enriches the application domain of clustering algorithms. Second, it supports applied research on web access behavior analysis, personalized recommendation, website structure optimization, and related areas.
Keywords/Search Tags: web usage mining, session identification, K-means algorithm, hierarchical clustering algorithm
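The abstract does not give the thesis's optimization model or its improved K-means procedure, so the following is only a minimal sketch of the general idea of dividing sessions by clustering instead of by a fixed timeout. It assumes that one user's request timestamps (in seconds) are already sorted, and it runs a plain 1-D K-means with K = 2 on the gaps between consecutive requests to separate "within-session" gaps from "between-session" gaps. The function names `two_means_1d` and `split_sessions`, the choice of K = 2, and the min/max initialization are all illustrative assumptions, not the author's method.

```python
def two_means_1d(values, iters=50):
    """Plain 1-D K-means with k=2 (illustrative, not the thesis's variant).

    Centers start at the smallest and largest value; returns (low, high).
    """
    lo, hi = min(values), max(values)
    for _ in range(iters):
        # Assign each value to the nearer center (ties go to the low cluster).
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:
            break  # all values fall in one cluster
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # converged
        lo, hi = new_lo, new_hi
    return lo, hi


def split_sessions(timestamps):
    """Split one user's sorted request times into sessions.

    A new session starts wherever the gap to the previous request is
    closer to the "long gap" center than to the "short gap" center.
    """
    if len(timestamps) < 2:
        return [list(timestamps)]
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    lo, hi = two_means_1d(gaps)
    if lo == hi:
        return [list(timestamps)]  # gaps are indistinguishable: one session
    threshold = (lo + hi) / 2
    sessions = [[timestamps[0]]]
    for t, gap in zip(timestamps[1:], gaps):
        if gap > threshold:
            sessions.append([t])  # long gap: start a new session
        else:
            sessions[-1].append(t)  # short gap: same session continues
    return sessions


if __name__ == "__main__":
    print(split_sessions([0, 30, 70, 4000, 4020, 4100, 9000]))
```

For the sample timestamps, the gaps of 3930 and 4900 seconds end up in the long-gap cluster, so the requests are divided into three sessions. Unlike a fixed 30-minute timeout, the cut-off here is derived from the data of each user, which is the kind of limitation of heuristic methods that the abstract refers to.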