
The Study Of Clustering Web Users Based On User's Browsing Path

Posted on: 2010-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Ma
GTID: 2178360275952292
Subject: Computer software and theory
Abstract/Summary:
With the development of networking technology, the Web is widely used for information sharing, e-commerce, and online services, and more and more people use the Internet to search for the information they need. Helping users find that information quickly and meeting their personalized needs are important problems for modern network technology. To address them, researchers have proposed Web user clustering: by grouping similar users, it reveals users' needs and interests so that they can be served better.

Web user clustering is primarily a form of Web log mining and consists of three steps: first, extracting user characteristics by preprocessing the Web log; second, calculating the similarity between users based on those characteristics; and finally, clustering the users. The first two steps directly affect the quality of the clustering. At present, a user's characteristics are typically expressed either by the user's session path or by the target pages obtained through transaction identification. Both have disadvantages: the session path is too coarse-grained, and target pages alone can hardly express the user's browsing behavior. For similarity, existing methods mainly compute the intersection of sets, and some use the average stay time. Neither approach captures the user's real interests well.

Aiming at these disadvantages, this thesis presents a new way of expressing users' characteristics: the transaction paths extracted by transaction identification. A transaction path is finer-grained than a session path and also makes up for the shortcomings of target pages, so it describes user behavior well. Based on this representation, the thesis presents a new similarity calculation method, WUSC (Web User Similarity Calculating). The method treats a user's transaction path as an ordered sequence, takes into account the relationship between the paths that users share and their entire paths, and fully integrates the browsing time of the pages on the transaction path. The browsing time of a page is calculated as the access time of the next page minus the access time of the current page.

Finally, based on the new representation and WUSC, the thesis uses the UBPC algorithm to cluster Web users by browsing path and carries out a comparison experiment. The experiment shows that the similarity calculated by the proposed method is closer to the users' true similarity and improves the effect of Web user clustering. The thesis closes with directions for future research.
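The browsing-time rule stated in the abstract (access time of the next page minus access time of the current page) can be illustrated with a minimal Python sketch. The function name `browsing_times`, the `(page, access_time)` tuple layout, and the sample path are assumptions made for illustration; they are not taken from the thesis. The rule leaves the last page of a path without a successor, so this sketch marks its browsing time as `None`. How the thesis handles that case is not stated in the abstract.

```python
from datetime import datetime


def browsing_times(visits):
    """Estimate per-page browsing time for one path.

    `visits` is a list of (page, access_time) tuples sorted by access time
    (an assumed layout, not the thesis's data format). Each page's browsing
    time is the next page's access time minus its own access time, in seconds.
    The last page has no successor, so its time is reported as None.
    """
    result = []
    # Pair every visit with the visit that follows it.
    for (page, start), (_, next_start) in zip(visits, visits[1:]):
        result.append((page, (next_start - start).total_seconds()))
    if visits:
        result.append((visits[-1][0], None))
    return result


# Hypothetical path of three page requests from one user.
path = [
    ("/index.html", datetime(2009, 3, 1, 10, 0, 0)),
    ("/products.html", datetime(2009, 3, 1, 10, 0, 40)),
    ("/cart.html", datetime(2009, 3, 1, 10, 2, 10)),
]

for page, seconds in browsing_times(path):
    print(page, seconds)
```

A similarity measure such as WUSC would take these per-page times as one of its inputs; the abstract does not give its formula, so it is not sketched here.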
Keywords: data mining, similarity, Web usage, user clustering, users' browsing path