Research on a Time Online Evaluation Method Based on Distributed Clustering

Posted on: 2017-05-12  Degree: Master  Type: Thesis
Country: China  Candidate: K Chen
GTID: 2348330566956715  Subject: Software engineering
Abstract/Summary:
Internet addiction has significant social, psychological, and occupational effects. Spending a large amount of time online is an important feature of the addiction, so it is important to evaluate time spent online objectively by technical means. Few technical methods exist for calculating time spent online. Typically, the login time and logout time are taken from the network accounting system, and the absolute value of their difference is treated as the online period. This is inaccurate, because a user may browse nothing between login and logout. Existing calculation algorithms therefore cannot accurately estimate the time spent online. To solve these problems, this paper presents a density clustering model based on network logs. For each account, the timestamps of the requests logged at the network egress are clustered at an appropriate granularity. Each cluster can be regarded as a period of time, and the sum of these periods is the total time that account spent online. In addition, the content that users were browsing can be mined from the details of the network log. The model adopts DBSCAN, a density-based clustering algorithm, and Spark, a fast and general engine for large-scale data processing. The experiments show that the model has good credibility on a college network, and that distributed computing significantly improves the efficiency of the model's calculations. The accuracy problem of time online evaluation is thus well addressed. One remaining problem is that the clustering parameters are not optimized automatically, which needs further research.
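The idea of clustering request timestamps per account and summing the resulting periods can be illustrated with a minimal single-machine sketch. It is not the thesis implementation: it is a simplified one-dimensional density clustering in the spirit of DBSCAN, written in plain Python, and it leaves out Spark entirely. The function names `online_periods` and `total_online_seconds`, the `(account, timestamp)` row layout, and the default values of `eps` and `min_pts` are all assumptions made for illustration; the abstract does not state the parameters or log format that were actually used.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def online_periods(timestamps, eps, min_pts):
    """Cluster request timestamps (seconds) into (start, end) periods.

    Hypothetical helper: a simplified 1-D density clustering.
    eps and min_pts play the roles of the usual DBSCAN parameters.
    """
    ts = sorted(timestamps)
    # A core point has at least min_pts points (itself included)
    # within eps seconds on either side.
    is_core = [
        bisect_right(ts, t + eps) - bisect_left(ts, t - eps) >= min_pts
        for t in ts
    ]
    # Chain consecutive core points whose gap is at most eps.
    clusters = []
    for t, core in zip(ts, is_core):
        if not core:
            continue
        if clusters and t - clusters[-1][-1] <= eps:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    periods = [[c[0], c[-1]] for c in clusters]
    # Attach each non-core point to the first cluster that has a core
    # point within eps of it; points with no such cluster are noise.
    for t, core in zip(ts, is_core):
        if core:
            continue
        for k, c in enumerate(clusters):
            j = bisect_left(c, t)
            neighbours = c[max(j - 1, 0):j + 1]
            if any(abs(t - x) <= eps for x in neighbours):
                periods[k][0] = min(periods[k][0], t)
                periods[k][1] = max(periods[k][1], t)
                break
    return [(start, end) for start, end in periods]


def total_online_seconds(rows, eps=60, min_pts=3):
    """Sum the clustered periods per account.

    rows is an iterable of (account, timestamp) pairs; this layout
    is an assumption, not the thesis's log schema.
    """
    by_account = defaultdict(list)
    for account, stamp in rows:
        by_account[account].append(stamp)
    return {
        account: sum(end - start
                     for start, end in online_periods(stamps, eps, min_pts))
        for account, stamps in by_account.items()
    }
```

For an account with requests at 0, 10, 20, 30, 1000, 1010, 1020, and 5000 seconds, with `eps=60` and `min_pts=3`, the sketch yields the periods (0, 30) and (1000, 1020) and treats the isolated request at 5000 as noise, so the total is 50 seconds rather than the 5000 seconds a login-to-logout subtraction would report. Because each account is processed independently, the grouping step is the natural place to distribute the work across a cluster.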
Keywords/Search Tags: time online evaluation, density-based clustering, distributed computing, Spark