Research On Real-time Topic Detection And Dynamic Recommendation Over Micro-blog Data Stream

Posted on: 2018-10-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Su
GTID: 1368330578471848
Subject: Computer software and theory
Abstract/Summary:
With the development of the Internet, and especially the arrival of Web 2.0, users can share and provide various types of data through websites in real time. These websites offer users abundant choices, but users have to spend more time finding the information they need. In addition, as the number of Internet users increases, social networks such as micro-blogs and forums have become carriers of information. Micro-blogs often report events before the traditional news media, because people's social activities, breaking news and hot topics spread quickly on them. This thesis presents a method for topic detection and personalized recommendation on a micro-blog stream. To ensure real-time performance and accuracy, key technologies for data streams are studied, including frequent item mining, clustering analysis and attribute reduction. The main contributions are as follows:

1) VDStream, an arbitrary-shape clustering algorithm for variable-density data streams, is proposed. It maintains clusters online using a synopsis structure. Being density-based, it can find clusters of arbitrary shape, and it is not sensitive to parameters. Moreover, it finds clusters more accurately when the density of the data stream varies or is sparse.

2) The frequent item mining algorithm CCK finds approximate high-frequency words using less memory and less time. The window size is adjusted dynamically according to the duration of the topics, so hot topics and new topics can be detected in a timely manner.

3) Conditional information entropy is used as the heuristic function, and the chi-square statistical test is used to reduce the computational complexity. An incremental attribute reduction algorithm for dynamic data sets is presented, which preserves the intermediate variables.

4) An improved collaborative filtering recommendation algorithm based on user clustering is proposed. A time decay factor is introduced to strengthen the importance of recently scored items, because a user's interest may drift over time. Moreover, three factors are considered: the heat of the topic, the interests of similar users, and the users being followed. A dynamic filtering recommendation algorithm for micro-blogs is proposed, which preserves the intermediate variables.
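
The time decay idea in contribution 4) can be illustrated with a minimal sketch. The abstract does not state the form of the decay factor, so the exponential half-life used here is an assumption, and the names `decayed_weight`, `predict_score` and `half_life_days` are hypothetical. It shows only how a neighbour-based collaborative filtering prediction can give more weight to recent scores; it is not the thesis's algorithm.

```python
def decayed_weight(age_days, half_life_days=30.0):
    """Assumed decay form: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def predict_score(neighbour_ratings, now, half_life_days=30.0):
    """Predict a score from similar users' ratings.

    neighbour_ratings: list of (similarity, rating, timestamp_in_days).
    Each rating is weighted by user similarity times the time decay,
    so recently scored items count more than old ones.
    """
    numerator = 0.0
    denominator = 0.0
    for similarity, rating, timestamp in neighbour_ratings:
        weight = similarity * decayed_weight(now - timestamp, half_life_days)
        numerator += weight * rating
        denominator += abs(weight)
    # No usable neighbours: no prediction can be made.
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    ratings = [
        (1.0, 5.0, 100.0),  # scored today
        (1.0, 1.0, 70.0),   # scored 30 days ago, so it counts half as much
    ]
    print(predict_score(ratings, now=100.0))
```

With two equally similar neighbours, the prediction is pulled towards the more recent score rather than sitting at the plain average.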
Keywords/Search Tags:data stream, cluster analysis, frequent item, topic detection, recommendation system