Research On Short-Text Message Clustering In Instant Messaging Exchange

Posted on: 2017-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: L K Cai
Full Text: PDF
GTID: 2348330566456639
Subject: Control Engineering
Abstract/Summary:
With the rapid development of Internet technology and mobile applications, people's demand for information acquisition is increasing. At the same time, it is difficult to find useful information quickly and accurately in large, noisy data sets. As the most widely and most frequently used Internet application, instant messaging (IM) generates a large amount of data with non-standard grammar and sparse keywords, and extracting useful information from these data quickly and accurately remains an open problem. First, this thesis analyzes the general process of text clustering, compares domestic and international developments in text clustering, and summarizes the features of short text in IM and the key technologies of short-text clustering. Second, to address the text similarity bias caused by sparse keywords, a dynamic text similarity calculation method based on HowNet is presented, which improves HowNet's semantic similarity calculation. Third, to address the non-standard grammar and limited information of a single message, a multifactorial dialog extraction method is presented, which groups messages into dialogs. Fourth, because the initial value k is difficult to determine in k-means, the Apriori frequent item-set mining algorithm is introduced, and the resulting frequent item-sets are used to calculate the initial cluster centers. Finally, experimental results show that the clustering method adopted in this thesis achieves some improvement in both accuracy and speed. Building on the programs written for experimental verification, a data visualization system for short text in instant messaging was designed and implemented. The system consists of a storage module, a support module, a clustering module, and a visualization module. It supports intelligent operation and displays clustering results, user information, and message information as graphs.
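The abstract does not give the details of how frequent item-sets are turned into initial k-means centers, so the following Python sketch is only one plausible reading, not the thesis's implementation. It assumes each message is already tokenized into words, mines frequent item-sets with a plain Apriori pass, and then (as an assumed heuristic) seeds one center per maximal frequent item-set, using the mean binary bag-of-words vector of the messages that contain it. The function names `apriori` and `initial_centers`, the `min_support` parameter, and the sample messages are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent item-sets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items and keep those meeting the support threshold.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-item-sets into k-item-sets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                current[c] = support
        frequent.update(current)
        k += 1
    return frequent


def initial_centers(transactions, min_support):
    """Seed one center per maximal frequent item-set (an assumed heuristic)."""
    frequent = apriori(transactions, min_support)
    # A frequent item-set is maximal if no other frequent item-set strictly contains it.
    maximal = [s for s in frequent if not any(s < other for other in frequent)]
    vocab = sorted({w for t in transactions for w in t})
    centers = []
    for itemset in sorted(maximal, key=sorted):
        members = [set(t) for t in transactions if itemset <= set(t)]
        # Center = mean binary bag-of-words vector of the supporting messages.
        centers.append([sum(w in m for m in members) / len(members) for w in vocab])
    return vocab, centers


# Hypothetical tokenized messages, for illustration only.
messages = [
    ["meeting", "tomorrow", "room"],
    ["meeting", "tomorrow"],
    ["lunch", "noodles"],
    ["lunch", "noodles", "today"],
    ["meeting", "room"],
]
vocab, centers = initial_centers(messages, min_support=2)
print(len(centers), "initial centers over a vocabulary of", len(vocab), "words")
```

In this reading, the number of seeded centers also supplies a value for k, and the centers would then be handed to an ordinary k-means loop. A real system would likely replace the binary vectors with a weighting that reflects the HowNet-based similarity described above.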
Keywords/Search Tags: Instant Messaging, Short-Text, Text Clustering