Research on Telecommunication Complaint Text Clustering Based on an Improved CFSFDP Algorithm

Posted on: 2018-04-28  Degree: Master  Type: Thesis
Country: China  Candidate: T Y Zhang  Full Text: PDF
GTID: 2348330515462874  Subject: Computer Science and Technology
Abstract/Summary:
Over the past decade, the maturing of information technology has driven the rapid development of the mobile Internet and produced vast amounts of network data, most of which exists as text. Managing and analyzing this text effectively depends on text processing technology. Text clustering, an important topic within text processing, has important applications in document management, information retrieval, data mining, and related areas.

With the continuing reform and development of telecom enterprises, the diversification of telecommunication services has attracted a large number of customers. However, shortcomings in these services have also led to a growing number of user complaints. Applying text clustering to complaint texts helps telecom operators analyze the causes of complaints and devise countermeasures for the services that give rise to them, thereby improving the quality of telecommunication service and strengthening enterprise competitiveness.

In 2014, Alex Rodriguez and Alessandro Laio proposed a new density-based clustering algorithm, CFSFDP (Clustering by Fast Search and Find of Density Peaks). CFSFDP selects cluster centers using the simple product of each point's distance and density values. To address the shortcomings of this center selection strategy, this thesis proposes a weighted CFSFDP algorithm, which increases the importance of the distance value when selecting cluster centers and thereby improves the accuracy of the selected centers. Applying both the weighted CFSFDP algorithm and the original CFSFDP algorithm to telecom complaint texts demonstrates the validity of the improvement.

Building on the analysis of the problems with selecting cluster centers by the product of distance and density, this thesis further proposes a CFSFDP algorithm based on differential evolution. To reduce the effect that randomly chosen density and distance thresholds have on CFSFDP, the method uses differential evolution to search for the optimal density threshold and distance threshold. Experiments on telecom complaint text datasets show that CFSFDP based on differential evolution produces better clustering results than K-Means, CFSFDP, weighted CFSFDP, and agglomerative clustering, which demonstrates that the algorithm is effective.
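
The center selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the local density uses the cutoff kernel from the original Rodriguez and Laio formulation, the inputs are assumed to be numeric vectors (for complaint texts these could be, for example, TF-IDF vectors), and the `distance_weight` exponent is a hypothetical way of giving distance more influence, since the abstract does not state the weighting formula. With `distance_weight=1.0` the score reduces to the plain density-distance product.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length numeric tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def density_peaks_scores(points, d_c, distance_weight=1.0):
    """Return (rho, delta, gamma), one value per point.

    rho[i]   : number of other points closer than the cutoff distance d_c
    delta[i] : distance to the nearest point with strictly higher density;
               for a point of maximal density, its largest distance to any point
    gamma[i] : rho[i] * delta[i] ** distance_weight
               (the exponent is a hypothetical weighting, assumed for illustration)
    """
    dist = pairwise_distances(points)
    n = len(points)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d ** distance_weight for r, d in zip(rho, delta)]
    return rho, delta, gamma


def pick_centers(gamma, k):
    """Indices of the k points with the largest gamma score."""
    return sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)[:k]


# Toy data: two well-separated groups, each a square with a point in the middle.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta, gamma = density_peaks_scores(points, d_c=0.8, distance_weight=2.0)
centers = pick_centers(gamma, k=2)
```

In this sketch both `d_c` and the number of centers `k` are picked by hand. These are the kind of threshold choices that the differential evolution variant described above is meant to search for automatically.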
Keywords/Search Tags:text clustering, telecom complaints, density, distance, weighted, differential evolution