Research On Source Tracing Method of Spread SMS Based On Clustering Analysis

Posted on: 2017-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: J L Gan
Full Text: PDF
GTID: 2348330503472473
Subject: Computer technology
Abstract/Summary:
Short Message Service (SMS) is a simple and practical communication tool that brings great convenience to daily life and work, but it also gives malicious actors a convenient channel for spreading harmful information. Because SMS data is voluminous and heterogeneous, manually inspecting it to identify propagation paths is time-consuming and often futile. To reconcile the prohibitive cost of manual inspection with the need to curb the spread of illegal information and inflammatory rumors, this research clusters SMS data and provides a method for pinpointing the source of unwanted SMS texts efficiently.

To trace the source of disruptive short messages, the proposed method analyzes the similarity of SMS text pairs, clusters the SMS data, eliminates candidates that are unlikely to be malicious, and restores the propagation paths by constructing directed graphs.

Considering that SMS texts are short, informal, and easily altered, the proposed method computes the similarity between SMS text pairs using a sentence-wise text edit distance. Experiments comparing it with existing methods demonstrate its effectiveness.

The K-medoids algorithm is adopted to cluster similar SMS texts into subgroups. To address the algorithm's sensitivity to the initial solution and its tendency to produce sub-optimal solutions, improvements are proposed to optimize the selection of medoids. Experiments comparing the improved algorithm with existing methods demonstrate its effectiveness.

Finally, the features of the resulting clusters are analyzed to eliminate SMS text candidates that are unlikely to be malicious. Directed graphs are then constructed to visualize the propagation paths of the remaining SMS texts, which accomplishes the task of SMS source tracing.
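The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the exact similarity formula, the improved medoid selection, or the record format, so everything below is an assumption made for illustration. In particular, `sms_similarity` averages best-match normalized edit-distance scores over sentences, `k_medoids` uses a plain alternating update with a deterministic farthest-first initialization (a stand-in for the thesis's unspecified improvement), and `propagation_sources` assumes hypothetical `(sender, receiver, timestamp)` records for one cluster.

```python
import re
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def split_sentences(text: str) -> List[str]:
    """Split on common Latin and CJK sentence punctuation (an assumption)."""
    parts = re.split(r"[.!?;。！？；]+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def sentence_similarity(s: str, t: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical."""
    if not s and not t:
        return 1.0
    return 1.0 - edit_distance(s, t) / max(len(s), len(t))


def sms_similarity(a: str, b: str) -> float:
    """Symmetric average of each sentence's best match in the other text."""
    sa, sb = split_sentences(a), split_sentences(b)
    if not sa or not sb:
        return 1.0 if sa == sb else 0.0

    def directed(xs: List[str], ys: List[str]) -> float:
        return sum(max(sentence_similarity(x, y) for y in ys) for x in xs) / len(xs)

    return (directed(sa, sb) + directed(sb, sa)) / 2


def k_medoids(dist: List[List[float]], k: int,
              max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Alternating K-medoids on a precomputed distance matrix.

    Initialization (assumed, not from the thesis): the most central point
    first, then repeatedly the point farthest from the chosen medoids.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of texts")
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds: List[int]) -> List[int]:
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    labels = assign(medoids)
    for _ in range(max_iter):
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # An empty cluster keeps its old medoid.
            updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members))
                           if members else medoids[c])
        if updated == medoids:
            break
        medoids, labels = updated, assign(updated)
    return medoids, labels


def propagation_sources(records: List[Tuple[str, str, float]]):
    """Build directed edges in time order and list candidate sources.

    A sender counts as a candidate source if it sends the text before
    receiving it from anyone in the records (a simplifying assumption).
    """
    ordered = sorted(records, key=lambda r: r[2])
    received, sources = set(), []
    for sender, receiver, _ in ordered:
        if sender not in received and sender not in sources:
            sources.append(sender)
        received.add(receiver)
    return [(s, r) for s, r, _ in ordered], sources
```

To cluster a batch of texts under these assumptions, one would build the matrix `dist[i][j] = 1 - sms_similarity(texts[i], texts[j])`, pass it to `k_medoids`, and then call `propagation_sources` on the send records of each cluster that survives filtering.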
Keywords/Search Tags: SMS text, text similarity, K-medoids, SMS backtracking