
Spam Message Filtering System Based On MCNN And BiLSTM

Posted on: 2020-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: L N Chen
Full Text: PDF
GTID: 2428330590471582
Subject: Electronic and communication engineering
Abstract/Summary:
With the spread of mobile phones, SMS usage has grown rapidly. Although social platforms have since cut into its use, SMS remains an indispensable medium for daily communication because of its low price, convenient reception, and real-time delivery. More and more companies advertise through text messages to increase the reach of their products. On the one hand, SMS has made daily life more convenient. On the other hand, the abuse of spam messages has long plagued users and caused real harm to a harmonious society. To give users a clean SMS communication environment, research on filtering spam messages is both necessary and urgent. This thesis focuses on applying deep learning models and text categorization techniques to SMS filtering. First, at the model input, the noise content specific to SMS data is matched and replaced with normal text, which lays the foundation for later steps such as feature selection. The traditional TF-IDF feature selection algorithm neglects how feature words are distributed among different classes and within a given class, so an improved algorithm based on TF-IDF is proposed. To address the feature sparsity caused by short SMS texts, a feature expansion method based on word vectors is used, and a feature reduction method is applied to long SMS texts. Together, these steps not only avoid the sparse-feature problem of short messages but also reduce the hardware and software resources consumed during model training. A BiLSTM-MCNN model is then trained, where MCNN is an improved model based on the convolutional neural network. Fusing BiLSTM with MCNN not only enriches the semantic features but also captures the long-distance dependencies in text messages. Finally, the classification result is obtained by combining the deep learning model with customized SMS features. The training, validation, and test sets used in the experiments all consist of data from users' real lives. Several comparative experiments are carried out to demonstrate the effectiveness of the improved algorithm and the deep learning model. The evaluation measures are accuracy, recall, and F1-score. The results show that the feature selection algorithm and the deep learning model proposed in this thesis improve the performance of the spam SMS filtering system. The SMS filtering results are presented using the Flask framework.
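
The evaluation measures named in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's own evaluation code: it assumes a binary labeling in which 1 marks spam and 0 marks a normal message, and the function name binary_metrics is hypothetical. Precision is included because the F1-score is by definition the harmonic mean of precision and recall.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the positive class.

    Assumption: labels are 1 for spam and 0 for normal messages.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive and truth != positive:
            fp += 1  # normal message wrongly flagged as spam
        elif pred != positive and truth == positive:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # normal message correctly passed

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for illustration only, not data from the thesis.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, predicted))
```

For a spam filter, recall reflects how much spam is caught, while precision reflects how rarely a normal message is blocked, which is why F1 is usually reported alongside accuracy.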
Keywords/Search Tags: SMS filtering, text classification, feature extension, feature reduction, deep learning