
Feature Analysis And Recognition Of Hype Micro-Blog

Posted on: 2017-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: E X Wang
GTID: 2308330485978205
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth in the number of micro-blog users, micro-blog hype is becoming increasingly serious. The "network navy" (paid posters) and "Internet marketers" spread rumors and false information through micro-blogs, seriously disturbing order online. At present, research on hype micro-blogs is concentrated in the fields of micro-blog political inquiry and communication ethics, and little work addresses the in-depth recognition of hype micro-blogs. This thesis therefore focuses on extracting the typical features of hype micro-blogs and constructing efficient classifiers for them.

Feature extraction and the classification algorithm both strongly affect recognition accuracy. The thesis first analyzes and extracts the typical features of micro-blogs. Then, starting from the classical SVM (Support Vector Machine) and combining it with PSO (Particle Swarm Optimization) and GA (Genetic Algorithm), it constructs PSO-SVM and GA-SVM classifiers, in which PSO and GA replace the random selection of SVM parameters, and chooses the better of the two to identify hype micro-blogs. The specific contents are as follows:

First, the thesis describes the research background of hype micro-blogs, the state of research at home and abroad, and the research content.

Secondly, the principle of the traditional support vector machine classification algorithm and the choice of kernel function are introduced, and the optimization principles of the particle swarm algorithm and the genetic algorithm are described. These form the basis for optimizing the SVM kernel parameter and penalty factor.

Thirdly, based on micro-blog information gathered with crawling tools, the characteristics of micro-blogs are analyzed by drawing propagation diagrams, scatter diagrams, and cumulative distribution function curves with MATLAB and Pajek. The modularity and the average shortest path are then extracted using, respectively, the Fast-Newman algorithm, which improves on the traditional modularity algorithm, and the Floyd algorithm. To counter the interference that the celebrity effect causes in the accuracy of hype micro-blog recognition, the key attributes of the user are further analyzed and extracted.

Fourthly, because parameter selection in the traditional SVM is random and time-consuming, PSO-SVM and GA-SVM classifiers are constructed by using PSO and GA, respectively, to optimize the SVM parameters. After the classifiers are trained with the feature vectors and compared, a classification model with high test accuracy is obtained.

Finally, the thesis defines performance evaluation indices for the classifier and compares and analyzes the experimental results. The results show that the PSO-SVM classifier, combined with the six-dimensional feature vector extracted from micro-blogs, largely overcomes the interference caused by the celebrity effect and identifies hype micro-blogs effectively. The final classification accuracy reaches above 90%, the false alarm rate is below 1%, and the F1 value exceeds 90%. It is concluded that PSO clearly improves SVM parameter optimization and that the PSO-SVM classification model is the better of the two, effectively solving the problem of recognizing hype micro-blogs.
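
The abstract reports accuracy, false alarm rate, and F1 value but does not give the thesis's own definitions. The following is a minimal sketch that assumes the standard confusion-matrix definitions, with hype micro-blogs as the positive class and false alarm rate taken as FP / (FP + TN). The function name `classification_metrics` and the label encoding (1 = hype, 0 = normal) are illustrative assumptions, not taken from the thesis.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, false alarm rate, and F1 from two label sequences.

    Assumes the standard definitions; the thesis may define them differently.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # hype post correctly flagged
        elif pred == positive:
            fp += 1          # normal post wrongly flagged (false alarm)
        elif truth == positive:
            fn += 1          # hype post missed
        else:
            tn += 1          # normal post correctly passed

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False alarm rate: share of normal posts that were flagged as hype.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "false_alarm_rate": false_alarm_rate,
        "f1": f1,
    }
```

Reporting the false alarm rate alongside accuracy matters when hype posts are a minority of the data, because accuracy alone can look high even when many normal posts are flagged.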
Keywords/Search Tags: Hype micro-blog, Support vector machine, Particle swarm algorithm, Genetic algorithm, Celebrity effect, Feature vector
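
The parameter search named in the abstract and keywords, particle swarm optimization over the SVM penalty factor and kernel parameter, can be sketched roughly as below. This is not the thesis's implementation: the abstract does not state the fitness function, swarm size, or coefficients. Here a toy error surface (`toy_error`, hypothetical) stands in for a cross-validated SVM error evaluated at (log10 C, log10 gamma), and the inertia weight `w` and acceleration coefficients `c1`, `c2` are assumed common default values.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(x):
    """Hypothetical stand-in for SVM validation error at (log10 C, log10 gamma).

    Its minimum is placed arbitrarily at C = 10, gamma = 0.01.
    """
    log_c, log_gamma = x
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


if __name__ == "__main__":
    best, err = pso_minimize(toy_error, [(-2.0, 4.0), (-5.0, 1.0)])
    print("log10(C), log10(gamma):", best, "error:", err)
```

To use this with a real classifier, `toy_error` would be replaced by a function that trains an SVM with the candidate parameters and returns its validation error. Searching in log space is a design choice made here because useful values of the penalty factor and kernel parameter typically span several orders of magnitude.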