
Research Of Subject Classification Model Of Network Public Opinion Based On Bayesian Theory

Posted on: 2015-06-06 | Degree: Master | Type: Thesis
Country: China | Candidate: H Xu
GTID: 2298330422488652 | Subject: Management Science and Engineering
Abstract/Summary:
With the spread of the Internet, the number of Internet users keeps growing. Many people follow public opinion online: they browse the topics that interest them and post comments to express their emotions. However, network public opinion information is so varied that users often browse without clear direction. At present, major web portals, forums and similar sites organize network public opinion by subject, but this organization remains fairly abstract. Classifying network public opinion by subject therefore not only makes it easier for users to browse public opinion news, but also supports effective early warning of network public opinion.

There are several approaches to Chinese text classification, of which common methods include Naive Bayesian, KNN and SVM (support vector machine). This thesis studies subject classification of network public opinion using Naive Bayesian, which has a simple structure and classifies efficiently. However, the conditional independence assumption of Naive Bayesian limits its scope of application and reduces classification accuracy. In addition, as network public opinion information keeps growing, the method has to relearn in order to update its prior information, and each update involves all of the text, so it lacks flexibility.

To address these problems, this thesis optimizes the Naive Bayesian classification method using an incremental learning mechanism and dynamic reduction, and, in combination with text mining technology, proposes an optimized subject classification model for network public opinion.
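For reference, the Naive Bayesian decision rule and the conditional independence assumption discussed above take the following standard form. The symbols (a text with feature words w_1, ..., w_n, a set of subject classes C) are notation assumed here for illustration and are not taken from the thesis.

```latex
% Conditional independence assumption: given the class c,
% the feature words of a text are treated as independent.
P(w_1, \dots, w_n \mid c) = \prod_{i=1}^{n} P(w_i \mid c)

% Resulting decision rule: pick the class with the largest posterior.
\hat{c} = \arg\max_{c \in C} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

When feature words are strongly correlated, the product on the right misestimates the joint probability, which is the accuracy limitation the abstract refers to.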
The research of this thesis focuses mainly on the following aspects:
1. Information collection for network public opinion text: collecting information with web crawler technology, parsing and extracting public opinion information by combining an HTML parser with website purification technology, and representing network public opinion text with an optimized feature weighting method to improve the accuracy of the text representation.
2. Optimizing the Naive Bayesian classification method to improve classification accuracy by using an incremental learning mechanism and (F-λ) generalized dynamic reduction. By introducing the dynamic reduction precision coefficient λ, (F-λ) generalized dynamic reduction reduces the number of texts involved in attribute reduction, relaxes the conditional independence assumption, reduces computational complexity and improves classification accuracy. With incremental learning, Naive Bayesian no longer needs to relearn all information to update its prior information when classifying a growing body of network public opinion subjects. During incremental learning, the kind-set confidence is introduced to prevent noisy classification results from joining the original training set and reducing the accuracy of the classifier.
3. Comparing, through experimental data analysis, four classification algorithms: the algorithm with neither incremental learning nor dynamic reduction, the incremental classification algorithm, the dynamic reduction classification algorithm, and the algorithm proposed in this thesis that combines incremental learning with dynamic reduction. The comparison examines the effectiveness of the optimized network public opinion subject classification algorithm and establishes its feasibility.
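The incremental learning idea in item 2 can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not define the kind-set confidence, so the sketch assumes a plain posterior-probability threshold in its place, and it omits (F-λ) generalized dynamic reduction entirely. The class name `IncrementalNaiveBayes`, the method names and the `threshold` parameter are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes that stores raw counts so it can be updated in place."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_docs = Counter()              # number of documents seen per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()                       # all words seen so far

    def update(self, tokens, label):
        """Fold one labelled document into the counts without revisiting old texts."""
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def posteriors(self, tokens):
        """Return P(class | tokens) for every class seen so far (needs >= 1 update)."""
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        log_scores = {}
        for c, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[c].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                # Smoothed likelihood under the conditional independence assumption.
                score += math.log(
                    (self.word_counts[c][t] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            log_scores[c] = score
        # Normalize in log space to avoid underflow on long texts.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: e / norm for c, e in exp_scores.items()}

    def learn_if_confident(self, tokens, threshold=0.9):
        """Classify an unlabelled text; absorb it only if the posterior is high enough.

        ASSUMPTION: the threshold stands in for the thesis's kind-set confidence.
        """
        post = self.posteriors(tokens)
        label = max(post, key=post.get)
        accepted = post[label] >= threshold
        if accepted:
            self.update(tokens, label)
        return label, accepted


# Toy usage with made-up tokens and class names.
demo = IncrementalNaiveBayes()
demo.update(["match", "goal", "team"], "sports")
demo.update(["stock", "market", "bank"], "finance")
demo_label, demo_accepted = demo.learn_if_confident(["goal", "team"], threshold=0.75)
```

Because only counts are stored, absorbing a new text costs time proportional to its length, and low-confidence predictions are kept out of the training counts so that likely misclassifications do not degrade the classifier.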
Keywords/Search Tags: network public opinion, dynamic reduction, incremental learning, Naive Bayesian, subject classification