| With the development of the Internet, the network has become an important channel for people to express their views and attitudes, and network public opinion now plays an important role in public opinion as a whole. Research on network public opinion is attracting growing attention from both academia and industry, and it extends in several directions, such as sensitive word analysis, opinion classification, and sentiment analysis. Sentiment analysis is the field of study that analyzes people’s opinions, sentiments, and attitudes towards entities such as products, organizations, and events. In other words, the goal is to determine whether a sentence or a text expresses a positive, negative, or neutral attitude. Sentiment analysis can uncover people’s attitudes towards social events or products, forecast the trend of events, and improve the accuracy of information filtering. Generally, four types of websites carry opinions: news sites, blogs, forums, and microblogs. Each type has its own characteristics in both form and content. In this thesis, sentiment analysis algorithms and solutions based on these characteristics are proposed and improved, and then a network public opinion sentiment analysis system is designed and implemented. The main contents are as follows: 1. A Chinese sentiment lexicon expansion method based on translation is proposed. 2. For news and blog texts, a sentiment classification algorithm based on the support vector machine (SVM) is used. The news and blog texts are divided into three categories: positive, negative, and neutral. 3. Forum texts are divided into two categories, main posts and replies. For the main posts, the SVM-based sentiment classification algorithm is used. For the replies, a sentiment analysis algorithm based on fuzzy matching and a weighted calculation of each sentence’s emotional value is proposed. 4. For microblogging texts, a two-step classification method is used.
First, a classification algorithm based on weighted calculation is proposed to classify microblogging texts into two categories, subjective and objective. Then the subjective texts are classified by a sentiment classification algorithm based on the naïve Bayes (NB) classifier. 5. On the basis of the above study, combined with web crawler technology, web page content extraction technology, and web technology, a network public opinion sentiment analysis system (NPOSAS) is designed and implemented. The test results show that the system achieves 71.7% accuracy for news texts, 69.3% for blog texts, 64.0% for forum texts, and 65.1% for microblogging texts. |
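
The two-step scheme for microblogging texts in item 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the cue words, their weights, the threshold, the toy training sentences, and all function and class names are hypothetical, English whitespace tokenization stands in for Chinese word segmentation, and the second step is assumed here to distinguish only positive from negative texts.

```python
import math
from collections import Counter, defaultdict

# Hypothetical weighted cues of subjectivity (the thesis's real features
# and weights are not given in the abstract).
SUBJECTIVE_WEIGHTS = {"love": 1.0, "hate": 1.0, "great": 0.8,
                      "awful": 0.8, "think": 0.5, "!": 0.3}
SUBJECTIVITY_THRESHOLD = 0.5  # assumed cut-off


def tokenize(text):
    # Whitespace tokenization for illustration only; Chinese text would
    # need a word segmenter instead.
    return text.lower().replace("!", " ! ").split()


def is_subjective(text):
    """Step 1: sum the weights of subjective cues and apply a threshold."""
    score = sum(SUBJECTIVE_WEIGHTS.get(tok, 0.0) for tok in tokenize(text))
    return score >= SUBJECTIVITY_THRESHOLD


class NaiveBayes:
    """Step 2: multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def classify_microblog(text, model):
    # Objective texts are filtered out before sentiment classification.
    if not is_subjective(text):
        return "objective"
    return model.predict(text)


# Toy training data, invented for this sketch.
TRAIN = [
    ("i love this phone it is great", "positive"),
    ("great service i love it", "positive"),
    ("i hate this awful phone", "negative"),
    ("awful service i hate it", "negative"),
]
model = NaiveBayes().fit(*zip(*TRAIN))
```

Calling `classify_microblog(text, model)` returns `"objective"` for texts whose cue score falls below the threshold and otherwise the label chosen by the naive Bayes step. Filtering objective texts first keeps factual posts out of the sentiment classifier, which is the motivation for a two-step design.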