
Research On Micro-blog New Words And New Words Emotional Orientation

Posted on: 2017-06-16  Degree: Master  Type: Thesis
Country: China  Candidate: X Z Bian  Full Text: PDF
GTID: 2348330488966911  Subject: Computer application technology
Abstract/Summary:
The topics discussed on today's social networks often reflect the direction of current public opinion. Micro-blog data is voluminous and heterogeneous, so extracting useful information from it places high demands on natural language processing technology. Opinion mining, for example, extracts users' attitudes or emotions toward current affairs, trending topics, and products, and the results can support government and corporate decision-making. New words, a product of the new era, are popular for their brevity and are widely used, and their emotional tendency affects the results of text sentiment analysis. This thesis therefore studies new word discovery in social networks and the emotional tendency of new words.

Word segmentation is a basic requirement for these tasks. Segmentation accuracy directly affects the results of the analysis, and new words are a main cause of segmentation errors. New word discovery can therefore be regarded as one of the basic tasks in Chinese natural language processing. Researchers have studied the discovery of Chinese new words extensively and have achieved some results. For new word discovery, this thesis analyzes and summarizes the rules of Chinese word formation together with the relevant probability and statistics. A new word discovery system is implemented by combining improved mutual information, the degree of cohesion (coagulation), and statistical methods. Once the discovered new words are added to the word segmentation system as a user dictionary, the performance of the segmentation system improves.

This thesis also proposes a word-vector-based method for judging the sentiment orientation of new words. The words in the text are converted to word vectors, the average vector of a new word is obtained, and its distances to the word vectors of all annotated emotion words in the dictionary are summed. If the resulting value is greater than a predetermined threshold, the emotional tendency is positive; otherwise it is negative.

The new word discovery method in this thesis can effectively find new words and greatly improves the accuracy of the word segmentation system. The sentiment orientation method can judge the emotional tendency of new words and is highly portable.
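As a rough illustration of the mutual-information side of new word discovery, the sketch below scores character n-grams by plain pointwise mutual information (PMI). This is not the thesis's "improved" mutual information, whose details the abstract does not give. The function names, the thresholds, and the choice to normalize every count by the text length are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(text, max_len):
    """Count every character n-gram of length 1..max_len in text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cohesion(candidate, counts, total_chars):
    """Minimum PMI (base 2) over all two-way splits of candidate.

    Assumption: every probability is estimated as count / total_chars,
    a common simplification. candidate must have length >= 2.
    """
    p_whole = counts[candidate] / total_chars
    scores = []
    for k in range(1, len(candidate)):
        p_left = counts[candidate[:k]] / total_chars
        p_right = counts[candidate[k:]] / total_chars
        scores.append(math.log2(p_whole / (p_left * p_right)))
    return min(scores)


def find_candidates(text, max_len=4, min_count=2, min_cohesion=3.0):
    """Return {ngram: cohesion} for frequent, tightly bound n-grams.

    The thresholds are hypothetical defaults, not values from the thesis.
    """
    counts = ngram_counts(text, max_len)
    total = len(text)
    found = {}
    for gram, count in counts.items():
        if len(gram) >= 2 and count >= min_count:
            score = cohesion(gram, counts, total)
            if score >= min_cohesion:
                found[gram] = score
    return found
```

A string whose characters co-occur far more often than chance predicts gets a high score and becomes a candidate word. A practical system would typically combine this with further statistics and filter out strings already in the segmenter's dictionary.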
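The threshold rule for sentiment orientation can be sketched as follows. The abstract does not specify the distance measure or how positive and negative dictionary entries are combined. This sketch therefore assumes cosine similarity, signed by each emotion word's annotated polarity and averaged over the lexicon. The names `orientation_score` and `classify` and the default threshold of 0 are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def orientation_score(word_vec, lexicon):
    """Mean of polarity * cosine similarity over the annotated lexicon.

    lexicon is a non-empty list of (vector, polarity) pairs, where
    polarity is +1 for a positive emotion word and -1 for a negative
    one. This scoring is an assumed reading of the abstract.
    """
    total = 0.0
    for vec, polarity in lexicon:
        total += polarity * cosine(word_vec, vec)
    return total / len(lexicon)


def classify(word_vec, lexicon, threshold=0.0):
    """Label a new word positive if its score exceeds the threshold."""
    score = orientation_score(word_vec, lexicon)
    return "positive" if score > threshold else "negative"
```

In practice the vectors would come from a word embedding model trained on the segmented micro-blog corpus, and the threshold would be tuned on annotated data.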
Keywords/Search Tags: Web crawler, New word discovery, Word vector, Emotional judgment