Research On Text Emotion Classification Based On Rough Set

Posted on: 2020-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Han
Full Text: PDF
GTID: 2428330575965495
Subject: Engineering
Abstract/Summary:
With the development of social informatization and artificial intelligence, large volumes of data are constantly being generated, and they contain a variety of valuable information worth exploring. How to classify data effectively is an important issue in data processing. Users comment on events and express their opinions through social media such as Douyin, Weibo and Zhihu. However, some comments are vague, and their emotional orientation (positive or negative) cannot be identified effectively from the words alone. Rough set theory is well suited to the emotion classification of such semantically vague texts. Building on rough set theory, this thesis proposes a rough-set-based method for text emotion classification. The specific research contents are as follows.

(1) Establishing the text decision table. An improved DF-based text feature selection algorithm is proposed. Features are selected from texts containing emotional words, the tendency intensity of each emotional word is calculated, and a method based on word association degree is applied to compute the weights. An emotional text decision table is then built by combining the text feature words with the emotional tendency intensity of the words. The algorithm reduces the dimensionality of the original text features and allows texts to be classified with the decision table. Experiments show that a decision table with emotional weights improves classification accuracy.

(2) Discretizing the decision table. A new discretization method based on minimum breakpoint partition is proposed to discretize the emotional text decision table, whose attributes carry the weights of emotional words. The main task is to find the minimum breakpoint in the discretization. This method effectively reduces the redundancy of the text decision table while keeping the classification accuracy unchanged.

(3) Attribute reduction of the decision table and confidence-based classification with rough decisions. Finding the complete attribute reduction of a decision table is an NP-hard problem. After cleaning redundant data from the original decision table, this thesis proposes a fast attribute reduction algorithm that reduces the attributes of the text decision table and improves its classification ability after reduction. The reduced decision table is used for classification through the membership function and confidence function of rough sets: the reduced, discretized decision table is applied to each text to be classified to label it as positive or negative.

Finally, a text classification system is designed and implemented. Experiments show the effectiveness of the method.
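As a rough illustration of the rough-set machinery the abstract relies on, the following Python sketch computes the lower and upper approximations of the positive class and the standard rough membership value for a new text. It is a minimal sketch under stated assumptions, not the thesis's algorithm: the toy decision table, the attribute names `good` and `slow`, and the helper names are all hypothetical, and the attribute values stand in for already discretized emotional word weights.

```python
from collections import defaultdict

# Hypothetical toy decision table: each row is a text, the condition
# attributes are discretized word weights, "label" is the decision attribute.
table = {
    "t1": {"good": 1, "slow": 0, "label": "pos"},
    "t2": {"good": 1, "slow": 0, "label": "pos"},
    "t3": {"good": 0, "slow": 1, "label": "neg"},
    "t4": {"good": 1, "slow": 1, "label": "pos"},
    "t5": {"good": 1, "slow": 1, "label": "neg"},
    "t6": {"good": 1, "slow": 1, "label": "pos"},
}
attrs = ["good", "slow"]


def equivalence_classes(table, attrs):
    """Group texts that are indiscernible on the chosen condition attributes."""
    blocks = defaultdict(set)
    for obj_id, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj_id)
    return blocks


def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of the target set."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attrs).values():
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def rough_membership(table, attrs, target, sample):
    """Share of the sample's equivalence class that belongs to the target.

    Returns None when no training text matches the sample's attribute values.
    """
    key = tuple(sample[a] for a in attrs)
    block = equivalence_classes(table, attrs).get(key, set())
    if not block:
        return None
    return len(block & target) / len(block)


positive = {t for t, row in table.items() if row["label"] == "pos"}
lower, upper = approximations(table, attrs, positive)
print(sorted(lower), sorted(upper))
print(rough_membership(table, attrs, positive, {"good": 1, "slow": 1}))
```

Texts in the lower approximation are certainly positive, while those in the upper but not the lower approximation form the boundary region, where the wording alone is ambiguous. For a boundary text, the membership value can serve as a confidence score for the positive label; how the thesis defines its own confidence function is not stated in the abstract.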
Keywords/Search Tags:rough set, text classification, attribute reduction, discretization, feature selection, rough decision