Text Emotional Classification Based On Text Mining

Posted on: 2017-02-05, Degree: Master, Type: Thesis
Country: China, Candidate: Q Z Xu, Full Text: PDF
GTID: 2278330485450734, Subject: Statistics
Abstract/Summary:
With the progress of information technology, massive amounts of data have appeared, a phenomenon known as big data. Big data contains a great deal of value. The financial industry and the rapid development of internet technology have produced a large volume of finance-related data online, much of it stored as text. Extracting value from such data through analysis is the trend of the times.

Text mining is an effective method for analyzing text data. It covers the collection of text data, the extraction of text information, and the modeling of text data, among other research topics. This paper uses text mining techniques and focuses on classification methods for a practical, commonly encountered problem: classifying the sentiment of "shares" text data such as comments and posts. To analyze the "shares" text data, the paper converts unstructured data into structured data and then models it. The most important step is obtaining a set of text features together with the corresponding share information, and then building classification models on that data.

For modeling the text feature data, the paper takes a data-driven approach. Using the text feature data set, it searches for suitable classification models through cross-validation experiments, and it proposes applying nonparametric statistical tests to the cross-validation results to evaluate the models' generalization ability. A model with suitable generalization ability and stable performance is then selected to classify the text feature data, which completes the classification of the "shares" text data. Specifically, the paper proposes a method for comparing the generalization ability of two models: a nonparametric test suited to paired data is applied to the models' multi-fold cross-validation results to determine whether the difference between the two models is robust. This is also the innovation of the paper.

The first chapter presents the research background, research questions, research content, and significance of the research. The paper then presents the modeling approach and the theory used in the study: classification models, dimension reduction, classification of unbalanced data, multi-fold cross-validation, and nonparametric tests. Next, it models and analyzes the practical "shares" text feature data and draws conclusions. Finally, it summarizes the research conclusions and shortcomings and outlines future research directions.
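
The model-comparison idea described above can be illustrated with a small sketch. The abstract does not name the specific paired nonparametric test, so the code below assumes the Wilcoxon signed-rank test as one plausible choice. The function names and the per-fold accuracy values are hypothetical and chosen only for illustration; they are not results from the thesis.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(scores_a, scores_b):
    """Exact two-sided Wilcoxon signed-rank test on paired per-fold scores.

    Returns (w_plus, p_value). Folds with zero difference are dropped.
    Assumption: this test stands in for the unnamed paired test in the abstract.
    """
    # Round to suppress floating-point noise so equal differences tie properly.
    diffs = [round(a - b, 10) for a, b in zip(scores_a, scores_b)]
    diffs = [d for d in diffs if d != 0]
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2  # mean of W+ under the null hypothesis
    observed_dev = abs(w_plus - expected)
    # Enumerate all 2^n sign assignments; cheap for about 10 folds.
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / total


# Hypothetical accuracies of two classifiers on the same 10 folds.
model_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
model_b = [0.78, 0.80, 0.80, 0.77, 0.79, 0.79, 0.80, 0.81, 0.78, 0.80]
w_plus, p_value = wilcoxon_signed_rank(model_a, model_b)
print(f"W+ = {w_plus}, two-sided p = {p_value:.4f}")
```

Because both models are scored on the same folds, the scores are paired, and a small p-value suggests that the difference in generalization ability is unlikely to come from the particular fold split alone. The exact enumeration is practical only for a small number of folds; a normal approximation or a library routine would be the usual choice for larger samples.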
Keywords/Search Tags: Text classification, Feature of text, Unbalanced data, Dimension reduction, Cross validation, Nonparametric hypothesis test