Text Matrix Model Oriented To Professional Fields

Posted on: 2013-09-28  Degree: Master  Type: Thesis
Country: China  Candidate: Q Q Zheng  Full Text: PDF
GTID: 2248330395475295  Subject: Software engineering
Abstract/Summary:
With the advance of information technology and the growing popularity of the Internet, existing real-world information such as newspapers and books is being moved onto the network, and the network as a whole has accumulated into an unprecedentedly large database. How to obtain the required information quickly from this vast information space has become one of the most fundamental problems of the new information age. Text classification automatically categorizes texts according to their content, which helps people better understand and mine text and improves the quality of information services. The primary problem in text classification is how to convert text data into mathematical data. Today, most text classification algorithms represent text with the Vector Space Model. However, this method takes the single word as the feature item and ignores text structure. Words at different locations in a text differ in their ability to represent it; ignoring this leaves many redundant words in the representation, which severely reduces the accuracy of text information processing and also leads to the high-dimensional sparsity problem. These problems greatly affect the speed of text classification. The main work of this thesis is to propose a new text classification algorithm for professional fields, the Text Matrix Model. Building on a study of the commonly used text representation methods above, the thesis puts forward the text matrix model, into which structural features are introduced. Starting from the vector space model, it takes the location of a feature item as an important evaluation parameter and establishes a matrix-based text representation model. Experimental results show that the new algorithm can improve classification accuracy.
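The contrast between the two representations can be illustrated with a minimal Python sketch. The abstract does not specify how the text matrix is laid out, so everything here is an assumption made for illustration: the function names `tfidf_vectors` and `text_matrix`, the zone names `title` and `body`, and the use of raw term counts as matrix entries are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Vector space model: one TF-IDF weight per term, location ignored.

    docs is a list of token lists. Uses tf = count / doc length and
    idf = ln(N / document frequency), one common TF-IDF variant.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def text_matrix(zones, zone_names):
    """Hypothetical text matrix: rows are terms, columns are location zones.

    zones maps a zone name to the token list found in that zone.
    Each entry is the raw count of the term in that zone (an assumption).
    """
    terms = sorted({t for name in zone_names for t in zones.get(name, [])})
    matrix = [
        [zones.get(name, []).count(term) for name in zone_names]
        for term in terms
    ]
    return terms, matrix


# Vector space model: each document becomes a flat bag of weighted words.
docs = [["text", "matrix", "model"], ["text", "vector", "model"], ["matrix", "rank"]]
vecs = tfidf_vectors(docs)

# Matrix representation: the same kind of tokens, but split by location.
zones = {"title": ["text", "matrix"], "body": ["text", "model", "text"]}
terms, matrix = text_matrix(zones, ["title", "body"])
```

In this sketch a term that appears in the title and a term that appears only in the body occupy different columns, so a later weighting step could treat them differently, which is the kind of location information a flat TF-IDF vector discards.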
Keywords/Search Tags: Text Matrix Model, TFIDF, Text Representation Model, Text Classification