Establishing Chinese Text Labels for the Mayor's Public Telephone

Posted on: 2011-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: X M Zhang
GTID: 2208360305473358
Subject: Applied Mathematics
Abstract/Summary:
With the rapid development of computer networks and the continuing growth of public awareness of suffrage and self-protection, information processing has become increasingly important as a means of obtaining useful information. Many cities have established mayor's public access lines, and as a result governments and institutions accumulate a large volume of documents every day. Classifying these documents manually is too inefficient to keep up with newly arising problems, and with the ongoing adjustment of government functions there is an urgent need for a text categorization method that is both timely and accurate and that can accommodate newly created institutions. Automated text categorization, the automatic assignment of natural language texts to predefined categories based on their content, is one of the active topics and key techniques in information retrieval and data mining, and text categorization based on machine learning is a task of increasing importance. Motivated by practical problems in the Changchun mayor's public access line project, this thesis introduces the definition of automated text categorization and the mayor's public telephone system, and surveys several key techniques for text categorization, including word segmentation, feature selection, and feature extraction. It mainly discusses how to label documents using semi-supervised learning, including the EM algorithm, random forests, and the boosting algorithm. The three classification approaches are implemented in C++ and their results are analyzed.
Keywords/Search Tags: categorization, mayor's public telephone, semi-supervised learning
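
The abstract names the EM algorithm as one way to label documents with semi-supervised learning but gives no details. As a rough illustration only, the sketch below shows one common form of that idea: a multinomial naive Bayes classifier trained with EM on a few labeled documents plus unlabeled ones. This is not the thesis's implementation. The choice of naive Bayes, the Laplace smoothing, and every name in the code (`Doc`, `NaiveBayes`, `fit`, `posterior`, `semiSupervisedEM`, `predict`) are assumptions made for the example, and documents are assumed to be already segmented into words.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: a document is a bag of already-segmented words.
using Doc = std::vector<std::string>;

struct NaiveBayes {
    std::vector<double> logPrior;                            // log P(class)
    std::vector<std::map<std::string, double>> wordCount;    // soft word counts per class
    std::vector<double> totalCount;                          // total soft word count per class
    std::size_t vocabSize = 0;
};

// M-step: estimate parameters from documents carrying (possibly soft) class weights.
NaiveBayes fit(const std::vector<Doc>& docs,
               const std::vector<std::vector<double>>& weights,
               std::size_t numClasses) {
    assert(docs.size() == weights.size());
    NaiveBayes m;
    m.wordCount.resize(numClasses);
    m.totalCount.assign(numClasses, 0.0);
    std::vector<double> classMass(numClasses, 0.0);
    std::set<std::string> vocab;
    for (std::size_t i = 0; i < docs.size(); ++i) {
        for (std::size_t c = 0; c < numClasses; ++c) {
            classMass[c] += weights[i][c];
            for (const std::string& w : docs[i]) {
                m.wordCount[c][w] += weights[i][c];
                m.totalCount[c] += weights[i][c];
                vocab.insert(w);
            }
        }
    }
    m.vocabSize = vocab.size();
    for (std::size_t c = 0; c < numClasses; ++c) {
        // Laplace-smoothed class prior.
        m.logPrior.push_back(std::log((classMass[c] + 1.0) /
                                      static_cast<double>(docs.size() + numClasses)));
    }
    return m;
}

// E-step for one document: P(class | doc), normalized in log space for stability.
std::vector<double> posterior(const NaiveBayes& m, const Doc& doc) {
    const std::size_t k = m.logPrior.size();
    std::vector<double> p(k);
    for (std::size_t c = 0; c < k; ++c) {
        double lp = m.logPrior[c];
        for (const std::string& w : doc) {
            auto it = m.wordCount[c].find(w);
            const double n = (it == m.wordCount[c].end()) ? 0.0 : it->second;
            // Laplace-smoothed word likelihood.
            lp += std::log((n + 1.0) / (m.totalCount[c] + static_cast<double>(m.vocabSize)));
        }
        p[c] = lp;
    }
    const double maxLog = *std::max_element(p.begin(), p.end());
    double sum = 0.0;
    for (double& v : p) { v = std::exp(v - maxLog); sum += v; }
    for (double& v : p) { v /= sum; }
    return p;
}

// Train on labeled data, then alternate E-steps (soft-label the unlabeled
// documents) and M-steps (refit on all documents). Labeled weights stay fixed.
NaiveBayes semiSupervisedEM(const std::vector<Doc>& labeled,
                            const std::vector<std::size_t>& labels,
                            const std::vector<Doc>& unlabeled,
                            std::size_t numClasses, int iterations) {
    assert(labeled.size() == labels.size());
    std::vector<std::vector<double>> weights;
    for (std::size_t y : labels) {
        std::vector<double> oneHot(numClasses, 0.0);
        oneHot[y] = 1.0;
        weights.push_back(oneHot);
    }
    NaiveBayes model = fit(labeled, weights, numClasses);
    std::vector<Doc> all(labeled);
    all.insert(all.end(), unlabeled.begin(), unlabeled.end());
    weights.resize(all.size());
    for (int it = 0; it < iterations; ++it) {
        for (std::size_t j = 0; j < unlabeled.size(); ++j) {
            weights[labeled.size() + j] = posterior(model, unlabeled[j]);
        }
        model = fit(all, weights, numClasses);
    }
    return model;
}

// Index of the most probable class for a document.
std::size_t predict(const NaiveBayes& m, const Doc& doc) {
    const std::vector<double> p = posterior(m, doc);
    return static_cast<std::size_t>(std::max_element(p.begin(), p.end()) - p.begin());
}
```

To use the sketch, pass a small set of labeled documents with class indices and a larger set of unlabeled ones to `semiSupervisedEM`, then call `predict` on new documents. The point of the design is that words appearing only in unlabeled documents can still acquire a class association through the labeled words they co-occur with.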