
The Research Of Chinese Web Text Orientation Classification

Posted on: 2009-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: D L Dan
Full Text: PDF
GTID: 2178360242474637
Subject: Information networks and security
Abstract/Summary:
The 21st century is an era of explosive information growth. With the rapid development of the Internet, more and more information exists as electronic documents, and most of these documents are unorganized. Automatic document classification addresses this problem: it organizes and manages information effectively and helps users find what they need quickly, accurately, and comprehensively. Classifying text by its orientation is a research hotspot and an important aspect of the security of online public opinion.

Conventional document classification systems classify documents by topic, such as military affairs, medicine, or sports. These systems cannot identify a document's orientation. To help maintain network security, we developed a Web document orientation classification system. Its main task is to analyze the content of documents and classify them by orientation. It can identify a document's support for a given strand of online opinion, and so plays an important role in maintaining the security of online public opinion.

This thesis discusses the core techniques used in document orientation classification and analyzes their advantages and disadvantages. It also provides the system's logical documentation. Following the principles of accuracy and convenience, we implemented in C# a Chinese word segmentation (participle) module based on a four-word hash-table dictionary; its segmentation accuracy exceeds 90 percent. We use commendatory and derogatory term extraction (character pick-up), a vector space model, and an SVM to implement the system's orientation classification function; its classification accuracy exceeds 80 percent. Finally, we analyze the results of word segmentation and orientation classification, identify the system's shortcomings, and propose directions for future research.
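
The abstract does not say how the dictionary lookup or the polarity features work, so the following is only a minimal sketch of one common approach, not the thesis's implementation (which was written in C#). It assumes forward maximum matching against a hash-based dictionary, reads "four-word" as a cap of four characters per dictionary entry, and uses a tiny hypothetical dictionary and hypothetical polarity lexicons. The SVM step is omitted; vectors like the one produced here would be the classifier's input.

```python
# Illustrative sketch only. The matching strategy, the four-character cap,
# the dictionary, and the polarity lexicons are all assumptions.

MAX_WORD_LEN = 4  # assumption: "four-word" read as a four-character cap

# Hypothetical hash-set dictionary and polarity lexicons.
DICTIONARY = {"我们", "支持", "反对", "这个", "政策", "坚决"}
POSITIVE = {"支持"}
NEGATIVE = {"反对"}


def segment(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Forward maximum matching against a hash-set dictionary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def polarity_vector(tokens):
    """Count positive and negative lexicon hits as a 2-dimensional vector."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return [pos, neg]


tokens = segment("我们坚决支持这个政策")
features = polarity_vector(tokens)
```

A real system would use a far larger dictionary and a higher-dimensional feature space; the two-element vector here only shows how segmented tokens can be turned into numeric features for a vector space model.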
Keywords/Search Tags:Text Orientation, Text Model, Participle Dictionary, Chinese Participle Mechanism, Text Similarity, Character Pick-up, Text Classification