
Research And Application Of Automatic Text Categorization Based On Content Management

Posted on: 2010-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360275955780
Subject: Management Science and Engineering
Abstract/Summary:
With the development of the Internet and information technology, human society has entered the information age. In the process of informatization, enterprises have produced large amounts of data and information resources. This vast body of unstructured content needs to be managed just as structured data is. Enterprises urgently need a method to organize, manage, and use all of this content scientifically and efficiently, in order to meet the growing variety of business applications, improve the efficiency of information resource management, and enhance their competitiveness. Content management came into being under these circumstances. Overseas analysis shows that content management will become the next hotspot of software market competition. Although the prospects for content management are optimistic, many technical problems remain to be overcome. For example, in the content distribution stage, a classification system is usually needed so that users can find and browse content easily. From this point of view, taking unstructured Chinese text as the research object, this thesis studies the application of automatic categorization technology in content management systems. First, domestic and overseas research and its potential problems are summarized in detail, and on that basis the scope and objectives of the thesis are presented. Second, following the process of automatic text categorization (ATC), several key technologies of each step are studied further. Third, based on those key technologies, a series of experiments is designed to compare the classification performance of different categorization algorithms and feature selection algorithms, and to determine the values of parameters in some of the algorithms. In addition, a method for optimizing the training set is proposed and then verified by experiment. Finally, against the background of a practical project, an ATC system prototype based on a content management system is designed, and the experimental conclusions are applied to the actual project. Based on the Passenger Transportation Security Supervision System project and its data, the thesis studies how to solve the categorization problem in the content management system using the optimal algorithms verified by the experiments. The categorization results are then presented; they are of great practical value.
Keywords/Search Tags: Content Management, Automatic Text Categorization, System Prototype