Research And Implementation Of Multi-label Classification In Chinese Text Based On Bayes Algorithm

Posted on: 2018-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Ke
Full Text: PDF
GTID: 2428330542968205
Subject: Software engineering
Abstract/Summary:
Information resources on the Internet are growing daily. This brings convenience, but it also makes the resources people need harder to find. Text categorization, an active research area in machine learning and data mining, has therefore attracted increasing attention. Multi-label text classification assigns each text to all of its relevant categories or topics quickly and accurately, which helps people locate the content they want within a mass of information and is of practical significance. This thesis studies multi-label classification of Chinese text. First, through a literature survey, it reviews the basic concepts and evaluation methods of multi-label classification, examines the ideas behind several common multi-label classification algorithms and compares their advantages and disadvantages, reviews the definition and general process of text categorization, and discusses feature selection for Chinese text. Second, building on that survey, it selects the traditional Bayes classification algorithm and adapts it to multi-label classification of Chinese text, giving a detailed design of the algorithm's steps, including feature selection for Chinese text and the specific adaptations made to the Bayes classification algorithm. Finally, it implements an experimental program in Java, with the help of third-party libraries and related experimental tools, and tests the program on test data sets. The algorithm is then evaluated by analyzing and comparing the test records in different ways. The experimental results show that the adapted Bayes algorithm is an efficient and feasible approach to multi-label classification of Chinese text, and it is expected to be applicable in related areas of information processing.
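The abstract does not say how the Bayes classifier is adapted to the multi-label setting. As an illustration only, the sketch below assumes one common adaptation, binary relevance: a separate yes/no multinomial naive Bayes decision per label, with Laplace smoothing. The class name `BinaryRelevanceNaiveBayes` and its methods are hypothetical and are not taken from the thesis. The input is assumed to be a list of tokens already produced by a Chinese word segmenter and a feature-selection step; the sketch does neither.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch: binary-relevance multinomial naive Bayes.
 * Each label gets its own "has label" vs "lacks label" decision.
 */
class BinaryRelevanceNaiveBayes {
    private int totalDocs = 0;
    private long totalTokens = 0;
    // Token counts over all training documents.
    private final Map<String, Integer> tokenCounts = new HashMap<>();
    // Number of training documents carrying each label.
    private final Map<String, Integer> labelDocs = new HashMap<>();
    // Number of tokens in the documents carrying each label.
    private final Map<String, Long> labelTokens = new HashMap<>();
    // Per-label token counts, over documents carrying that label.
    private final Map<String, Map<String, Integer>> labelTokenCounts = new HashMap<>();

    /** Adds one pre-segmented document and its label set to the counts. */
    void train(List<String> tokens, Set<String> labels) {
        totalDocs++;
        totalTokens += tokens.size();
        for (String t : tokens) {
            tokenCounts.merge(t, 1, Integer::sum);
        }
        for (String label : labels) {
            labelDocs.merge(label, 1, Integer::sum);
            labelTokens.merge(label, (long) tokens.size(), Long::sum);
            Map<String, Integer> counts =
                labelTokenCounts.computeIfAbsent(label, k -> new HashMap<>());
            for (String t : tokens) {
                counts.merge(t, 1, Integer::sum);
            }
        }
    }

    /** Returns every label whose positive class outscores its negative class. */
    Set<String> predict(List<String> tokens) {
        Set<String> predicted = new TreeSet<>();
        int vocabularySize = tokenCounts.size();
        for (String label : labelDocs.keySet()) {
            int posDocs = labelDocs.get(label);
            int negDocs = totalDocs - posDocs;
            long posTokens = labelTokens.get(label);
            long negTokens = totalTokens - posTokens;
            // Smoothed log priors for "has label" and "lacks label".
            double pos = Math.log((posDocs + 1.0) / (totalDocs + 2.0));
            double neg = Math.log((negDocs + 1.0) / (totalDocs + 2.0));
            Map<String, Integer> counts = labelTokenCounts.get(label);
            for (String t : tokens) {
                Integer inAll = tokenCounts.get(t);
                if (inAll == null) {
                    continue; // token never seen in training: ignore it
                }
                int inPos = counts.getOrDefault(t, 0);
                int inNeg = inAll - inPos;
                // Laplace-smoothed log likelihoods.
                pos += Math.log((inPos + 1.0) / (posTokens + vocabularySize));
                neg += Math.log((inNeg + 1.0) / (negTokens + vocabularySize));
            }
            if (pos > neg) {
                predicted.add(label);
            }
        }
        return predicted;
    }
}
```

Because the negative-class counts are derived as the global counts minus the positive counts, each document is counted once per label it carries rather than once per label in the label set. Other adaptations, such as label powerset or threshold-based ranking, would be equally consistent with the abstract.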
Keywords/Search Tags:Multi-Label Classification, Text Categorization, Bayes Algorithm