Research On Multi-label Text Classification Methods Based On Rough Sets

Posted on:2017-04-24

Degree:Master

Type:Thesis

Country:China

Candidate:J Zhang

Full Text:PDF

GTID:2348330512451230

Subject:Computer software and theory

Abstract/Summary:

With the widespread use of e-commerce platforms and social media sites, a large volume of reviews describing product performance and consumer experience has accumulated on the Internet. These data reflect users' consumption behavior patterns and the limitations of the services that merchants provide. Analyzing and exploring such data has practical significance for understanding user consumption behavior, supporting e-business decision-making, and improving marketing strategy. Text mining is an important branch of data mining, but traditional single-label supervised learning methods struggle to meet the demands of diverse text information processing. For text mining, it is therefore important to study multi-label text classification methods and to apply multi-label learning appropriately to all kinds of text data. As an effective tool for handling uncertain information, rough set theory has produced many research results in classification rule learning and attribute reduction. Targeting two practical applications, web document classification and aspect mining of reviews, we study multi-label text classification methods based on rough set theory. The main research contents and conclusions are as follows:

(1) Building and analysis of the experimental corpus for multi-label text
We select a large number of web page documents and automobile product reviews as the experimental corpus. After the corpus is processed with text mining methods, we build Chinese multi-label text datasets. At the same time, to address the problem of identifying multiple performance aspects in review text, we propose an identification framework based on multi-label learning.

(2) Multi-label text classification based on a robust fuzzy rough set model
Because multi-label data are uncertain and contain noise, a novel multi-label robust fuzzy rough classification model is proposed. The model extends the k-mean robust statistics fuzzy rough classification model, which is used to solve the single-label classification problem. First, for each unlabeled instance, the membership with respect to each label is obtained by similarity measures. Second, the degree of correlation is defined according to the membership. Finally, an appropriate threshold is given to separate the correlated labels from the uncorrelated ones. Experimental results on real multi-label text datasets indicate that the proposed model performs well in multi-label classification of web page text.

(3) Chained multi-aspect recognition method with label-specific features based on rough sets
To evaluate the multiple performance aspects that appear in automotive product reviews, we propose a chained multi-aspect recognition method with label-specific features based on rough sets. The method addresses the multi-aspect identification problem by extracting label-specific features for every label and building a classifier chain over those features. On the Sina car review corpus, compared with a variety of multi-label classification methods, the subset accuracy of the proposed method reaches up to 95%. Hence, our method is feasible for recognizing the multiple aspects of automobile reviews.
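
The three steps described in part (2) (a per-label membership from similarity measures, a degree of correlation, and a threshold) can be illustrated with a minimal Python sketch. This is not the model from the thesis, whose exact definitions are not given in the abstract. The cosine similarity, the use of the mean of the k largest similarities as a noise-tolerant membership, the ratio used as the degree of correlation, the default threshold of 0.5, and all function names are assumptions made for illustration. A helper for subset accuracy, the exact-match metric reported in part (3), is included as well.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def k_mean_top(values, k):
    """Mean of the k largest values; averaging damps single noisy neighbors."""
    top = sorted(values, reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def predict_labels(x, train_X, train_Y, labels, k=2, threshold=0.5):
    """Assign every label whose (assumed) degree of correlation exceeds threshold."""
    predicted = set()
    for label in labels:
        # Step 1: membership from similarity to instances with / without the label.
        pos = [cosine(x, xi) for xi, yi in zip(train_X, train_Y) if label in yi]
        neg = [cosine(x, xi) for xi, yi in zip(train_X, train_Y) if label not in yi]
        member = k_mean_top(pos, k)
        non_member = k_mean_top(neg, k)
        # Step 2: degree of correlation (assumed form: normalized membership).
        total = member + non_member
        degree = member / total if total else 0.0
        # Step 3: threshold separates correlated from uncorrelated labels.
        if degree > threshold:
            predicted.add(label)
    return predicted


def subset_accuracy(true_sets, pred_sets):
    """Fraction of instances whose predicted label set matches exactly."""
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return exact / len(true_sets)


if __name__ == "__main__":
    # Toy data invented for the example; each row is a tiny term-count vector.
    train_X = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]
    train_Y = [{"price"}, {"price", "comfort"}, {"engine"}, {"comfort", "engine"}]
    labels = ["price", "comfort", "engine"]
    print(predict_labels([1, 0, 0], train_X, train_Y, labels))
```

Because each label is thresholded independently, an instance can receive zero, one, or several labels, which is what distinguishes this setting from single-label classification. Subset accuracy is a strict metric: a prediction counts as correct only when the whole label set matches.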

Keywords/Search Tags:

Multi-label text categorization, Feature selection, Rough sets, Aspect recognition of product

Related items

1	The Research On Label Enhancement-Based Multi-Label Feature Selection Algorithm With Fuzzy Rough Sets
2	Research On Text Categorization Based On Support Vector Machine
3	Research On Feature Selection Based On F-neighborhood Rough Sets
4	Research On The Feature Selection Technique For Text Categorization
5	Research Of Categorization Algorithm Based On Rough Sets Theory Attributes Reduction
6	Research On Feature Selection With Fuzzy Rough Sets
7	Based On Rough Set Text Automatic Classification Study
8	Feature Selection Algorithm For Multi-label Learning
9	Research Of Attributes Reduction And Samples Reducing Algorithm Based On Neighborhood Rough Sets And Application In Text Categorization
10	The Research And Application Of Rough Set In Text Categorization System