Application Of Weak Supervised Learning On Text Classification

Posted on:2021-04-24

Degree:Master

Type:Thesis

Country:China

Candidate:S S Liu

Full Text:PDF

GTID:2428330632963026

Subject:Information and Communication Engineering

Abstract/Summary:

Text classification is a fundamental task in natural language processing. As the volume of online information grows rapidly, text classification helps reduce information clutter and supports the accurate retrieval and use of information. Neural network models have achieved good results on text classification and are widely used, but the lack of training data remains the key bottleneck for applying them in many practical scenarios. Training a text classification model that performs well and generalizes well usually requires a labeled corpus on the order of millions of examples. Collecting such data would require domain experts to read millions of documents and label them carefully using domain knowledge, which is too expensive to be practical. In addition, researchers often have only a small amount of labeled data. How to make effective use of unlabeled data for text classification has therefore become an important research direction in natural language processing.

Given the current state of weakly supervised learning in text classification, this thesis explores two weakly supervised text classification methods, one based on an autoencoder and one based on collaborative training (co-training), each realized as its own model. The first model uses an autoencoder to learn from unlabeled data. During training, the hidden-layer neurons of the autoencoder compete with one another, which guides the autoencoder to focus on the features that are most informative for text classification, so that the model learns text features that are meaningful for classification. The second is a semi-supervised text classification approach based on collaborative training, which optimizes the semi-supervised text classification task through collaborative models and collaborative rules.

The experimental results show that the proposed methods can effectively utilize unlabeled data and improve the performance of the text classifier. Using weak supervision to address the scarcity of labeled data in text classification saves manpower and material resources, makes full use of unlabeled data, and greatly reduces the cost of manual labeling. Weak supervision can also be extended to other tasks, where it may offer reference and inspiration for other deep learning tasks, and it is of considerable value for addressing the scarcity of labeled data in deep learning.
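
The abstract does not say how the hidden-layer neurons compete. As an illustration only, the sketch below assumes a k-winners-take-all rule, one common way to implement such competition: for each input, only the k hidden activations with the largest magnitude are kept and the rest are set to zero. The function name `k_winners_take_all` and the parameter `k` are hypothetical and are not taken from the thesis.

```python
from typing import List


def k_winners_take_all(activations: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude hidden activations and zero the rest.

    Assumed competition rule for illustration; the thesis may use a
    different mechanism.
    """
    if k <= 0:
        return [0.0] * len(activations)
    if k >= len(activations):
        return list(activations)
    # Rank hidden units by descending magnitude. sorted() is stable,
    # so ties are broken in favour of the lower index.
    ranked = sorted(range(len(activations)), key=lambda i: -abs(activations[i]))
    winners = set(ranked[:k])
    # Losing units are silenced; winners pass through unchanged.
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]


hidden = [0.1, -0.9, 0.5, 0.3]
sparse_hidden = k_winners_take_all(hidden, k=2)
```

In an autoencoder, a step like this would sit between the encoder and the decoder, so that reconstruction has to rely on a small number of strongly responding hidden units.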

Keywords/Search Tags:

Text classification, Weak supervision, Autoencoder, Collaborative training

Related items

1	Research On Method Of Short Text Sentiment Classification Based On Weak Supervision
2	Research On Text Representation And Text Classification Method Based On Adversarial Training
3	Text Classification Based On Deep Transfer Learning
4	Research On Question Classification Based On Weak Supervision And Deep Learning
5	Research On The Application Of Text Classification And Clustering In Network Security Operation System
6	Research On Hybrid Recommendation Algorithm Enhancement By Stacked Denoising Autoencoder And Users' Labels
7	The Theory And Application Research Of Deep Autoencoder
8	The Research On Local Smooth Preserving Of Manifold Regularization Auto Encoder For Text Representation
9	Text Classification Based On TF-IDF Matrix And Caps Net
10	Research On Semi-supervised Image Classification Based On Collaborative Training And Active Learning