
Research On Hybrid Embedding In Text Classification

Posted on:2021-03-28Degree:MasterType:Thesis
Country:ChinaCandidate:J J LaiFull Text:PDF
GTID:2428330626961131Subject:Applied statistics
Abstract/Summary:
With the advent of the word2vec algorithm, Embedding technology has been widely applied to text classification, but Embedding also has shortcomings. In a traditional Embedding algorithm, each word is represented by a single embedded word vector, yet this vector cannot cover all of the word's meanings; this limitation stems from the nature of word2vec training. Therefore, to overcome this limitation to some extent and enrich the information carried by the Embedding word vectors, this paper proposes a hybrid Embedding method. The study uses Embeddings pre-trained on three different corpora to construct the hybrid Embedding. Because the corpora differ, the combined vectors partially overcome the single-representation problem and can, to a certain extent, capture multiple meanings of a word. There are two main ways to construct a hybrid Embedding. The first is to stack different pre-trained word vectors, increasing the effective information of the Embedding by expanding the dimensionality of the word vector. The second is linear fusion of the pre-trained word vectors, which increases the effective information of the Embedding to some extent by combining the values of the vectors. The experiments in this thesis focus on word-vector stacking: by stacking pre-trained Embeddings in pairs, under the same neural network architecture, hybrid Embedding effectively improves the model's evaluation metrics, F1-score and AUC.
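The two hybrid-Embedding operations described above can be sketched as follows. This is a minimal illustration with made-up 3-dimensional vectors; in practice the vectors would come from word2vec models pre-trained on different corpora, and the fusion weight `alpha` is an assumed hyperparameter, not a value from the thesis.

```python
import numpy as np

# Hypothetical pre-trained embeddings of the same word from two corpora
# (real word2vec vectors typically have 100-300 dimensions).
vec_a = np.array([0.1, 0.5, -0.3])   # embedding from corpus A
vec_b = np.array([0.7, -0.2, 0.4])   # embedding from corpus B

# Method 1: stacking -- concatenate the vectors, expanding dimension 3 -> 6
# so the hybrid Embedding carries information from both corpora.
stacked = np.concatenate([vec_a, vec_b])

# Method 2: linear fusion -- a weighted sum that keeps the original dimension
# while blending information from both corpora.
alpha = 0.5  # assumed fusion weight
fused = alpha * vec_a + (1 - alpha) * vec_b

print(stacked.shape)  # (6,)
print(fused.shape)    # (3,)
```

Stacking preserves all information from both embeddings at the cost of a larger input layer, while linear fusion keeps the model's input dimension unchanged but may blur corpus-specific distinctions.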
Keywords/Search Tags:text classification, word2vec, hybrid Embedding, neural network