Text Classification Based On Deep Learning

Posted on: 2022-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Guo
Full Text: PDF
GTID: 2518306722458904
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, data fills every corner of it, and text is the most common form of that data. Managing this massive volume of text effectively and extracting useful information from it is of great significance, not only for monitoring public opinion but also for analyzing personal preferences. Text classification is one of the key technologies for processing text data. It automatically assigns a label describing the meaning of a word sequence, which greatly reduces the cost of human labor. Constructing text features effectively is the most important step in text classification. Most existing research on this problem focuses on the feature representation of words and on their contextual meaning. On the one hand, the functional relationship between words and text features still needs to be explored; on the other hand, the relationship among text features themselves has not been clarified. The work of this paper can therefore be summarized as follows:

(1) This paper proposes a text classification method based on bi-dynamic routing with labels. Using relative position coding and an attention mechanism, the relative position vector and the semantic vector of each word are embedded into a single vector, and the roles of word semantics and word order in constructing text features are analyzed. The output is fed into the bi-dynamic routing algorithm, in which the bidirectional design modifies traditional routing to handle both relations between words and documents. The resulting text capsule vector is used to represent text features. Because using the modulus length of a text capsule to represent its existence probability ignores the capsule's underlying distribution, labels are embedded into the same space to constrain the distribution of the text capsules. The results show that the model is competitive, and they also clarify the roles of word order and semantics in feature construction. Except at zero-padded positions, every position carries far more semantic information than word-order information. The construction of text features therefore depends mainly on semantics, and word order has little influence on them. At zero-padded positions, word-order information outweighs semantic information; that is, the position information can be used to distinguish zero padding from non-zero padding.

(2) To explore the similarity between texts, this paper proposes a semi-supervised text classification algorithm based on text-level interaction. Each text is encoded into a fixed feature vector using the BERT model, and a graph of relationships between texts is constructed from the similarity between text features and a threshold. Each node in the graph represents a text, and each edge represents a semantic similarity relationship between two texts. The text classification task is thereby transformed into a graph node classification task. On this basis, we explore whether documents with similar semantics can enhance one another's representations, that is, whether the neighboring nodes in the graph can be used to represent the features of a target node. Graph convolution is therefore used to learn the feature representation of each node. Considering the scarcity of labeled data, a semi-supervised learning framework is designed that exploits the propagation of node information through the graph. Experiments demonstrate the effectiveness of the model.
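
The graph construction and graph convolution steps in contribution (2) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the abstract does not state the similarity measure, the threshold value, or the graph convolution variant, so cosine similarity, the symmetric normalization with self-loops, and all function names (`build_similarity_graph`, `normalize_adjacency`, `gcn_layer`) are assumptions made for illustration. The small hand-written vectors stand in for BERT document embeddings.

```python
import numpy as np


def build_similarity_graph(features, threshold):
    """Connect two documents when their cosine similarity reaches the threshold.

    Assumption: cosine similarity; the abstract only says "similarity".
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero vectors
    similarity = unit @ unit.T
    adjacency = (similarity >= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # self-loops are added during normalization
    return adjacency


def normalize_adjacency(adjacency):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (an assumed GCN variant)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])
    degree = a_hat.sum(axis=1)  # always >= 1 because of the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, node_features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ node_features @ weights, 0.0)


# Toy stand-ins for document vectors (hypothetical values, not BERT output).
docs = np.array([[1.0, 0.0],
                 [0.9, 0.1],
                 [0.0, 1.0]])

adjacency = build_similarity_graph(docs, threshold=0.8)
a_norm = normalize_adjacency(adjacency)
hidden = gcn_layer(a_norm, docs, np.eye(2))  # identity weights for readability
print(adjacency)
print(hidden)
```

In this toy example the first two documents are similar enough to be linked, so after one layer each takes the average of their two vectors, while the third document, which has no neighbors, keeps its own features. In a semi-supervised setting, the classification loss would be computed only on the labeled nodes while every node takes part in the propagation.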
Keywords/Search Tags:text classification, deep learning, graph neural network