
Text Simplification Based On Recurrent Neural Network

Posted on: 2020-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: W B Lin
Full Text: PDF
GTID: 2428330572479153
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, information of all kinds (voice, images, text) is growing explosively, and every day we encounter text from many channels, such as news reports, blogs, and Weibo. The question is how to analyze and process this massive amount of information quickly and efficiently so that machines can understand it accurately; text simplification with high semantic retention is one feasible approach. This thesis studies short-text semantic simplification with three kinds of methods: the traditional recurrent neural network (RNN), the long short-term memory (LSTM) model, and a time recursive sequence model (TRSM).

The main research work is as follows:

1. The working principle of the traditional RNN, its training model and efficiency, the application of RNNs to text simplification, and the advantages and disadvantages reported in the related literature are surveyed.

2. The traditional RNN, LSTM, seq2seq, and TRSM models are each built, and the experimental principles, training methods, and advantages and disadvantages of each model are analyzed. The models are compared and combined to obtain the most efficient model for the short-text semantic simplification task.

3. To address the vanishing- and exploding-gradient problems of the traditional RNN training algorithm, the time recursive sequence model (TRSM) is proposed by combining the RNN-based LSTM and seq2seq models, so that inputs with relatively long intervals and delays can be handled. The models are then trained on a Chinese microblog corpus with the back-propagation-through-time (BPTT) algorithm.

4. Three comparative experiments are set up, varying the original parameter values, the number of training cycles, and the learning rate, and the experimental results are analyzed.

The experimental results show that microblog text processed by the TRSM model is more concise and refined and better suited for extracting text semantics. It greatly reduces the amount of computation: the text reduction rate exceeds 60% and the semantic retention rate reaches 1.8, which simplifies the large amount of information a user has to deal with. The processed results can be applied to several key Chinese semantic processing tasks.
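To illustrate why the LSTM and the LSTM-based models compared above handle long intervals better than a plain RNN, the following is a minimal NumPy sketch of a single LSTM cell step. The dimensions, initialization, and toy input are illustrative assumptions, not the thesis's actual model configuration; the point is the gated, additive cell-state update that lets gradients survive long time lags.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. The forget/input/output gates control what
    the cell state discards, stores, and exposes; the additive update
    c = f*c_prev + i*g is what mitigates the vanishing gradients that
    plague plain RNN training."""
    z = W @ np.concatenate([x, h_prev]) + b  # all four gate pre-activations
    H = h_prev.size
    f = sigmoid(z[0:H])        # forget gate
    i = sigmoid(z[H:2*H])      # input gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # additive cell-state update
    h = o * np.tanh(c)         # new hidden state
    return h, c

# Tiny demo: run a few steps over a random toy sequence.
rng = np.random.default_rng(0)
X_dim, H_dim = 3, 4
W = rng.standard_normal((4 * H_dim, X_dim + H_dim)) * 0.1
b = np.zeros(4 * H_dim)
h, c = np.zeros(H_dim), np.zeros(H_dim)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(X_dim), h, c, W, b)
```

In BPTT training, as used in research item 3 above, the error is propagated backward through these same per-step equations across the unrolled sequence.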
Keywords/Search Tags:short text information, text simplification, TRSM(Time Recursive Sequence Model), LSTM(Long Short-Term Memory) model, BPTT (Back Propagation Through Time) back propagation algorithm, Recurrent neural network