
Research On Unsupervised Parsing Based On Transformer Neural Network

Posted on: 2022-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: E L Liu
Full Text: PDF
GTID: 2518306731472484
Subject: Computer technology
Abstract/Summary:
Information technology is growing at an unprecedented rate. While people enjoy the convenience it brings, vast amounts of data are produced every day, and extracting useful information from this largely redundant mass has become an important problem. Among these data, text is both the most common form of information and the hardest for computers to understand, which makes natural language processing a key and highly challenging research direction.

Within natural language processing, constituency parsing is one of the most important fundamental tasks and is closely tied to a computer's ability to understand human language. It is widely used in downstream tasks such as question answering and sentiment analysis, so accurate constituency parsing is of great significance to text processing. Constituency parsing aims to extract from a sentence a constituency-based parse tree that represents its syntactic structure according to a phrase structure grammar. Most current parsing methods rely on supervised data and improve performance by refining the decoding model. In contrast, this thesis adds a module that implicitly models syntactic structure to the Transformer, so that sentence structure information enters the encoding process, and it improves the training procedure with a word-order restoration objective, giving the model better adaptability and better results. The main work is as follows:

(1) Following the idea of injecting syntactic information into the encoding process, the unsupervised encoding model is improved: a new module is added before the attention layer to compute prior knowledge about the syntactic constituents of a sentence, namely the degree to which a contiguous set of two or more words can form a phrase. This structural attention layer amounts to implicit modeling of sentence structure; in essence, it is a module dedicated to learning syntactic structure information, so that the structural information of both phrases and the whole sentence is taken into account when computing similarity between words. In addition, a hierarchy constraint is imposed between layers, so that the hierarchical information in a sentence is learned automatically during encoding. A minimal sketch of such a layer follows.
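The abstract does not give the exact formulation of the structural attention layer, so the following PyTorch code is only a minimal sketch of the general idea under stated assumptions: a learned "break probability" is predicted for each gap between adjacent words, the product of (1 - break) over the gaps separating two positions serves as the prior that they lie inside the same phrase, and the log of that prior is added to the attention scores as a bias. The module name, the gap scorer, and the way the prior enters attention are all illustrative assumptions, not the author's actual design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuralAttention(nn.Module):
    """Hypothetical sketch: self-attention biased by a soft constituent prior.

    A "break probability" is predicted for each gap between adjacent words;
    the prior that positions i and j fall inside the same phrase is the
    product of (1 - break) over the gaps between them, so attention across a
    likely phrase boundary is damped.
    """

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.break_scorer = nn.Linear(2 * d_model, 1)  # scores each adjacent gap

    def constituent_prior(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); pair up each word with its right neighbour
        pair = torch.cat([x[:, :-1], x[:, 1:]], dim=-1)               # (b, n-1, 2d)
        p_break = torch.sigmoid(self.break_scorer(pair)).squeeze(-1)  # (b, n-1)
        log_keep = torch.log1p(-p_break + 1e-6)                       # log(1 - p_break)
        # cumulative "no boundary" mass; prior(i, j) = prod of (1 - p_break)
        # over the gaps between i and j, computed here in log space
        cum = F.pad(torch.cumsum(log_keep, dim=1), (1, 0))            # (b, n)
        log_prior = -(cum.unsqueeze(1) - cum.unsqueeze(2)).abs()      # (b, n, n)
        return log_prior  # additive attention bias, <= 0 everywhere

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bias = self.constituent_prior(x)
        # nn.MultiheadAttention accepts a float mask of shape (b*heads, n, n)
        bias = bias.repeat_interleave(self.attn.num_heads, dim=0)
        out, _ = self.attn(x, x, x, attn_mask=bias)
        return out

For example, StructuralAttention(256, 8) applied to a (2, 12, 256) tensor returns a (2, 12, 256) tensor, and a phrase structure can afterwards be read off the learned break probabilities by thresholding or greedy splitting. The additive-bias route is only one plausible reading; a gating or masking formulation would serve the same purpose.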
(2) The original masked language model training objective is also improved. Because predicting masked words correlates only weakly with the downstream unsupervised parsing task, a word-order restoration objective that is more closely tied to sentence structure is added: before training, certain phrases in a sentence are shuffled, and the model must restore their correct order. This improves the robustness of the model and, while improving parsing performance, also reduces the perplexity of the language model. A sketch of this objective appears after the evaluation summary below.

A series of experiments is carried out on Chinese Treebank 9.0, an authoritative publicly available dataset in this field, using the evaluation metrics widely adopted in unsupervised parsing: perplexity, precision, recall, and F1. The experimental results show that the proposed method performs well compared with common methods in this field and effectively improves parsing.
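As with the attention module, the abstract does not specify how phrases are chosen or how restoration is supervised, so the sketch below is a hedged stand-in: a random span is permuted (where the thesis would shuffle predicted phrases), and a denoising loss asks the model to output the original token at every position. shuffle_phrase, restoration_loss, and the random-span choice are all illustrative assumptions, not the thesis's exact procedure.

import random
import torch
import torch.nn as nn

def shuffle_phrase(tokens: list[int], span_len: int = 3) -> tuple[list[int], list[int]]:
    """Word-order corruption: permute one span of token ids.

    A random span stands in for a predicted phrase here. Returns the
    corrupted sequence and the original one, which serves as the
    restoration target.
    """
    tokens = list(tokens)
    target = list(tokens)
    if len(tokens) > span_len:
        start = random.randrange(len(tokens) - span_len)
        span = tokens[start:start + span_len]
        random.shuffle(span)
        tokens[start:start + span_len] = span
    return tokens, target

def restoration_loss(encoder: nn.Module, lm_head: nn.Module,
                     corrupted: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Denoising objective: predict the original token at every position.

    Assumes `encoder` embeds token ids and returns (batch, seq, d_model)
    hidden states, and `lm_head` maps them to vocabulary logits.
    """
    hidden = encoder(corrupted)
    logits = lm_head(hidden)  # (batch, seq, vocab)
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), target.reshape(-1))

Compared with masked-word prediction, restoring a shuffled phrase forces the model to reason about which words belong together as a unit, which is presumably why the thesis finds it better aligned with the parsing task.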
Keywords/Search Tags: Natural Language Processing, Language model, Transformer neural network, constituency parsing