
Constituency Parsing Based On Deep Neural Network

Posted on: 2022-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Dai
Full Text: PDF
GTID: 2518306323967039
Subject: Data Science
Abstract/Summary:
Constituency parsing is an important problem with a wide range of applications in linguistics and natural language processing. It aims to extract a syntactic parse tree from a given sentence according to a phrase structure grammar. Recently, applying the deep neural network encoder-decoder framework to constituency parsing has become a research hotspot. The best-performing neural constituency parsers to date use a multi-head self-attention encoder and a global or local decoder to generate well-formed tree structures. However, two problems remain in the encoding and decoding process: 1. the multi-head self-attention encoder is redundant and loses temporal order and positional information; 2. local constituency parsing models suffer from poor robustness and have weak resistance to input noise.

In this thesis, we study constituency parsing based on deep neural networks. In view of the above two problems, our main work can be summarized in the following two points:

1. Constituency parsing with a multi-positional contextual self-attention encoder. Specifically, we propose a multi-dimensional self-attention encoder for constituency parsing, with novel positional masks that encode the temporal order and positional context of the input text (the masking idea is sketched below). Without a stacked structure, our encoder outperforms the more complex Transformer encoder on both prediction quality and time efficiency in constituency parsing. After training on the Penn Treebank, our parser achieves 95.91 F1 on the WSJ test set, and it also surpasses the previous best results on 7 of the 8 languages in the SPMRL dataset. The proposed encoder also achieves state-of-the-art results on text classification.

2. Enhancing the robustness of local constituency parsing with adversarial training. Existing constituency parsing models enhance robustness by adding only random noise to the training samples, obtained by perturbing the label-structure space. In this thesis, we additionally introduce system noise through adversarial training. We set up a general perturbation system with a (p, q)-perturbation strategy, in which random noise and system noise perturb the training samples with probabilities p and q, respectively (a training-step sketch follows below). By incorporating the (p, q)-perturbation strategy into a self-attention encoder and a greedy top-down decoder, we propose a novel Adversarial Constituency Parsing (ACP) model to learn syntactic structures. After training on the Penn Treebank, our ACP model achieves competitive parsing performance on the WSJ test set. Moreover, to analyze robustness, we construct a small dataset and perturb it; the results show that the robustness of the ACP model is indeed enhanced.
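The abstract does not give implementation details for the positional masks. The following minimal PyTorch sketch only illustrates the general idea of injecting word order through attention masks rather than through stacked layers or positional embeddings: a past-only pass, a future-only pass, and an unrestricted pass are concatenated. All function names and shapes here are assumptions for illustration, not the thesis's actual code.

```python
# Minimal sketch (assumptions, not the thesis's code): encoding temporal
# order with directional attention masks instead of stacked layers or
# positional embeddings. Learned Q/K/V projections are omitted for brevity.
import torch
import torch.nn.functional as F

def masked_self_attention(x, mask):
    """Scaled dot-product self-attention over x of shape (seq, dim);
    position pairs where mask is False are excluded from attention."""
    d = x.size(-1)
    scores = (x @ x.transpose(-2, -1)) / d ** 0.5      # (seq, seq)
    scores = scores.masked_fill(~mask, float("-inf"))  # block masked pairs
    return F.softmax(scores, dim=-1) @ x

def multi_positional_encode(x):
    """Run a past-only, a future-only, and an unrestricted attention pass,
    then concatenate them, so word order is explicit in the output."""
    n = x.size(0)
    forward = torch.tril(torch.ones(n, n)).bool()   # position i sees j <= i
    backward = torch.triu(torch.ones(n, n)).bool()  # position i sees j >= i
    full = torch.ones(n, n).bool()                  # i sees every j
    passes = [masked_self_attention(x, m) for m in (forward, backward, full)]
    return torch.cat(passes, dim=-1)                # (seq, 3 * dim)

x = torch.randn(7, 64)                    # 7 tokens, 64-dim embeddings
print(multi_positional_encode(x).shape)   # torch.Size([7, 192])
```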
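The abstract likewise leaves the form of the system noise open; a common choice for adversarial training on text is a one-step, FGSM-style perturbation of the input embeddings, and that assumption is used below. The sketch shows how a (p, q)-perturbation schedule could wrap one training loss; for simplicity both noise types act on the embeddings here, whereas the thesis applies random noise in the label-structure space. All names (pq_perturbation_step, eps) are hypothetical.

```python
# Minimal sketch of a (p, q)-perturbation training loss. With probability p
# the sample gets random noise, with probability q it gets system
# (adversarial) noise; the FGSM-style step is an assumed choice, not
# necessarily the thesis's exact method.
import random
import torch

def pq_perturbation_step(model, loss_fn, embeds, target, p=0.3, q=0.3, eps=0.01):
    """Return the training loss on a (possibly perturbed) copy of embeds,
    a (seq, dim) tensor of token embeddings; target holds the gold labels."""
    r = random.random()
    if r < p:
        # Random noise: isotropic Gaussian perturbation of the embeddings.
        delta = eps * torch.randn_like(embeds)
    elif r < p + q:
        # System noise: one FGSM-style step along the loss gradient.
        embeds_adv = embeds.detach().requires_grad_(True)
        adv_loss = loss_fn(model(embeds_adv), target)
        grad, = torch.autograd.grad(adv_loss, embeds_adv)
        delta = eps * grad.sign()
    else:
        # With probability 1 - p - q the sample stays clean.
        delta = torch.zeros_like(embeds)
    return loss_fn(model(embeds + delta.detach()), target)
```

Setting q = 0 recovers purely random perturbation; p and q would be tuned on development data.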
Keywords/Search Tags: Constituency Parsing, Self-Attention Mechanism, Deep Neural Network, Robustness, Adversarial Training