
Research On Chinese Dropped Pronoun Recovery Based On Background Semantic Information

Posted on: 2021-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Tong
Full Text: PDF
GTID: 2428330632463001
Subject: Information and Communication Engineering
Abstract/Summary:
With the continuous development of deep learning, many natural language processing (NLP) tasks, diverse and flexible in form, have been proposed, and many of them are expected to be solved with neural networks. Dropped pronoun recovery, which aims to detect the type of pronoun dropped before each token, plays a vital role in applications such as machine translation and information extraction. Traditional methods address the problem by constructing classifiers through feature engineering, and deep neural networks have also been applied to the task. Although promising improvements have been observed, these methods recover dropped pronouns from the limited context of a small window. Such an approach restricts semantic information to the word level and cannot capture long-range dependencies in the context. Based on an understanding of the background semantic knowledge of dropped pronouns, we put forward two different models for modeling context semantics. We also interpret the behavior of the models during training through attention heat maps. Experimental results show that our models perform well compared with traditional models. The main contents of this paper are as follows:

(1) This paper presents a topic model based on a neural network. An unsupervised topic model can summarize and extract semantic information well. At the same time, the topic words that the topic model extracts from the whole corpus have a high probability of expressing the referent information of dropped pronouns. The validity of the model is verified on a Chinese SMS data set, and the results of the multi-dimensional attention mechanism are visualized and interpretable.

(2) We propose a knowledge-enriched neural attention framework for Chinese dropped pronoun recovery. A structured attention mechanism is used to capture the semantics of dropped pronoun (DP) referents from the wider context. External knowledge, which consists of a knowledge base and a hierarchical pronoun-category assumption, is also incorporated into the model to provide pronoun classification information for the referred entity and the degree of contextual dependency. Results on three different conversational genres show that our approach achieves a convincing improvement over the current state of the art. Moreover, ablation experiments verify that adding external deictic word knowledge and hierarchical pronoun classification criteria improves the results. Finally, the distribution of the attention heat maps confirms how dropped pronouns refer to their context during model training.

(3) Building on the two neural-network frameworks, we find a way to add common-sense information. The characteristic information of pronouns from the topic model, the referent information of pronouns, and the context-dependency information of pronouns, each processed by a different method in our research, lead the final models to produce better outputs...
Keywords/Search Tags: dropped pronoun recovery, semantic background information, topic model, hierarchical attention mechanism, external knowledge
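
The abstract frames dropped pronoun recovery as predicting, for each token, the type of pronoun dropped before it. The sketch below illustrates only that input/output format and is not the thesis's model: the example sentence, the `NONE` label, and the `restore_pronouns` helper are hypothetical names chosen for illustration, and the labels are written by hand where a trained classifier would supply them.

```python
from typing import List

# Hypothetical label meaning "no pronoun was dropped before this token".
NONE = "NONE"


def restore_pronouns(tokens: List[str], labels: List[str]) -> List[str]:
    """Re-insert the predicted pronoun (in parentheses) before each token.

    labels[i] is the pronoun type assumed to be dropped before tokens[i],
    or NONE if nothing was dropped there.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    restored: List[str] = []
    for token, label in zip(tokens, labels):
        if label != NONE:
            restored.append(f"({label})")
        restored.append(token)
    return restored


# Illustrative utterance: "吃 了 吗" ("eaten yet?") omits the subject "你" (you).
tokens = ["吃", "了", "吗"]
labels = ["你", NONE, NONE]  # hand-written stand-in for model predictions
print(" ".join(restore_pronouns(tokens, labels)))
```

In this framing, the models described in the abstract would be responsible for producing `labels` from the utterance and its wider conversational context; the helper only shows how such per-token predictions map back to a recovered sentence.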