
Coreference Resolution Research In Uyghur Pronouns Based On Deep Learning

Posted on: 2018-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: D B Li
Full Text: PDF
GTID: 2348330533456156
Subject: Software engineering
Abstract/Summary:
In natural language, reference is a common phenomenon: it simplifies expression, but it poses great challenges for machine understanding of natural language. Pronoun anaphora resolution is one of the active research topics in natural language processing; its purpose is to use context information to determine what the pronouns in a sentence refer to. With the rapid development of the Internet, humanity has entered a fast-developing information age, and a large amount of Uyghur text is produced on the web. Analyzing this text has great research value and provides theoretical and technical support for Uyghur intelligence analysis. Existing pronoun anaphora resolution methods rely on syntactic and lexical-semantic information and rarely consider deep semantic associations. At the same time, compared with English and Chinese, there has been relatively little research on anaphora resolution in Uyghur. This thesis focuses on anaphora resolution for Uyghur pronouns. The specific work is as follows.

Firstly, by studying corpora previously used for anaphora resolution, taking the characteristics of the Uyghur language into account, and working under the guidance of Uyghur linguists, we determined the data source, the subject matter, and the annotation rules for a Uyghur corpus, and then constructed a Uyghur anaphora resolution corpus.

Secondly, based on the constructed corpus, we studied anaphora resolution for Uyghur personal pronouns. We first extracted the shallow semantic features of the expression pairs, including gender, number (singular or plural), part of speech, position and distance, and Uyghur grammar, and used them to construct the feature vector of each expression pair. Then, exploiting the strength of deep belief networks on complex classification problems, we built an anaphora resolution platform for Uyghur personal pronouns. Finally, experiments showed that, combined with rules over the expression pairs, this platform can automatically extract deep semantic information from the expression pairs and improve anaphora resolution performance.

Thirdly, we explore the factors that influence Uyghur pronoun anaphora resolution, in particular anaphoricity determination. In addition to the coreference element itself, the study also takes context information into account. For the coreference element, we first extract the shallow explicit semantic information of the element to be resolved, such as part of speech, distance, and grammar, and then construct its shallow semantic feature representation. For the context information, to keep the dimensions consistent, a set of vectors with the same dimension as the shallow explicit semantic information is randomly generated. We then use the word2vec tool to obtain the deep implicit semantic features of the coreference element and of its context, respectively, and combine the two representations as the input data. Finally, exploiting the strength of convolutional neural networks in processing local information, we propose a neural-network-based model for pronoun anaphoricity determination. In experiments, our method improves significantly over earlier approaches that use shallow semantic information alone and ignore context information.
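To make the feature construction described in the abstract more concrete, here is a minimal Python sketch of how a shallow feature vector for one expression pair might be built and then joined with an embedding. It is an illustration under stated assumptions, not the thesis's implementation: the `Mention` fields, the tag set, the 0.5 score for unknown values, the feature order, and the toy embedding values are all hypothetical, and the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mention:
    """A candidate antecedent or a pronoun (all fields are illustrative)."""
    text: str
    pos: str           # part-of-speech tag, e.g. "NOUN", "PROPN", "PRON"
    gender: str        # "m", "f", or "unknown"
    number: str        # "sg", "pl", or "unknown"
    sentence_idx: int  # index of the sentence containing the mention


# Hypothetical tag inventory used for the one-hot part-of-speech feature.
POS_TAGS = ["NOUN", "PROPN", "PRON"]


def agreement(a: str, b: str) -> float:
    """Return 1.0 on a match, 0.0 on a mismatch, 0.5 if either is unknown."""
    if "unknown" in (a, b):
        return 0.5
    return 1.0 if a == b else 0.0


def pair_features(antecedent: Mention, pronoun: Mention) -> List[float]:
    """Build a shallow feature vector for one (antecedent, pronoun) pair."""
    pos_one_hot = [1.0 if antecedent.pos == tag else 0.0 for tag in POS_TAGS]
    sentence_distance = float(pronoun.sentence_idx - antecedent.sentence_idx)
    same_sentence = 1.0 if sentence_distance == 0.0 else 0.0
    return [
        agreement(antecedent.gender, pronoun.gender),
        agreement(antecedent.number, pronoun.number),
        *pos_one_hot,
        sentence_distance,
        same_sentence,
    ]


def build_input(shallow: List[float], embedding: List[float]) -> List[float]:
    """Concatenate shallow features with an embedding (e.g. from word2vec)."""
    return shallow + embedding


# Illustrative usage with made-up values.
candidate = Mention("Alim", "PROPN", "m", "sg", sentence_idx=0)
pronoun = Mention("u", "PRON", "unknown", "sg", sentence_idx=1)
shallow = pair_features(candidate, pronoun)
model_input = build_input(shallow, [0.1, -0.2, 0.3])  # toy 3-dim embedding
print(shallow)
print(len(model_input))
```

In a setup like the one the abstract describes, vectors of this kind would be the input to a classifier (a deep belief network or a convolutional neural network in the thesis); the classifier itself is outside the scope of this sketch.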
Keywords/Search Tags:Coreference resolution, Uyghur, Pronouns, Anaphoricity Determination