Construction Of Chinese Theme-Rheme Annotation Corpus And Study Of Automatic Analysis Of Chinese Theme-Rheme Structure

Posted on: 2018-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: D W Tian
GTID: 2348330515960093
Subject: Computer technology
Abstract/Summary:
In recent years, discourse analysis has attracted growing attention in natural language processing. However, corpus resources for Chinese discourse analysis remain relatively scarce, and those that are available are mainly based on Rhetorical Structure Theory (RST), which focuses on rhetorical and logical-semantic relations in discourse and cannot meet the full range of needs of discourse analysis. At the same time, the functional information in discourse offers valuable guidance for discourse analysis. We therefore constructed a Chinese Theme-Rheme corpus by introducing Halliday's systemic functional grammar, and carried out related research on automatic recognition. The work is summarized as follows.

First, we established the corpus annotation framework and the corresponding annotation specification, drawing on Halliday's systemic functional grammar and the theory of thematic progression patterns. On the basis of this framework and specification, we developed a Chinese Theme-Rheme annotation tool to simplify the annotation work. We chose 525 news texts from OntoNotes as the raw material and organized annotators to carry out the annotation. After the annotation was completed, we checked the annotated results and retained the correct ones. Finally, we performed a detailed statistical analysis of the results, which supports the automatic recognition work and helps explain the relevant phenomena in Chinese.

Second, in order to expand the corpus, we studied the automatic recognition and analysis of Chinese Theme-Rheme structure, covering both the recognition of themes and rhemes in Chinese texts and the recognition of thematic progression patterns. To identify themes and rhemes, we designed a model composed of three annotation elements using conditional random fields, which achieved good results. To automatically identify phenomena such as anaphora and ellipsis, which account for the largest proportion of the thematic progression patterns in the texts, we treated the task as a classification problem and ran experiments with a multilayer feedforward neural network, which also gave good results.
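As an illustration of how theme and rheme recognition can be framed as sequence labeling (the usual input format for a conditional random field), the sketch below converts annotated token spans into one label per token and back. It is not the thesis's implementation: the abstract does not say what the three annotation elements are, so the labels `T`, `R` and `O`, the helper names `spans_to_labels` and `labels_to_spans`, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical label set (an assumption, not the thesis's scheme):
# T = theme token, R = rheme token, O = anything else (e.g. punctuation).
THEME, RHEME, OTHER = "T", "R", "O"


def spans_to_labels(n_tokens: int,
                    theme: Tuple[int, int],
                    rheme: Tuple[int, int]) -> List[str]:
    """Turn half-open token spans [start, end) into one label per token."""
    labels = [OTHER] * n_tokens
    for start, end, tag in ((*theme, THEME), (*rheme, RHEME)):
        for i in range(start, end):
            labels[i] = tag
    return labels


def labels_to_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Group runs of identical labels back into (tag, start, end) spans."""
    spans = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of input or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != OTHER:
                spans.append((labels[start], start, i))
            start = i
    return spans


# Made-up example clause, pre-segmented into tokens ("We chose news texts.").
tokens = ["我们", "选择", "了", "新闻", "文本", "。"]
labels = spans_to_labels(len(tokens), theme=(0, 1), rheme=(1, 5))
```

A sequence model such as a CRF would then be trained to predict `labels` from features of `tokens`; `labels_to_spans` recovers the theme and rheme boundaries from its predictions.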
Keywords/Search Tags:systemic functional grammar, thematic progression pattern, Chinese Theme-Rheme annotation, corpus construction, automatic recognition and analysis