
Research On Chinese Predicate Framework Based On Abstract Semantic Representation

Posted on: 2020-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: L Song
Full Text: PDF
GTID: 2435330578977117
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Semantic analysis is a key problem in natural language processing that urgently needs a breakthrough. When the meaning of a sentence is analyzed, the semantic relations contained in the event frame of the predicate form the backbone of the sentence structure. The study of predicate frames is therefore a keystone of semantic analysis. Research on predicate frames in linguistics has given rise to several theories, including Valency Theory, Case Grammar and Frame Semantics, and to resources such as FrameNet, VerbNet and PropBank.

However, the current study of predicate frames has several problems. (1) The definition and granularity of the semantic roles of predicates are deficient. (2) Static predicate frame lexicons have not been tested against dynamic corpora. (3) Lexicons conflate the senses and frames of predicates, ignoring the fact that senses and frames do not always correspond one-to-one. (4) The semantic relations within the frames of predicates are neglected in the lexicons.

We propose the following solutions to these problems.

(1) We discuss the problems that common lexicons have with the definition and granularity of semantic roles, and show that they can be solved in CAMR (Chinese Abstract Meaning Representation). Comparing the current definitions and granularities of semantic roles, we find that the predicate frame system adopted by CAMR has advantages in representing meaning. First, CAMR uses 5 predicate-specific core semantic relation labels, which better handle conflicts between core and non-core semantic roles and can represent multi-functional core semantic roles. Second, CAMR uses 44 predicate-general non-core semantic relation labels that are fine-grained and discriminate well from one another. In addition, CAMR allows dropped or omitted semantic roles to be added back, which yields a more complete representation of sentence meaning.

(2) We test the CPB (Chinese Proposition Bank) predicate frame lexicon by annotating a dynamic CAMR corpus. Based on the annotation and analysis of this corpus, we find that the quality of a lexicon extracted directly from a manually annotated corpus is inevitably affected by the annotation system, scale and quality of that corpus. Moreover, the lexicon has a systemic problem of mixing the senses and frames of predicates, which cannot be solved by modifying it. We therefore decide to reconstruct, by an introspection-based approach, a Chinese predicate frame lexicon that conforms to the CAMR annotation guidelines.

(3) We annotate the senses and frames of predicates separately and discuss how they correspond. In the new lexicon, senses and frames are numbered separately, so they are interrelated but independent. Analyzing the correspondence between senses and frames, we find that the ratio of senses to frames is 1.33:1, and that only 25.24% of multi-sense words have a one-to-one relationship between their senses and frames. We then analyze how the meanings of multi-sense words evolve according to their frames, and discuss why the senses and frames of multi-sense words correspond in different ways. We further conclude that one sense corresponding to multiple frames is mainly due to two causes: inconsistent criteria for splitting and merging senses in the dictionary (Modern Chinese Dictionary), and words being used to modify the directional relationship between two concepts.

(4) We annotate the core semantic relations among the core semantic roles of predicates and analyze their characteristics. A core semantic relation sometimes holds among a predicate's core semantic roles, and this is a major reason why sentence meaning has a graph structure. Because previous predicate frame lexicons neglected these relations, we annotate them in the new lexicon and analyze how the types of core relations among core roles are distributed for different numbers of core roles. We find that the type of core relation among core roles depends mainly on the particularity of the predicate's event frame and has unique characteristics. We also find that core relations among core roles are dynamic, mainly because the core roles themselves are dynamic, the relations are influenced by context, and core roles may be omitted.

In summary, the work of this thesis is as follows. We first discuss the advantages of the predicate frame system adopted by CAMR, then use a dynamic corpus to analyze the problems of the CPB predicate frame lexicon used by CAMR, and propose building a new lexicon. Next, we formulate an annotation scheme for the new lexicon and carry out its construction: the senses and frames of predicates are numbered separately, and the core semantic relations among core semantic roles are annotated. Finally, we carry out statistical analysis and theoretical investigation.

The conclusions of this thesis are as follows. First, the predicate frame system of CAMR offers a better solution to semantic role labeling in terms of definition, granularity and the addition of concepts. Second, the senses and frames of predicates do not correspond exactly one-to-one. Third, the core semantic relations among the core semantic roles of predicates are particular and dynamic.
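The separate numbering of senses and frames described in the abstract can be pictured with a small data-structure sketch. Everything here is an illustrative assumption rather than the thesis's actual lexicon format: the class names, the field names, the role labels `arg0` to `arg2`, the relation name `relation_x`, and the two toy entries `pred_a` and `pred_b` are all hypothetical. The toy numbers it produces are not the reported 1.33:1 ratio or the 25.24% share.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Frame:
    """One predicate frame, numbered independently of senses (hypothetical layout)."""
    frame_id: int
    core_roles: list[str]
    # Core relations among core roles as (role_a, relation, role_b) triples.
    core_relations: list[tuple[str, str, str]] = field(default_factory=list)


@dataclass
class Predicate:
    """A lexicon entry that links separately numbered senses to frames."""
    lemma: str
    frames: dict[int, Frame]
    sense_to_frames: dict[int, list[int]]  # sense id -> ids of its frames


def is_one_to_one(pred: Predicate) -> bool:
    """True if each sense has exactly one frame and no frame is shared or unused."""
    linked = [f for fs in pred.sense_to_frames.values() for f in fs]
    return (
        all(len(fs) == 1 for fs in pred.sense_to_frames.values())
        and len(set(linked)) == len(linked)
        and set(linked) == set(pred.frames)
    )


def sense_frame_ratio(lexicon: list[Predicate]) -> float:
    """Total number of senses divided by total number of frames."""
    senses = sum(len(p.sense_to_frames) for p in lexicon)
    frames = sum(len(p.frames) for p in lexicon)
    return senses / frames


def one_to_one_share(lexicon: list[Predicate]) -> float:
    """Share of multi-sense predicates whose senses and frames pair one-to-one."""
    multi = [p for p in lexicon if len(p.sense_to_frames) > 1]
    return sum(is_one_to_one(p) for p in multi) / len(multi)


# Toy entries (hypothetical, not taken from the thesis lexicon).
lexicon = [
    Predicate(
        lemma="pred_a",
        frames={1: Frame(1, ["arg0", "arg1"]), 2: Frame(2, ["arg0"])},
        sense_to_frames={1: [1], 2: [2]},  # two senses, each with its own frame
    ),
    Predicate(
        lemma="pred_b",
        frames={
            1: Frame(1, ["arg0", "arg1"]),
            2: Frame(2, ["arg0", "arg1", "arg2"],
                     core_relations=[("arg1", "relation_x", "arg2")]),
        },
        sense_to_frames={1: [1], 2: [1], 3: [2]},  # senses 1 and 2 share frame 1
    ),
]

print(sense_frame_ratio(lexicon))
print(one_to_one_share(lexicon))
```

Keeping the sense-to-frame links in a mapping, instead of giving each sense a single embedded frame, is what lets one sense point to several frames or several senses share one frame, which is the kind of mismatch the abstract reports.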
Keywords/Search Tags: Abstract Meaning Representation, predicate frame, semantic role, language knowledge base