Research on Chinese Frame Identification Technology

Posted on: 2012-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: H J Liu
Full Text: PDF
GTID: 2218330368489861
Subject: Computer application technology
Abstract/Summary:
Frame Identification is one of the subtasks of SemEval-2007 Task 19, "Frame Semantic Structure Extraction", which is based on FrameNet. Given a sentence and the word expressions (target words) that evoke semantic frames, Frame Identification determines the word sense (frame) of each evoking expression. The task is similar to word sense disambiguation (WSD), but the two differ in emphasis. WSD statically decides which dictionary meaning of an ambiguous word is closest to its meaning in the current sentence. Frame Identification is based on frame semantics, a dynamic, scene-based semantics, and identifies which candidate frame expresses the same meaning as the current sentence according to the participants and their semantic roles.

This thesis studies Chinese Frame Identification for Chinese sentences, based on Chinese FrameNet. Chinese FrameNet currently contains 332 lexical units that belong to more than one frame. We choose 7 of them for study and select more than 1,000 sentences from the Sogou corpus and the CCL Contemporary Chinese corpus. After refinement, these sentences make up the experimental corpus. On this corpus we apply machine learning methods based on dependency parsing to Chinese Frame Identification.

The major research contents and conclusions are:
(1) Sequence labeling: after dependency parsing of the sentence, we use a Tree-Structured Conditional Random Fields (T-CRF) model, with feature selection and parameter estimation, for frame identification, and compare it with a CRF model on the same task.
(2) Classification: after dependency parsing of the sentence, we build an SVM classifier for each ambiguous target word, with feature selection, parameter estimation and kernel function selection.
(3) Contrast experiments: based on co-occurrence collocation in generalized collocation theory, we propose a frame identification method called compatibility of lexical unit, and use the most frequent frame as the baseline.
Experimental results show that machine learning methods with features selected from the dependency parse tree capture more of the important characteristics and implicit semantic relations between the words in a sentence, which helps Chinese Frame Identification.
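
For readers unfamiliar with the most-frequent-frame baseline named in the abstract, the following is a minimal sketch of how such a baseline is commonly built. It is not the thesis's implementation: the function names, the placeholder target words ("w1", "w2", "w3") and the placeholder frame labels ("FrameA" and so on) are all invented for illustration, and the data is a toy example rather than anything from the experimental corpus.

```python
from collections import Counter, defaultdict


def train_most_frequent_frame(examples):
    """Map each target word to the frame it evokes most often.

    `examples` is an iterable of (target_word, frame) pairs taken from
    annotated sentences. Hypothetical helper, not from the thesis.
    """
    counts = defaultdict(Counter)
    for target, frame in examples:
        counts[target][frame] += 1
    # Keep only the single most frequent frame per target word.
    return {target: c.most_common(1)[0][0] for target, c in counts.items()}


def accuracy(model, examples):
    """Fraction of examples whose gold frame equals the predicted frame.

    Target words unseen in training get no prediction and count as wrong.
    """
    examples = list(examples)
    if not examples:
        return 0.0
    correct = sum(1 for target, frame in examples if model.get(target) == frame)
    return correct / len(examples)


# Toy data with placeholder words and frame labels (assumptions only).
train_data = [
    ("w1", "FrameA"), ("w1", "FrameA"), ("w1", "FrameB"),
    ("w2", "FrameC"), ("w2", "FrameD"), ("w2", "FrameD"),
]
test_data = [
    ("w1", "FrameA"), ("w1", "FrameB"),
    ("w2", "FrameD"), ("w3", "FrameA"),
]

baseline = train_most_frequent_frame(train_data)
baseline_accuracy = accuracy(baseline, test_data)
```

Because this baseline ignores sentence context entirely, it gives a floor against which context-sensitive models such as the T-CRF and SVM classifiers can be compared.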
Keywords/Search Tags:Frame Semantic Structure Extraction, Frame Identification, T-CRF, Co-occurrence Collocation