
Automatic Semantic Role Labeling Of Chinese FrameNet Based On Maximum Entropy Model

Posted on: 2011-07-12
Degree: Master
Type: Thesis
Country: China
Candidate: W L Wang
Full Text: PDF
GTID: 2178360305995795
Subject: Probability theory and mathematical statistics
Abstract/Summary:
This thesis builds on Chinese FrameNet (CFN), a semantic knowledge base developed at Shanxi University. By applying IOB strategies to the example sentences in the CFN corpus, automatic semantic role labeling for Chinese FrameNet is cast as a word-level sequence tagging problem, and a Maximum Entropy (ME) model is adopted. We define Chinese FrameNet semantic role labeling as follows: given a Chinese sentence, a target word, and its frame, identify the boundaries of the frame elements within the sentence and label each with the appropriate frame element name. The basic tagging unit is the word, and both word-level features and base-chunk features are used. Candidate model templates are formed by varying the window size of each feature, and an orthogonal array from statistical experimental design is used to screen for the better templates.

The experimental corpus, selected from the current CFN corpus, includes 6692 annotated sentences from 25 frames. For each frame, the corpus is divided evenly into four parts; any two parts are combined to form one half and the remaining two form the other, with the halves serving as training set and test set, so that 2-fold cross-validation experiments can be run on three different groupings.

The tagging procedure is divided into three steps: 1) identification, 2) classification, and 3) post-processing. Two IOB strategies are adopted: one performs identification and classification jointly, and the other identifies first and then classifies. In the post-processing step, the final output is chosen to be a label sequence that forms a logical (legal) IOB sequence over the entire sentence.
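As a rough illustration of the word-level IOB formulation and of what a "logical" IOB sequence means, here is a minimal Python sketch. It is not the thesis's implementation: the function names, the span representation `(start, end_exclusive, label)`, and the frame element names `Cognizer` and `Content` are all assumptions made for this example, and the legality rule shown (an `I-X` tag must follow `B-X` or `I-X`) is the common IOB2 convention, which the abstract does not spell out.

```python
from typing import List, Tuple

# Hypothetical span format: (start word index, end index exclusive, frame element name).
Span = Tuple[int, int, str]


def spans_to_iob(n_words: int, spans: List[Span]) -> List[str]:
    """Turn frame element spans into one IOB tag per word."""
    tags = ["O"] * n_words
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first word of the frame element
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining words of the same element
    return tags


def is_legal_iob(tags: List[str]) -> bool:
    """Check the assumed IOB2 rule: I-X may only follow B-X or I-X."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            label = tag[2:]
            if prev not in (f"B-{label}", f"I-{label}"):
                return False
        prev = tag
    return True


# A six-word sentence with two illustrative frame elements.
example_tags = spans_to_iob(6, [(0, 2, "Cognizer"), (3, 6, "Content")])
print(example_tags)
print(is_legal_iob(example_tags))
```

A post-processing step along these lines would discard candidate label sequences for which `is_legal_iob` returns `False`; how the thesis ranks the remaining candidates is not stated in the abstract.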
In each step, we use the standard information retrieval evaluation measures, precision (P), recall (R), and F1-score, and take the average F1-score over the 2-fold cross-validation as the performance measure. The experimental results show that the F1-score of the automatic SRL system based on word-level features is 56.291%; after introducing the base-chunk features, the F1-score is 58.011%, and the latter is significantly better than the former. In addition, we compared and analyzed this method against a syntax-based method, whose results were significantly lower than those of the present method.
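The evaluation scheme can be sketched as follows, under stated assumptions. The abstract does not say how a prediction is counted as correct, so this example assumes exact match of both span boundaries and label; the function names and the toy gold and predicted sets are hypothetical. The second function enumerates the three ways of pairing four corpus parts into two halves, which is one reading of the three 2-fold cross-validation groupings.

```python
from itertools import combinations
from typing import List, Set, Tuple


def prf1(gold: Set[tuple], predicted: Set[tuple]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over labeled spans (assumed: exact match)."""
    correct = len(gold & predicted)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def two_fold_groupings(parts=(0, 1, 2, 3)) -> List[Tuple[tuple, tuple]]:
    """List each split of four parts into two halves exactly once."""
    groupings = []
    for half in combinations(parts, 2):
        if parts[0] in half:  # fixing the first part avoids counting a split twice
            other = tuple(x for x in parts if x not in half)
            groupings.append((half, other))
    return groupings


# Toy spans: (start, end_exclusive, label); one of three predictions is correct.
gold = {(0, 2, "A"), (3, 6, "B")}
predicted = {(0, 2, "A"), (3, 5, "B"), (5, 6, "C")}
p, r, f1 = prf1(gold, predicted)
print(round(p, 3), round(r, 3), round(f1, 3))

# Within each grouping, train on one half and test on the other, then swap,
# and average the two F1 scores.
for half_a, half_b in two_fold_groupings():
    print(half_a, half_b)
```

In this toy case precision is 1/3 and recall is 1/2, so F1 = 2PR / (P + R) = 0.4.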
Keywords/Search Tags: semantic role labeling, Chinese FrameNet, Maximum Entropy Model, Chinese Base Chunk