Research On Real Corpus Faced Automatic Acquisition Of Chinese Verb Subcategorization

Posted on: 2007-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2178360185486070
Subject: Computer Science and Technology
Abstract/Summary:
Verb subcategorization information, which mainly encodes the distribution types of a verb's predicative features, is indispensable knowledge for further progress in natural language processing. Research on Chinese verb subcategorization is still comparatively weak, so work on automatically acquiring sentence-level Chinese verb subcategorization from large real corpora has considerable theoretical significance and practical potential.

Current automatic acquisition methods for sentence-level Chinese verb subcategorization have many shortcomings; in particular, acquisition precision is not very high. Building on the results of earlier research, this thesis analyzes Chinese verb subcategorization within sentence structures, explores several related algorithms, automatically acquires sentence-level Chinese verb subcategorization information from a large real corpus, and improves acquisition precision.

The exploration of acquisition methods is organized as follows:

1. The thesis first analyzes the known method based on rule matching and hypothesis testing and discusses the causes of its acquisition errors. It also introduces the evaluation mechanism for automatic acquisition methods.
2. The thesis proposes an algorithm based on the SVM model. The SVM yields the optimal classification surface, and with two-class classifiers, sentence-level Chinese verb subcategorizations can be acquired automatically. Experiments test the feasibility of this approach.
3. To address the shortcomings of the SVM method, the thesis proposes another algorithm for verb subcategorization acquisition based on similarity computation, and applies two different sentence similarity algorithms: the Vector Space Model and a sentence structure similarity algorithm based on word class clusters. Based on the experimental results, the algorithm based on word-class-cluster structural similarity is chosen and improved.
4. A training algorithm is added to train the word weights...
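
The Vector Space Model comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the part-of-speech tags, and the frame labels below are all hypothetical. It assumes each sentence is already reduced to a sequence of tags, represents the sequence as a bag-of-tags count vector, and assigns a new sentence the frame label of its most similar labeled example by cosine similarity.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token sequences viewed as count vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Dot product over the tokens that both vectors share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sequence is similar to nothing
    return dot / (norm_a * norm_b)


def nearest_frame(tokens, labeled_examples):
    """Return the label of the labeled example most similar to `tokens`.

    `labeled_examples` is a list of (token_sequence, label) pairs.
    """
    best = max(labeled_examples, key=lambda ex: cosine_similarity(tokens, ex[0]))
    return best[1]


# Hypothetical tag sequences and frame labels, for illustration only.
examples = [
    (["NP", "V", "NP"], "transitive"),
    (["NP", "V"], "intransitive"),
]
predicted = nearest_frame(["NP", "V", "NP", "NP"], examples)
```

A bag-of-tags vector ignores word order, which is one plausible reason a structure-aware similarity measure, such as the word-class-cluster approach the abstract describes, might be preferred.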
Keywords/Search Tags: Chinese verb subcategorization, automatic acquisition, large real corpus, similarity