Research On The Method Of Chinese Shallow Semantic Parsing For Text-to-Scene Conversion

Posted on: 2012-12-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Q Li
GTID: 1118330362950256
Subject: Computer application technology
Abstract/Summary:
In this paper, we study the key issues of Chinese shallow semantic parsing (SSP), an essential research topic in Natural Language Processing (NLP). Currently, the most prevalent approach to SSP combines linguistic features with statistical machine learning. This approach involves two key factors: the selection of linguistic features and the optimization of the machine learning method. The SSP research in this paper is oriented toward Text-to-Scene conversion, the task of automatically converting natural language text into a corresponding scene or animation by computer. It is a novel research area of theoretical and practical significance. First, we study coreference resolution, a necessary pre-processing module for Text-to-Scene conversion. Second, we explore SSP from the perspective of linguistic feature selection and identify many discriminative syntactic features. Third, we propose a combined machine learning method to further improve SSP. Finally, we study the problem at a deeper level and propose an approach to SSP based on a computational cognitive model. Specifically, this paper includes the following contents:

(1) We propose an unsupervised noun phrase coreference resolution method for Chinese based on an Adaptive Resonance Theory (ART) network. The method makes full use of the features of noun phrases. It can dynamically control the number of clusters by adjusting the parameters of the ART network. It thus provides an effective solution to a critical problem of clustering-based coreference resolution: the number of output clusters, that is, the number of coreference sets, is difficult to determine or evaluate. In addition, within the clustering algorithm we use a feature selection method based on information gain ratio to reduce the interference caused by weak clustering features.
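As an illustration of how an ART-style network lets a parameter, rather than a preset cluster count, decide how many coreference sets emerge, here is a minimal sketch of simplified ART1-style clustering. It is not the method of this paper: the binary feature vectors, the function name `art1_cluster`, and the `vigilance` and `beta` values are all assumptions made for the example, since the abstract does not give the actual features or network parameters.

```python
import numpy as np

def art1_cluster(patterns, vigilance=0.7, beta=1.0):
    """Simplified ART1-style clustering of binary feature vectors (illustrative only)."""
    prototypes = []  # one binary prototype per cluster
    labels = []      # cluster index assigned to each input pattern
    for x in patterns:
        x = np.asarray(x, dtype=bool)
        # Rank existing clusters by the choice function |x AND w| / (beta + |w|).
        order = sorted(
            range(len(prototypes)),
            key=lambda j: -np.sum(x & prototypes[j]) / (beta + np.sum(prototypes[j])),
        )
        for j in order:
            # Vigilance test: fraction of the input's active features matched by the prototype.
            match = np.sum(x & prototypes[j]) / max(np.sum(x), 1)
            if match >= vigilance:
                prototypes[j] = x & prototypes[j]  # fast learning: keep shared features
                labels.append(j)
                break
        else:
            # No existing cluster passes the vigilance test, so a new one is created.
            prototypes.append(x.copy())
            labels.append(len(prototypes) - 1)
    return labels, prototypes

# Hypothetical binary features for four noun phrases (e.g. gender, number, head match).
mentions = [[1, 1, 0, 0, 1], [1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 1, 1, 1]]
loose_labels, _ = art1_cluster(mentions, vigilance=0.6)
strict_labels, _ = art1_cluster(mentions, vigilance=0.9)
print(loose_labels, strict_labels)
```

Raising `vigilance` makes the match test stricter, so the same mentions split into more clusters; this is the sense in which the number of coreference sets is controlled by a parameter instead of being fixed beforehand.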
The coreference resolution method achieves relatively high accuracy and has good portability and robustness. It addresses the major obstacle in the pre-processing phase of Chinese SSP for Text-to-Scene conversion.

(2) We then study the linguistic features for Chinese SSP in depth and propose a method based on multiple syntactic features. Current research shows that improving the linguistic feature set is at present the most effective way to enhance SSP performance. The proposed method integrates constituent-based and dependency-based syntactic features into a basic feature set, and thus provides more extensive and complementary syntactic information to SSP. On top of the basic feature set, we further propose a statistical method for selecting combined features. It efficiently selects discriminative combined features according to the distribution of each combined feature in the corpus. Finally, we use the constituent-based syntactic features, the dependency-based syntactic features, and the selected combined features for classification in SSP. Experiments show that the proposed method achieves better results with both gold-standard and automatic syntactic parses.

(3) Further, we propose a combined machine learning method that improves SSP from the perspective of optimizing the machine learning method. The method builds on the multiple syntactic features described above. It adopts five basic machine learning methods: K-Nearest Neighbor, Decision Tree, Perceptron, Maximum Entropy, and Support Vector Machine. We train five classification models with these five methods on the training corpus and use them as the basic units of the combined model. We then use an input-dependent gating system to integrate the five basic classification models, and control the output of the combined model by adjusting the parameters of the gating system.
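To make the idea of an input-dependent gate concrete, the following minimal sketch combines the class probabilities of several base classifiers using weights computed from the input itself. It assumes a softmax gate that is linear in the input features, which is one common choice and not necessarily the one used in this paper; the names `combine` and `gate_weights` are hypothetical, and learning the gate parameters is not shown.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combine(expert_probs, x, gate_weights):
    """Mix base-classifier outputs with an input-dependent gate (illustrative only).

    expert_probs: (n_experts, n_classes) class probabilities from each base model
    x:            (n_features,) feature vector of the instance being labeled
    gate_weights: (n_experts, n_features) assumed linear gate parameters
    """
    gate = softmax(gate_weights @ x)   # one non-negative weight per expert, summing to 1
    combined = gate @ expert_probs     # weighted average of the experts' distributions
    return combined, gate

# Three hypothetical base classifiers scoring two semantic role labels.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
x = np.array([1.0, 0.0])
combined, gate = combine(probs, x, np.zeros((3, 2)))
print(gate, combined)
```

With all gate parameters at zero the gate is uniform and the combination reduces to a plain average; non-zero parameters let different inputs trust different base classifiers.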
Finally, we use the Expectation Maximization algorithm to learn the parameters of the gating system from the training data. Experimental results show that the method significantly improves the performance of Chinese SSP.

(4) Lastly, this paper proposes an exploratory Chinese SSP method based on a computational cognitive model. On the basis of cognitive theory, the method simulates the human language understanding process and explores semantic analysis and computation from fundamental properties. First, we define a propositional semantic representation oriented to the cognitive model and to Text-to-Scene conversion. The propositions can express the semantics of natural language simply and efficiently. We take the propositions as the neurons of the cognitive model. Then, through iteratively spreading activation, contextually appropriate propositions are gradually strengthened and inappropriate ones are inhibited until the network stabilizes. Finally, the SSP result is obtained from the propositions that remain activated in the cognitive model.
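The iterative strengthening and inhibition of propositions in item (4) can be illustrated with a generic spreading-activation loop. This is only a sketch under stated assumptions: the abstract does not specify the network, so the connection matrix, the clipping of negative activations, the normalization by the peak value, and the name `spread_activation` are all choices made for the example.

```python
import numpy as np

def spread_activation(weights, initial, max_iters=100, tol=1e-6):
    """Iterate activation over a proposition network until it stabilizes (illustrative only).

    weights: (n, n) matrix; positive entries link mutually supporting propositions,
             negative entries link conflicting ones
    initial: (n,) starting activation of each proposition
    """
    a = np.asarray(initial, dtype=float)
    for _ in range(max_iters):
        new = weights @ a                 # each node collects activation from its neighbours
        new = np.clip(new, 0.0, None)     # inhibited nodes bottom out at zero
        peak = new.max()
        if peak > 0:
            new = new / peak              # rescale so the strongest node has activation 1
        if np.max(np.abs(new - a)) < tol:
            return new                    # activations no longer change: network is stable
        a = new
    return a

# Hypothetical network: propositions 0 and 1 support each other, proposition 2 conflicts with both.
weights = np.array([[1.0, 0.5, -0.5],
                    [0.5, 1.0, -0.5],
                    [-0.5, -0.5, 1.0]])
final = spread_activation(weights, [1.0, 1.0, 1.0])
print(final)
```

In this toy network the two mutually supporting propositions stay fully active while the conflicting one is driven to zero, which mirrors reading off the parse from the propositions that remain activated.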
Keywords/Search Tags: shallow semantic parsing, semantic role labeling, natural language processing, text-to-scene, computational cognitive model