Sentence-level Chinese language processing provides basic support for applications of natural language processing (NLP). It includes morphological, syntactic, and semantic parsing: morphological parsing consists of word segmentation and POS-tagging; syntactic parsing has two major tasks, constituent parsing and dependency parsing; and semantic parsing refers to Chinese semantic dependency parsing in this dissertation. These tasks form a hierarchy: for a given sentence, we usually perform word segmentation first, then POS-tagging, then constituent parsing or syntactic dependency parsing, and finally semantic dependency parsing.

Traditional methods process the above tasks step by step, each with its own independently trained state-of-the-art model; such methods are usually called pipeline methods. They have two major drawbacks. First, they suffer from error propagation: errors in lower-level tasks spread to higher-level tasks. Second, because each single-task model is optimized locally, lower-level tasks cannot use information from higher-level tasks. Recently, researchers have paid more attention to joint models, which handle multiple adjacent tasks with a single model, so that the above problems are avoided and better performance can be achieved. Another advantage is that joint models help language researchers understand the relations between different tasks. In this dissertation, we study joint models from the following four perspectives.

First, we study the domain-adaptation problem of the joint word segmentation and POS-tagging model in morphological analysis. Annotating target-domain data is one of the most effective ways to improve target-domain performance. We propose a better data-annotation strategy that combines sentence-level token annotation with word-level type annotation of the target-domain data. With this combination, the joint word segmentation and POS-tagging model achieves better performance at a fixed annotation cost. We verify the proposed method by experiments.

Second, we investigate the low efficiency of the joint POS-tagging and syntactic dependency parsing model. We propose to combine model integration with up-training to speed up joint POS-tagging and dependency parsing without loss of accuracy. On the one hand, model integration yields a higher-performance joint model, but its decoding speed becomes much slower. On the other hand, up-training allows a fast joint model with lower performance to obtain large improvements with the help of the aforementioned integrated joint model. Our final joint model is not only about ten times faster than the original joint model, but its performance is also slightly higher.

Third, we propose character-level Chinese parsing, based on the fact that most Chinese words have internal structures. This allows Chinese morphological and syntactic parsing to be performed naturally with a single joint model, yielding joint Chinese morphological and syntactic parsing models. Experimental results show that this character-level parsing method outperforms related models, achieving the top accuracies.

Finally, we propose a joint Chinese syntactic and semantic parsing model based on dependency analysis.
With the help of dependency representations, Chinese syntactic and semantic parsing can be modeled jointly in a convenient way, giving a joint syntactic and semantic dependency parsing model. Because Chinese semantic dependency parsing has seldom been employed for semantic parsing, we first validate its capability with both theoretical and empirical methods, and then propose our joint Chinese syntactic and semantic dependency parsing model. Experiments show that our joint model improves the performance of both Chinese syntactic and semantic dependency parsing.
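For illustration only (this sketch is not part of the dissertation), the contrast between the pipeline setting and a joint model described above can be summarized as follows; every function and interface name here is hypothetical.

# Illustrative sketch only: hypothetical interfaces contrasting the pipeline
# setting with a joint model. None of these functions come from the dissertation.

def pipeline_parse(sentence, segmenter, tagger, syn_parser, sem_parser):
    """Pipeline: each stage consumes the possibly erroneous output of the stage below it."""
    words = segmenter(sentence)                    # word segmentation
    tags = tagger(words)                           # POS-tagging on predicted words
    syn_tree = syn_parser(words, tags)             # syntactic dependency parsing
    sem_graph = sem_parser(words, tags, syn_tree)  # semantic dependency parsing
    return words, tags, syn_tree, sem_graph

def joint_parse(sentence, joint_model):
    """Joint model: a single decoder searches over several levels at once, so
    higher-level evidence can also inform lower-level decisions."""
    return joint_model.decode(sentence)            # segmentation, tags, and trees together

In the pipeline sketch, a segmentation error is frozen before tagging and parsing ever see the sentence; in the joint sketch, one decoder scores all levels together, which is the property the dissertation's joint models exploit.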