Japanese Time Expression Recognition And Translation Based On The Combination Of Rules And Statistical Models

Posted on: 2015-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhao
Full Text: PDF
GTID: 2298330434950600
Subject: Computer Science and Technology
Abstract/Summary:
Traditional approaches to time expression recognition fall into two groups: sequence-labeling methods and rule-based methods. In this paper, we propose a method tailored to the characteristics of Japanese time expressions that combines a statistical model with a rule set strengthened by a knowledge base. Following the granularity classification of time expressions in the TIMEX2 standard, we progressively expand and restructure the knowledge base to reflect the characteristics of Japanese time expressions, and then optimize and update the rule set to increase recognition accuracy. At the same time, we adopt a CRF model to improve the generalization ability of Japanese time expression recognition. Experimental results show that the proposed method is simple and efficient, and that it overcomes the poor portability and heavy resource dependence of the two traditional methods. With this method, a high-quality system for recognizing and translating Japanese time expressions can be built from limited corpora.

Moreover, we carry out Japanese-Chinese time expression translation experiments with two methods: one uses a phrase-based statistical translation model built with Moses, and the other combines Japanese-Chinese time expression translation rules with a keyword-level Japanese-Chinese time parallel dictionary. The results show that the two methods need to be combined. We therefore propose a fusion strategy that combines statistical machine translation (SMT) models with translation rules. Experimental results show that the proposed fusion method performs best of the three methods.

The fusion of rules and statistics is our main innovation. On the one hand, following an error-driven approach, we use the output of the statistical models to revise the hand-written heuristic rule templates, the knowledge base, and the Japanese-Chinese time parallel dictionary, and then run the rule-based method. On the other hand, the system uses the output of the rule-based method to improve the quality of the training corpus. This cycle is repeated until system performance converges. The fusion strategy improves not only the experimental results but also the generalization ability of the system.
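
The rule-plus-dictionary side of the pipeline described in the abstract, with a statistical fallback, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the patterns in `RULES`, the entries in `TIME_DICT`, and the `smt_fallback` callable (a stand-in for a trained SMT system) are hypothetical and are not taken from the thesis, whose actual rule set, knowledge base, and CRF model are not given here.

```python
import re

# Hypothetical keyword dictionary (Japanese -> Chinese); entries are illustrative only.
TIME_DICT = {
    "午前": "上午",
    "午後": "下午",
    "今日": "今天",
    "明日": "明天",
    "昨日": "昨天",
    "来週": "下周",
    "先週": "上周",
    "時": "点",
}

# Hypothetical recognition rules, tried in order at each position in the text.
RULES = [
    r"\d{1,4}年(?:\d{1,2}月)?(?:\d{1,2}日)?",   # 2015年, 2015年10月, 2015年10月20日
    r"\d{1,2}月(?:\d{1,2}日)?",                  # 10月, 10月20日
    r"\d{1,2}日",                                # 20日
    r"(?:午前|午後)?\d{1,2}時(?:\d{1,2}分)?",    # 3時, 午後3時, 午後3時30分
    r"今日|明日|昨日|来週|先週",                  # relative expressions
]
TIME_PATTERN = re.compile("|".join(RULES))


def recognize(text):
    """Return every substring of `text` that matches one of the rules."""
    return [m.group(0) for m in TIME_PATTERN.finditer(text)]


def translate_by_rules(expr):
    """Replace dictionary keywords, longest first; other characters are kept."""
    out = expr
    for ja in sorted(TIME_DICT, key=len, reverse=True):
        out = out.replace(ja, TIME_DICT[ja])
    return out


def translate(expr, smt_fallback=None):
    """Use the rules when they cover the whole expression, else the fallback."""
    if TIME_PATTERN.fullmatch(expr):
        return translate_by_rules(expr)
    if smt_fallback is not None:
        return smt_fallback(expr)  # stand-in for a statistical translation model
    return expr  # nothing applies: leave the expression untranslated


if __name__ == "__main__":
    for expr in recognize("会議は2015年10月20日午後3時に始まる"):
        print(expr, "->", translate(expr))
```

In this sketch the rules take priority and the statistical component only handles expressions the rules do not cover. That is one simple way to combine the two, and the fusion strategy in the thesis may weight them differently.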
Keywords/Search Tags: Knowledge Base, Rule Set, Statistical Machine Translation, Japanese-Chinese Time Translation Rule, Japanese-Chinese Time Parallel Dictionary