The Research On Bilingual Syntactic Phrase-based Statistical Machine Translation

Posted on: 2014-01-06; Degree: Master; Type: Thesis
Country: China; Candidate: P Ding; Full Text: PDF
GTID: 2248330398450365; Subject: Computer application technology
Abstract/Summary:
With the development of corpus linguistics and improvements in computing power, machine translation quality has steadily improved and the technology has found wide application. As researchers have moved from word-based statistical machine translation to phrase-based and then syntax-based statistical machine translation, performance has continued to improve.

Phrase-based statistical machine translation takes the phrase as its basic unit. It exploits the word order within a phrase and achieves better performance as a result. However, it uses little linguistic information and handles long-distance reordering poorly. Syntax-based statistical machine translation takes the syntactic phrase as its basic unit and makes full use of syntactic information, but it is seriously affected by parsing errors. It also requires phrases to be strictly syntactic, which leads to the loss of useful non-syntactic phrases.

To address the shortcomings of both approaches, we propose Bilingual Syntactic Phrase-based Statistical Machine Translation, which improves translation performance by using syntactic phrases. First, we put forward a bilingual syntactic phrase extraction method based on the EM algorithm.
Then the extracted bilingual syntactic phrases are applied to a phrase-based statistical machine translation system in three ways:
(1) Add the extracted bilingual syntactic phrases to the training corpus, then retrain the translation model.
(2) Add the extracted bilingual syntactic phrases to the phrase table, then recalculate the value of each feature for the phrases.
(3) Add a syntactic phrase feature to the phrase table: if a phrase in the phrase table is a syntactic phrase, the feature value is "1"; otherwise it is "0".

Experimental results show that bilingual syntactic phrases improve machine translation performance. All three methods raise the BLEU score of the translations: the baseline system scores 0.2253, method (1) 0.2276, method (2) 0.2294, and method (3) 0.2317. The largest gain, 0.0064 BLEU over the baseline, comes from method (3).
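
Method (3) can be illustrated with a minimal sketch. The abstract does not specify the phrase-table format or the toolkit used, so the layout below ("source ||| target ||| scores", as in Moses-style tables) is an assumption, and the function name add_syntactic_feature and the toy phrase pairs are hypothetical. The sketch only shows the idea of appending a "1" or "0" feature depending on whether a phrase pair belongs to the extracted set of bilingual syntactic phrases; it is not the thesis's implementation.

```python
def add_syntactic_feature(table_lines, syntactic_pairs):
    """Append a binary syntactic-phrase feature to each phrase-table entry.

    Assumes (not stated in the abstract) that each entry looks like
    'source ||| target ||| score1 score2 ...', optionally followed by
    further '|||'-separated fields that are passed through unchanged.
    """
    updated = []
    for line in table_lines:
        fields = [field.strip() for field in line.rstrip("\n").split("|||")]
        if len(fields) < 3:
            raise ValueError(f"unexpected phrase-table entry: {line!r}")
        source, target, scores = fields[0], fields[1], fields[2]
        # "1" if the pair is an extracted syntactic phrase, otherwise "0".
        flag = "1" if (source, target) in syntactic_pairs else "0"
        fields[2] = f"{scores} {flag}".strip()
        updated.append(" ||| ".join(fields))
    return updated


# Hypothetical toy data, for illustration only.
toy_table = [
    "das haus ||| the house ||| 0.5 0.4",
    "haus ist ||| house is ||| 0.1 0.2",
]
toy_syntactic_pairs = {("das haus", "the house")}

flagged_table = add_syntactic_feature(toy_table, toy_syntactic_pairs)
for entry in flagged_table:
    print(entry)
```

In a real system the new feature would also need its own weight in the decoder's log-linear model, tuned alongside the existing feature weights; how the thesis handles this is not described in the abstract.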
Keywords/Search Tags:Statistical Machine Translation, Phrase Table, Bilingual Syntactic Phrases, EM Algorithm