
Research on Bilingual Chunk Acquisition Based on Association Value and Word Alignment

Posted on: 2007-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: J J Liu
Full Text: PDF
GTID: 2208360185991532
Subject: Computer application technology
Abstract/Summary:
Machine translation systems have long treated the word as the basic unit of translation. In natural language, however, word usage is extremely flexible, and a word can take on very different meanings during machine translation processing. This is one of the key reasons why machine translation quality is difficult to improve. A translation unit with a granularity larger than the word is therefore needed. The main work of this thesis is as follows.

First, we introduce the concept of the bilingual chunk into machine translation. A bilingual chunk is a bilingual language segment whose granularity lies between the word and the sentence. It has three characteristics: semantic self-sufficiency, structural validity, and transformation sufficiency. The thesis describes syntactic analysis and analogy-based translation built on bilingual chunks, together with their preliminary application in the IHSMTS system. The remaining work centers on the acquisition of bilingual chunks.

Second, we study the acquisition of monolingual chunks from a monolingual corpus. We discuss several association functions, propose three schemes for extracting monolingual chunks, and finally adopt a dynamic growth mechanism for the extraction.

Third, we study the acquisition of bilingual chunks from a bilingual corpus. Building on the monolingual chunk work, we propose two models for extracting bilingual chunks: a statistics-based model and a word-alignment-based model. We design and implement a bilingual chunk extraction system based on association value and word alignment, which achieves good experimental results.
Keywords/Search Tags: machine translation, association value, word alignment, bilingual chunk, corpus
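
The abstract mentions scoring word sequences with association functions and growing monolingual chunks dynamically, but it does not name the functions or describe the growth rule. The following is only a minimal sketch under stated assumptions: it uses pointwise mutual information (PMI) as one common association function and a simple greedy left-to-right growth rule. The names `build_pmi` and `grow_chunks`, the toy corpus, and the threshold value are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def build_pmi(sentences):
    """Return a function that scores an adjacent word pair by PMI.

    Assumption: PMI stands in for the (unnamed) association functions
    discussed in the thesis.
    """
    unigrams = Counter()
    bigrams = Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(left, right):
        joint = bigrams[(left, right)]
        if joint == 0:
            # Unseen pair: treat as having no association at all.
            return float("-inf")
        p_joint = joint / n_bi
        p_left = unigrams[left] / n_uni
        p_right = unigrams[right] / n_uni
        return math.log2(p_joint / (p_left * p_right))

    return pmi


def grow_chunks(sentence, pmi, threshold):
    """Greedily grow chunks left to right.

    Assumption: a chunk is extended while the association between its
    last word and the next word is at least `threshold`; otherwise a
    new chunk starts. This is an illustrative growth rule only.
    """
    chunks = []
    for word in sentence:
        if chunks and pmi(chunks[-1][-1], word) >= threshold:
            chunks[-1].append(word)
        else:
            chunks.append([word])
    return chunks


# Toy corpus (hypothetical data, for illustration only).
corpus = [
    ["machine", "translation", "is", "useful"],
    ["machine", "translation", "is", "hard"],
    ["the", "system", "is", "useful"],
    ["the", "parser", "is", "hard"],
]

score = build_pmi(corpus)
print(grow_chunks(["machine", "translation", "is", "hard"], score, 3.0))
# prints [['machine', 'translation'], ['is'], ['hard']]
```

In this toy corpus "machine translation" is kept together because the pair always co-occurs, while "is" follows many different words and so scores below the threshold. A bilingual extension along the lines of the abstract would additionally require word alignments between the two languages, which this sketch does not attempt.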