
Machine-Aided Translation System Based On Multilingual Parallel Corpus

Posted on: 2008-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liu
GTID: 2178360218957278
Subject: Computer application technology
Abstract/Summary:
As international exchanges become more frequent, the translation industry is flourishing, and a great deal of informatization work has been completed in recent years. The most prominent development is the introduction of machine translation products, machine-assisted translation products, and translation process automation, which replace or assist human activity and have raised the efficiency of the translation industry. Meanwhile, many universities and research institutions have worked on machine translation and machine-aided translation technology and achieved fruitful results, but the resulting products have not been applied in the market. Machine-assisted translation products in particular need a lot of resources to test and cannot be tested on a large-scale, real corpus, which limits overall software quality.

Through co-operation with the Chinese Translation and Publishing company, and on the basis of its 30 years of accumulated literature, we constructed a large multilingual parallel corpus. To guarantee both the size and the quality of the corpus, we standardized the process of construction and operation. On the basis of these existing resources, machine-assisted translation software is developed to improve the efficiency of translators.

This dissertation describes the process of building the corpus and the key algorithms used. It designs computer-aided translation software and implements some of its key algorithms. The algorithm for checking Chinese texts for duplication draws on results from Chinese linguistics research and presents a new method of extracting text features. Throughout the design, this dissertation treats machine-aided translation software based on translation memory as a search engine built on full-text indexing, used to provide translations of identical or similar sentences. By introducing segmentation by word, it extends the concept of segmentation in Chinese information processing, so that both Chinese text and text in any alphabetic language can be indexed, which guarantees that the system can be extended to further languages.
Keywords/Search Tags: Corpus, Translation Memory, Computer-Aided Translation Software
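
The idea of treating translation memory as a search engine over a full-text index can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and every name in it (`tokenize`, `TranslationMemory`, `lookup`) is hypothetical. It assumes that each Chinese character is indexed as its own token and each run of letters or digits in an alphabetic language is one token, and it ranks stored sentences with a Dice coefficient over token sets; the abstract does not specify the segmentation or the similarity measure actually used.

```python
import re
from collections import defaultdict

# Assumed tokenization: one token per CJK character (U+4E00..U+9FFF),
# one token per run of letters/digits in any other script.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[^\W_\u4e00-\u9fff]+")


def tokenize(text):
    """Split mixed Chinese/alphabetic text into lower-cased index tokens."""
    return [tok.lower() for tok in _TOKEN.findall(text)]


class TranslationMemory:
    """Toy translation memory backed by an inverted (full-text) index."""

    def __init__(self):
        self.units = []                # (source, target, source token set)
        self.index = defaultdict(set)  # token -> ids of units containing it

    def add(self, source, target):
        """Store one aligned sentence pair and index its source side."""
        uid = len(self.units)
        tokens = set(tokenize(source))
        self.units.append((source, target, tokens))
        for tok in tokens:
            self.index[tok].add(uid)

    def lookup(self, query, threshold=0.5):
        """Return (score, source, target) for similar stored sentences."""
        q = set(tokenize(query))
        if not q:
            return []
        # Only units sharing at least one token with the query are scored.
        candidates = set()
        for tok in q:
            candidates |= self.index.get(tok, set())
        hits = []
        for uid in candidates:
            source, target, tokens = self.units[uid]
            # Dice coefficient over token sets (an assumed similarity measure).
            score = 2 * len(q & tokens) / (len(q) + len(tokens))
            if score >= threshold:
                hits.append((score, source, target))
        hits.sort(key=lambda hit: hit[0], reverse=True)
        return hits
```

Because the tokenizer handles Chinese and alphabetic text through the same interface, adding another alphabetic language needs no change to the index itself, which is the kind of extensibility the abstract describes. A real system would also need sentence alignment, persistent storage, and an order-sensitive similarity measure.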