Research On Chinese-English Word Alignment

Posted on: 2005-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: D Deng
GTID: 2178360185495518
Subject: Computer software and theory
Abstract/Summary:
Word alignment is a basic problem in cross-lingual natural language processing. Many NLP tasks based on bilingual corpora, such as SBMT, EBMT, WSD, and automated dictionary extraction, need to align words. Previously proposed word alignment methods pay too little attention to bilingual dictionaries. Here, a large-scale bilingual dictionary, enlarged by integrating several human-readable bilingual dictionaries, is the main factor in improving the word alignment result. A Chinese-English word alignment algorithm based on a bilingual dictionary is introduced, inspired by Ker's method. It relies mainly on similarity measured with the bilingual dictionary, relative distortion information, and part-of-speech information to align words. By setting an alignment window, it obtains many-to-many word alignments. On a test set of 650 Chinese-English translation sentence pairs, in which the Chinese sentences average 24.8 words and the English sentences 34.5, the word alignment system achieves a recall of 62.9% at a precision of 84.0%.

Our algorithm improves on Ker's in two respects:
1. The computation of Ker's relative distortion is improved, and initial alignment anchors chosen by dictionary-based word similarity are added to improve alignment.
2. The concept of an "alignment window" is proposed. By setting an alignment window during the alignment process, many-to-many word alignments can be found.
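To make the idea of combining dictionary similarity with relative distortion more concrete, here is a minimal Python sketch. It is not the thesis's algorithm or Ker's method: the scoring formula, the `distortion_weight` and `threshold` parameters, the toy dictionary, and all function names are assumptions chosen for illustration. Part-of-speech information and the alignment window are left out, so the sketch produces at most one English link per Chinese word.

```python
def dict_similarity(zh_word, en_word, dictionary):
    """Return 1.0 if the (toy) dictionary lists en_word as a translation
    of zh_word, else 0.0. A real system would use a graded similarity."""
    return 1.0 if en_word.lower() in dictionary.get(zh_word, set()) else 0.0


def relative_distortion(i, j, zh_len, en_len):
    """Absolute difference between the relative positions of the two words
    in their sentences (an assumed, simplified definition)."""
    return abs(i / zh_len - j / en_len)


def align(zh_words, en_words, dictionary, distortion_weight=0.5, threshold=0.5):
    """Link each Chinese word to the best-scoring English word, if any.

    score = dictionary similarity - distortion_weight * relative distortion
    A link is kept only when the score is strictly above `threshold`.
    Returns a list of (zh_index, en_index) pairs.
    """
    links = []
    for i, zh in enumerate(zh_words):
        best_j, best_score = None, threshold
        for j, en in enumerate(en_words):
            score = dict_similarity(zh, en, dictionary) - (
                distortion_weight
                * relative_distortion(i, j, len(zh_words), len(en_words))
            )
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            links.append((i, best_j))
    return links


# Hypothetical toy data, not taken from the thesis.
toy_dictionary = {
    "大": {"big", "large"},
    "狗": {"dog"},
    "猫": {"cat"},
}
zh_sentence = ["大", "狗", "大", "猫"]
en_sentence = ["big", "dog", "big", "cat"]
print(align(zh_sentence, en_sentence, toy_dictionary))
```

In the toy sentence, both occurrences of "big" match "大" in the dictionary, so the distortion term is what decides which occurrence each "大" links to. Words with no dictionary match stay unaligned, which is one reason a larger dictionary can raise recall.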
Keywords/Search Tags: word alignment, alignment window, human-readable bilingual dictionary, machine-readable bilingual dictionary