
Study On Word Alignment For Re-ordering Of Web-mined OOV Translation Candidates

Posted on: 2009-11-17   Degree: Master   Type: Thesis
Country: China   Candidate: S Li   Full Text: PDF
GTID: 2178360272965179   Subject: Computer technology
Abstract/Summary:
With the rapid development of the World Wide Web, more and more information is available online, and Web text mining, an effective information-processing technology, has attracted wide attention from researchers. It also offers an effective way to translate unknown (out-of-vocabulary, OOV) words automatically, quickly, and accurately. This thesis studies the re-ordering of web-mined OOV translation candidates. Automatic word alignment is applied to compute a weighted score for each candidate, so that the correct translation is ranked at the top and the web-mined result, which is initially sorted by frequency, is improved. The specific contents are as follows:
1. The Web text mining and Web information retrieval technologies for OOV words are introduced, together with the research background, the research purpose, and the basic theory behind them.
2. An approach is introduced that, in essence, searches for mixed-language web pages containing the OOV word and its translations, and then mines the translation using statistical measures. The returned snippets contain both the Chinese OOV word and segments of its translation, which are strong hints for the OOV translation.
3. Automatic word alignment is studied. A hybrid word alignment solution is presented that combines the statistics-based method with the lexicon-based method and linguistic knowledge.
4. The implementation of, and experiments on, re-ordering web-mined OOV translation candidates based on word alignment are described.
The experimental results show that word alignment is very helpful for improving the accuracy of automatic OOV translation based on web search.
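
The re-ordering idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything specific here is an assumption made for illustration: the function name `rerank`, the linear interpolation between relative frequency and alignment score, the `weight` parameter, and the toy counts and scores. It is not the thesis's actual method, only one plausible way a frequency-sorted candidate list could be re-ordered once each candidate has an alignment-based score.

```python
def rerank(frequencies, alignment_scores, weight=0.5):
    """Re-order translation candidates (hypothetical scoring, for illustration).

    frequencies:      dict mapping candidate -> count in the mined snippets
    alignment_scores: dict mapping candidate -> word-alignment score in [0, 1]
    weight:           assumed share of the final score given to alignment
    """
    total = sum(frequencies.values())
    scored = []
    for candidate, count in frequencies.items():
        # Relative frequency keeps the two score components on a similar scale.
        rel_freq = count / total if total else 0.0
        align = alignment_scores.get(candidate, 0.0)
        scored.append((candidate, (1 - weight) * rel_freq + weight * align))
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (invented): frequency alone would rank the noise word "the" first.
freq = {"the": 12, "virus": 9, "severe acute respiratory syndrome": 7}
align = {"the": 0.0, "virus": 0.3, "severe acute respiratory syndrome": 0.9}

ranked = rerank(freq, align)
for candidate, score in ranked:
    print(f"{score:.3f}  {candidate}")
```

With these toy numbers the frequent but poorly aligned candidate drops below the well-aligned one, which is the effect the abstract attributes to word alignment. How the alignment score itself is computed (the hybrid statistical and lexicon-based method) is outside this sketch.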
Keywords/Search Tags: Web-based Data Mining, Word Alignment, OOV Translation, Natural Language Processing