
Research And Realization Of Extracting Translation Of Chinese Term From Web And Online Dictionary

Posted on: 2013-03-01 | Degree: Master | Type: Thesis
Country: China | Candidate: J Zhang | Full Text: PDF
GTID: 2248330362471869 | Subject: Computer application technology
Abstract/Summary:
Terms are the core concepts of a specific domain and carry rich domain information. Because terms continually grow and change, term translation is one of the major problems in machine translation and information retrieval. Statistics-based and rule-based methods both encounter difficulties. This paper uses the Web as a corpus and combines Web mining with knowledge acquisition techniques to extract English translations of Chinese terms. This not only helps address the terminology translation problem and enables semi-automatic construction of a dynamic bilingual term dictionary, but also promotes research on cross-language information retrieval and cross-language knowledge acquisition. The main work of this paper includes the following aspects:

1) We first discuss the main problems, difficulties, and current research status of Web-based translation acquisition, and then give the basic process and idea of our translation system. This paper analyzes the defects of past research and proposes the "retrieve-extract-verify" approach.

2) Based on web information extraction technology and the semantic prediction principle, we use a partial translation of the term to build the query string, so that the search engine returns web pages related to the term's translation. This solves the problem of acquiring a bilingual co-occurrence corpus for terminology translation. A high-quality corpus of translation-related pages is a good foundation for the subsequent term extraction process.

3) Using knowledge acquisition techniques combined with semi-structured text analysis and an integrated statistical and rule-based method, we extract term translations from web pages. We propose combining three extraction methods (template-based, dictionary mode, and position mode), maximizing the accuracy of the results while maintaining the recall rate.

4) To exclude noisy data from the translation results, we propose three verification modes: side analog alignment verification, bilingual alignment degree verification, and word-building verification using a manually sorted bilingual aligned term corpus. Each mode verifies candidate translations as a sufficient but not necessary condition. The term translation verification process ensures accuracy, which improves the availability and reliability of the system.

5) We obtain translations of commonly used terms from an online Chinese-English dictionary, which ensures accuracy and improves the efficiency of the translation acquisition system.

Experiments on terms from different domains show that our Web and online dictionary based term translation method and system achieve good accuracy, a significant improvement over previous research. The system also takes less time, which improves usability.
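
As an illustration of the template-based extraction mentioned in the abstract, the following is a minimal sketch and not the thesis's actual implementation. The abstract does not list its templates, so the three patterns here (term followed by a parenthesized English phrase, an English phrase followed by the parenthesized term, and term followed by a colon and an English phrase) are assumptions, as are the function names `build_templates` and `extract_candidates`. Candidates are ranked only by how often they occur in the retrieved snippets; the verification step is not modeled.

```python
import re
from collections import Counter

# Assumed pattern for an English phrase: letters, spaces and hyphens,
# starting and ending with a letter (at least two characters).
ENGLISH = r"([A-Za-z][A-Za-z \-]*[A-Za-z])"


def build_templates(term):
    """Return hypothetical extraction templates for one Chinese term."""
    t = re.escape(term)
    return [
        # term(English) with ASCII or full-width parentheses
        re.compile(t + r"\s*[（(]\s*" + ENGLISH + r"\s*[）)]"),
        # English(term)
        re.compile(ENGLISH + r"\s*[（(]\s*" + t + r"\s*[）)]"),
        # term: English with ASCII or full-width colon
        re.compile(t + r"\s*[:：]\s*" + ENGLISH),
    ]


def extract_candidates(term, snippets):
    """Count candidate translations matched by any template in the snippets."""
    counts = Counter()
    for pattern in build_templates(term):
        for snippet in snippets:
            for match in pattern.findall(snippet):
                # Normalize case so surface variants are counted together.
                counts[match.strip().lower()] += 1
    # Most frequent candidates first.
    return counts.most_common()


if __name__ == "__main__":
    # Made-up snippets standing in for search engine results.
    snippets = [
        "机器翻译(Machine Translation)是自然语言处理的一个分支",
        "Machine translation（机器翻译）",
        "机器翻译: statistical methods",
    ]
    for candidate, freq in extract_candidates("机器翻译", snippets):
        print(candidate, freq)
```

Frequency ranking alone would also surface noise such as the third snippet's phrase, which is the kind of candidate the verification step described in item 4 is meant to filter out.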
Keywords/Search Tags: Terminology translation, Web mining, Knowledge acquisition