Research On Automatic Extraction Of Chinese Terms

Posted on: 2012-08-10  Degree: Master  Type: Thesis
Country: China  Candidate: C S Liu
GTID: 2218330338963054  Subject: Computer software and theory
Abstract/Summary:
Automatic extraction of Chinese terms is a fundamental problem in Chinese information processing and plays an important role in many fields. In linguistics, it supports natural language generation, computational lexicography, parsing, and corpus linguistics research. In natural language processing, it supports machine translation, information retrieval, text classification, text summarization, and domain ontology construction. Term extraction is especially significant for domain-specific corpora.

Many Chinese scholars have worked on automatic term extraction and have proposed some effective methods, but the technology as a whole is not yet mature. Unlike English, written Chinese has no spaces between words, so processing Chinese text is more complex than processing English, and term extraction methods developed for other languages are not well suited to Chinese. Developing suitable methods for acquiring Chinese domain terms is therefore very important for Chinese information processing. In this context, the thesis studies Chinese term extraction in depth and presents a method for automatic term extraction. The main work is as follows:

First, the thesis summarizes the characteristics of various kinds of terms, analyzes the characteristics of the various methods of automatic term extraction, and compares research on automatic term extraction in China and abroad.

Second, based on an analysis of the vector space model (VSM) and word frequency, the thesis proposes an improved TFIDF method for selecting domain texts and validates it experimentally. The method can select the texts of a given domain from a mixed collection of texts.

Third, the thesis introduces Bayesian inference into the field of term extraction, studies the formulas of the Bayesian inference process, presents the process of domain term extraction based on Bayesian inference, and designs the core module.
Keywords/Search Tags: domain term, automatic term extraction, VSM, TFIDF, Bayesian inference
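
The abstract does not give the formulas of the improved TFIDF method, so the following is only a minimal sketch of the general idea of selecting domain texts from a mixed collection with a vector space model. It uses standard smoothed TF-IDF and cosine similarity, not the thesis's improved variant. The character-bigram tokenization, the seed-document centroid, the function names, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Split unsegmented text into overlapping character bigrams (assumed tokenizer)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, using smoothed IDF."""
    tokenized = [bigrams(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks) or 1
        vectors.append({
            t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_domain_texts(seed_docs, candidates, threshold=0.1):
    """Keep candidates whose TF-IDF vector is close to the seed centroid.

    seed_docs are texts assumed to belong to the target domain; the
    threshold is a hypothetical value that would need tuning on real data.
    """
    vectors = tfidf_vectors(seed_docs + candidates)
    seed_vecs = vectors[:len(seed_docs)]
    centroid = Counter()
    for vec in seed_vecs:
        for t, w in vec.items():
            centroid[t] += w / len(seed_vecs)
    return [
        doc
        for doc, vec in zip(candidates, vectors[len(seed_docs):])
        if cosine(vec, centroid) >= threshold
    ]
```

Calling `select_domain_texts` with a few known in-domain texts as seeds and a mixed list as candidates returns the candidates that share enough weighted vocabulary with the seeds. Character bigrams are used here only because they avoid the need for a word segmenter; a real system would more likely tokenize with a Chinese word segmentation tool.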