Structural Design of a Chemical Dictionary and Development of a Chinese Word Segmentation System

Posted on: 2011-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: H S Qi
Full Text: PDF
GTID: 2178360305485332
Subject: Computer application technology
Abstract/Summary:
Chinese word segmentation is a fundamental task in Chinese information processing and the first step toward semantic understanding. Its accuracy directly affects the quality of the semantic analysis that follows. For search engines, where segmentation is a core technology, it directly affects the quality of search results.
Building on a study of current Chinese word segmentation techniques, this thesis designs and implements a Chinese word segmentation system aimed specifically at professional chemical vocabulary. The goal is to make segmentation usable in a search engine for the chemical domain, so that people working in the field can find information quickly and accurately.
The thesis describes the design and implementation of the system's interface and segmenter in detail, with emphasis on the segmenter, including the segmentation dictionary mechanism and the segmentation algorithm. The dictionary part covers the physical and logical structure of a dictionary built for string-matching segmentation. The thesis proposes a dictionary structure based on a TRIE index tree, combined with the morphological characteristics of chemical terms, in order to improve segmentation accuracy. The first character of each word is located in a hash table by hashing the character's internal code, and the remaining characters are found by following pointers. The segmentation algorithm relies on this index-tree structure: it looks up a string by matching its characters one at a time along the pointer chain. Because the dictionary can be built, and the document scanned, in either direction, the system can perform both forward ("positive") matching and reverse matching and then compare the differences between the two results.
Tests of segmentation speed and accuracy show that the system achieves its expected goals and meets the segmentation needs of a chemistry-focused search engine, allowing it to serve the chemical industry better.
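The dictionary and matching scheme summarized in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation. The names `TrieNode`, `TrieDictionary`, `forward_segment` and `reverse_segment` are hypothetical, the toy word lists are made up for illustration, a Python `dict` stands in for the first-character hash table, and longest-match is assumed as the matching rule. The handling of chemical-term morphology mentioned in the abstract is not modelled.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> child node (the "pointer chain")
        self.is_word = False  # True if the path from the root spells a dictionary word


class TrieDictionary:
    """Toy dictionary: first character via hash table, remaining characters via trie."""

    def __init__(self, words=(), reverse=False):
        self.reverse = reverse      # store words reversed, for reverse matching
        self.first_char_table = {}  # assumption: a dict plays the role of the hash table
        for word in words:
            self.add(word)

    def add(self, word):
        if not word:
            return
        key = word[::-1] if self.reverse else word
        node = self.first_char_table.setdefault(key[0], TrieNode())
        for ch in key[1:]:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def longest_match(self, text, start=0):
        """Length of the longest stored entry that begins at text[start], or 0."""
        if start >= len(text):
            return 0
        node = self.first_char_table.get(text[start])
        best, i = 0, start
        while node is not None:
            i += 1
            if node.is_word:
                best = i - start
            if i >= len(text):
                break
            node = node.children.get(text[i])
        return best


def forward_segment(text, dictionary):
    """Scan left to right, taking the longest dictionary match at each position."""
    tokens, i = [], 0
    while i < len(text):
        n = dictionary.longest_match(text, i) or 1  # unknown character: emit it alone
        tokens.append(text[i:i + n])
        i += n
    return tokens


def reverse_segment(text, reversed_dictionary):
    """Scan right to left by segmenting the reversed text with a reversed dictionary."""
    reversed_tokens = forward_segment(text[::-1], reversed_dictionary)
    return [token[::-1] for token in reversed_tokens][::-1]


# Hypothetical chemical entries: both directions agree.
chem_words = ["氯化", "氯化钠", "溶液"]
print(forward_segment("氯化钠溶液", TrieDictionary(chem_words)))
# ['氯化钠', '溶液']
print(reverse_segment("氯化钠溶液", TrieDictionary(chem_words, reverse=True)))
# ['氯化钠', '溶液']

# A common general-language ambiguity: the two directions disagree.
words = ["研究", "研究生", "生命", "起源"]
print(forward_segment("研究生命起源", TrieDictionary(words)))
# ['研究生', '命', '起源']
print(reverse_segment("研究生命起源", TrieDictionary(words, reverse=True)))
# ['研究', '生命', '起源']
```

Comparing the two outputs is one way to flag ambiguous spans: where forward and reverse results differ, the text contains an overlap that pure string matching cannot resolve on its own.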
Keywords/Search Tags:chemical terms, Chinese word segmentation, TRIE index tree, positive match, reverse match