| With the rapid spread of computer networks, society has entered the information age, and the importance of information keeps growing. Individuals, businesses, and governments alike need to acquire and master large amounts of useful information. Against this background, Chinese information processing has become an active area of technology development, and Chinese word segmentation is one of its most important techniques. Chinese word segmentation is the process of applying a segmentation algorithm to split text into words so that a computer can process and understand it more easily. It is widely applied, mainly in information retrieval, information extraction, machine translation, and other natural language processing tasks. It covers several aspects, including the segmentation algorithm, unknown word recognition, and ambiguity processing; the latter two are its main difficulties. This paper focuses on the segmentation algorithm and on ambiguity processing. First, the paper uses a typical dictionary-based segmentation algorithm, the Largest Forward Matching Algorithm. Its idea is simple and it is easy to implement, but its segmentation accuracy and speed are not ideal. 
To address this, the paper adopts a double-hash dictionary structure to improve segmentation speed and proposes an improved Largest Forward Matching Algorithm to improve segmentation accuracy. Second, ambiguity processing is an important component of Chinese word segmentation: text can be segmented correctly only if its ambiguous fields are fully resolved. Therefore, a disambiguation algorithm that combines probability and rules is applied after the improved Largest Forward Matching Algorithm to obtain better segmentation results. Finally, the paper considers system performance metrics such as precision, speed, and feasibility of implementation. A design for the Chinese Word Segmentation System is then given and the system is implemented. The experimental results show that the system segments text effectively to a certain degree. |
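
As a rough illustration of the dictionary-based approach described in the abstract, the following is a minimal sketch of plain forward maximum matching over a two-level hash index. It is not the paper's improved algorithm, and the abstract does not specify the layout of the double-hash dictionary. The layout used here (first character, then word length, then a set of words), the function names `build_dictionary` and `forward_max_match`, and the toy word list are all assumptions made for the example.

```python
from collections import defaultdict


def build_dictionary(words):
    """Build an assumed two-level hash index:
    first character -> word length -> set of words."""
    index = defaultdict(lambda: defaultdict(set))
    for word in words:
        if word:
            index[word[0]][len(word)].add(word)
    return index


def forward_max_match(text, index):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        # First hash level: look up the bucket for the current character.
        by_length = index.get(text[pos], {})
        match = text[pos]  # fall back to a single character if nothing matches
        # Second hash level: try candidate lengths from longest to shortest.
        for length in sorted(by_length, reverse=True):
            candidate = text[pos:pos + length]
            if candidate in by_length[length]:
                match = candidate
                break
        tokens.append(match)
        pos += len(match)
    return tokens


if __name__ == "__main__":
    # Hypothetical toy dictionary, for illustration only.
    toy_words = ["研究", "研究生", "生命", "起源"]
    index = build_dictionary(toy_words)
    print(forward_max_match("研究生命起源", index))
    # prints ['研究生', '命', '起源']
```

With this toy dictionary the greedy match takes 研究生 first and leaves 命 stranded, whereas the intended reading is 研究 / 生命 / 起源. This is the kind of ambiguity that a disambiguation step combining probability and rules is meant to resolve.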