The Search Engine Based On Chinese Natural Language Processing

Posted on: 2012-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: H T Liu
Full Text: PDF
GTID: 2178330332494855
Subject: Computer technology
Abstract/Summary:
Information generated and stored on the network has greatly enriched people's lives and brought great convenience. At the same time, finding useful information on the Internet has become an important topic of applied research. A search engine is a "compass" that helps people tackle this problem: it extracts information from the network, then organizes and processes it. In this thesis, we develop a search engine based on Chinese natural language understanding and apply it to a higher art education website.

There is a growing hope that people can operate a search engine using natural language. The main aim of this thesis is to make the search engine understand natural language input, extract its key content, and use the extracted content to retrieve information.

This thesis uses Lucene as the research platform for developing a search engine based on Chinese natural language processing. Chinese text differs greatly from English text in that it does not put spaces between words, so we use a dictionary-based method to cut Chinese text into tokens. The dictionary is structured as a hash table indexed by the first Chinese character of each token. When ranking query results, the user's input text and the existing texts are mapped into an n-dimensional vector space, and a measure of similarity between two vectors is then defined. The website framework and Lucene are both based on Java, but some useful functions of the website are implemented in DLL files, so we also studied how to import DLL files into the Java platform.

Finally, the performance of the search engine is validated by experiments. The experimental results show that the design concept and its application are feasible and practical. The search engine effectively improves the quality of the retrieved information and significantly enhances its interactivity.
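
The segmentation and ranking steps described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the thesis's Java/Lucene code. The abstract does not name the matching strategy, the term weights, or the similarity measure, so forward maximum matching, raw term counts, and cosine similarity are assumptions here, as are all function names and the toy dictionary.

```python
import math
from collections import Counter, defaultdict


def build_dictionary(words):
    """Hash table keyed by the first character of each dictionary word."""
    table = defaultdict(set)
    for word in words:
        table[word[0]].add(word)
    return table


def segment(text, table):
    """Forward maximum matching (assumed strategy): at each position,
    take the longest dictionary word that starts there."""
    tokens = []
    i = 0
    while i < len(text):
        candidates = table.get(text[i], ())
        # Fall back to the single character when no dictionary word matches.
        best = max(
            (w for w in candidates if text.startswith(w, i)),
            key=len,
            default=text[i],
        )
        tokens.append(best)
        i += len(best)
    return tokens


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-count vectors (assumed measure)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, documents, table):
    """Return the documents sorted by decreasing similarity to the query."""
    query_tokens = segment(query, table)
    scored = [(cosine_similarity(query_tokens, segment(d, table)), d) for d in documents]
    return [d for _, d in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

Keying the table by first character means that each lookup only inspects the dictionary words that could start at the current position, rather than the whole dictionary. With a toy dictionary containing 自然语言 and 搜索引擎, `segment("自然语言搜索引擎", table)` splits the text into those two tokens, and `rank` places documents sharing more tokens with the query first.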
Keywords/Search Tags: Search Engine, Natural Language Processing, Lucene, Chinese Search Engine, Website Building, Interaction