| Web content is growing exponentially with the rapid development of the Internet. While people enjoy the convenience of the Internet, they face the problem of how to obtain useful information from this vast content quickly and accurately. Web search engines address this problem, and the active research on them at various research institutes reflects how critical it is. A web search engine is a special kind of website for Internet information retrieval. It collects web pages through robots called crawlers, analyzes the original pages, and stores the resulting information in databases. When a user enters keywords, the search engine searches the indexes in its database and returns the relevant web pages. This work presents Hicode, a reusable, extensible search engine system built on the open-source package Lucene. Hicode searches source code files on the Web and on the local system, and it can easily locate a segment of source code and the position of the file it came from. The Lucene package and the open-source tools used by Hicode are introduced first. Then three basic components (Crawler, Indexer, and Searcher) are implemented in Java. The Crawler is multithreaded: it uses a thread pool to manage its crawl threads, so web pages are crawled in parallel. The Indexer and Searcher are built on the Lucene framework, and a Chinese word segmenter that is more effective than Lucene's original segmenter was also implemented. Hicode uses serialization to improve indexing efficiency, and the JavaCC tool to speed up development. Finally, the integration of Hicode with the Liferay portal is described. |
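
The thread-pool crawling described in the abstract can be illustrated with a minimal Java sketch. This is not Hicode's actual Crawler: the class name `CrawlerSketch`, the `crawl` method, and the `fetchLinks` function (a stand-in for "download a page and extract its links") are all hypothetical names introduced here for illustration. The sketch only shows how a fixed thread pool might process pages in parallel while a shared visited set prevents a URL from being crawled twice.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a thread-pool crawler; not the Hicode implementation.
final class CrawlerSketch {
    private final ExecutorService pool;
    private final Function<String, List<String>> fetchLinks; // assumed page fetcher
    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    private final AtomicInteger pending = new AtomicInteger();   // tasks not yet finished
    private final CountDownLatch done = new CountDownLatch(1);   // released when pending hits 0

    private CrawlerSketch(int threads, Function<String, List<String>> fetchLinks) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.fetchLinks = fetchLinks;
    }

    /** Crawls everything reachable from the seed and returns the set of visited URLs. */
    static Set<String> crawl(String seed, Function<String, List<String>> fetchLinks, int threads) {
        CrawlerSketch crawler = new CrawlerSketch(threads, fetchLinks);
        try {
            crawler.submit(seed);
            crawler.done.await(); // block until every submitted task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("crawl interrupted", e);
        } finally {
            crawler.pool.shutdown();
        }
        return crawler.visited;
    }

    private void submit(String url) {
        if (!visited.add(url)) {
            return; // another thread already claimed this URL
        }
        pending.incrementAndGet(); // count the task before handing it to the pool
        pool.execute(() -> {
            try {
                for (String link : fetchLinks.apply(url)) {
                    submit(link); // children are counted before this task is uncounted
                }
            } finally {
                if (pending.decrementAndGet() == 0) {
                    done.countDown(); // no work left anywhere: wake the caller
                }
            }
        });
    }
}
```

Passing the fetcher in as a function keeps the sketch free of network code, so it can be exercised against an in-memory link graph; a real crawler would also need politeness delays, URL normalization, and error handling, none of which are shown here.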