The Optimization and Implementation of an Enterprise Search Engine

Posted on: 2011-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: C B Zhang
Full Text: PDF
GTID: 2178360308464569
Subject: Software engineering
Abstract/Summary:
Organizations today have many information sources on their intranets, and the ability to retrieve the information we need quickly and accurately is very important. The volume of information on the Internet and the number of documents that hold an organization's knowledge are both growing explosively. Under these circumstances, enabling an organization to search its information faster and more accurately has practical significance. Research on and implementation of enterprise search engine optimization and information extraction techniques allow organizations with limited resources to collect more information, so that they can access internal and external data efficiently and organize it effectively.

This thesis focuses on the design and implementation of a campus search engine, adding features that give users a better experience and improve access efficiency and retrieval performance. The major contributions of the thesis are:

1. In the page crawling stage, a fault-tolerance mechanism is added to the crawling steps. When an unexpected error occurs during crawling, the system can roll back between steps and update automatically.
2. Asynchronous message passing separates the back-end search logic from the front-end display logic. This avoids making users wait for slow clustering before any results are returned, so results are presented to the user quickly. Retrieval performance for Chinese is also improved by changing the analyzer logic, which further enhances the user experience.
3. In the page analysis phase, Photo-Summary and page publish time chart functions are added. Using page segmentation and page classification rules, relevant images are extracted from a page to serve as its description, and page publish times are extracted and combined into a publish time chart, allowing users to locate the resources they need more quickly and accurately.

The system is demonstrated, and the information extraction module is tested experimentally in more detail, with an analysis of the results.
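
The step-level rollback described for the crawling stage can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not detail. It assumes each crawling step is a function that takes and returns a state dictionary. The names run_with_rollback, steps, and max_retries are hypothetical.

```python
import copy
from typing import Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function maps a state dict to a new state dict.
Step = Tuple[str, Callable[[Dict], Dict]]


def run_with_rollback(steps: List[Step], state: Dict, max_retries: int = 2):
    """Run steps in order. Before each step, snapshot the state; if the step
    raises, discard its partial changes (roll back to the snapshot) and retry.
    Hypothetical helper for illustration only."""
    log: List[str] = []
    for name, step in steps:
        snapshot = copy.deepcopy(state)  # checkpoint taken between steps
        for _attempt in range(max_retries + 1):
            try:
                # The step works on a copy, so a failure cannot corrupt the snapshot.
                state = step(copy.deepcopy(snapshot))
                log.append(f"{name}:ok")
                break
            except Exception:
                state = snapshot  # roll back to the last good checkpoint
                log.append(f"{name}:rollback")
        else:
            # Every attempt failed; stop instead of continuing with bad state.
            raise RuntimeError(f"step {name!r} failed after {max_retries + 1} attempts")
    return state, log
```

Taking the snapshot between steps means a failed step only costs a retry of that step, not a restart of the whole crawl. A real crawler would likely persist the checkpoint to disk rather than keep it in memory.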
Keywords/Search Tags: Text Clustering, Photo-Summary, Page Publish Time, Page Segmentation