
Research And Realization Of Key Technology Of Enterprise Intranet Search Engine

Posted on: 2015-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yuan
Full Text: PDF
GTID: 2308330473451696
Subject: Computer software and theory
Abstract/Summary:
With the continuous development of corporate business and information infrastructure, enterprise intranets are expanding rapidly, and the data resources stored in them are growing explosively. To formulate sound development plans based on the important data held in the intranet, enterprises urgently need a way to extract valuable data from these huge information stores.
Although general-purpose search engines return a large number of results and meet the needs of most ordinary users, those results neither satisfy the special demands of enterprises nor provide sufficient guidance. They also suffer from problems such as low web coverage and outdated information. In contrast, an intranet search engine can crawl only business-related data resources and rank the search results effectively, because its principal algorithms can be modified to suit the needs of the enterprise. This makes the search results more targeted and more business-oriented, so an intranet search engine is an effective solution to the problems mentioned above.
This thesis focuses on the major technologies and algorithms used by search engines and improves them according to the special needs of enterprises in order to implement an intranet search engine.
We also study a new search algorithm for complex networks and apply it to the local file retrieval system of a large server, extending the functions of the intranet search engine.
The thesis presents three new algorithms: the Link Filtering Algorithm Based on Domain Names, the Multi-factors Scoring Algorithm Based on VSM, and the Path Compression Search Algorithm. The Link Filtering Algorithm Based on Domain Names analyzes links to keep the crawler from fetching useless data, which improves crawler performance and the accuracy of the search results. The Multi-factors Scoring Algorithm Based on VSM improves on existing relevance scoring algorithms by considering several factors that influence the relevance score, so that data more valuable to the enterprise ranks higher. The Path Compression Search Algorithm is a new search algorithm for complex networks that needs fewer search steps and less query information to traverse the whole network, and therefore searches more efficiently. We apply this algorithm to the local file retrieval system of a large server to improve its search efficiency. Finally, the algorithms are summarized and their deficiencies are discussed.
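The abstract does not give the details of the Link Filtering Algorithm Based on Domain Names, so the following is only a minimal Python sketch of the general idea: a crawler keeps a link only if its host belongs to an allow-list of business-related domains. The domain names, the `is_allowed` and `filter_links` helpers, and the deduplication step are all assumptions made for illustration and are not taken from the thesis.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical allow-list; the thesis does not name any actual domains.
ALLOWED_DOMAINS = {"intranet.example.com", "docs.example.com"}


def is_allowed(url, allowed=ALLOWED_DOMAINS):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


def filter_links(base_url, hrefs, allowed=ALLOWED_DOMAINS):
    """Resolve raw hrefs against base_url and keep only unique, allowed HTTP(S) links."""
    seen = set()
    kept = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        if absolute in seen or not is_allowed(absolute, allowed):
            continue  # skip duplicates and links outside the allow-list
        seen.add(absolute)
        kept.append(absolute)
    return kept


links = filter_links(
    "http://intranet.example.com/a/",
    ["page.html", "http://ads.example.org/x", "mailto:a@b.c",
     "http://wiki.docs.example.com/y", "page.html"],
)
```

In this sketch the filter runs before a link enters the crawl queue, so pages outside the allow-list are never fetched; a real crawler would likely combine such a check with further rules that the abstract does not describe.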
Keywords/Search Tags: intranet, search engine, link filtering, score ranking, complex network