The Establishment Of Intelligent Search Engine Platform In Enterprise

Posted on: 2015-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: M P Wang
Full Text: PDF
GTID: 2268330428464984
Subject: Computer application technology
Abstract/Summary:
General-purpose search engines, by their nature, cannot meet the needs of specific domains and user groups for accurate information retrieval. With the rapid development of enterprise informatization, building a tailor-made, enterprise-level search engine has become a real demand for companies and is becoming a research hotspot of the Information Age. Compared with a general search engine, an enterprise search engine offers richer collected content, a higher level of security, and higher recall and precision. The data it covers includes not only Internet web pages but also the enterprise's internal databases and the business data of industry application systems. Because this business data is not visible to general search engines, an enterprise search engine faces a higher threshold for data acquisition; and because its users tend to be looking for information within their own industry, it also faces a higher demand for accuracy.

This thesis, on the topic "Establishment of Intelligent Search Engine Platform in Enterprise", presents a search engine architecture for enterprise groups. Based on the enterprise search requirements of the Zhejiang tobacco industry company, it provides an enterprise search engine system framework and designs a system that implements a vertical search engine for the tobacco industry. The specific contents are as follows:

1) The four subsystems of a search engine: the download system, analysis system, index system, and query system. Taking into account the particularities of the tobacco industry and drawing on tobacco industry knowledge, the thesis studies the principles of Internet search engines in depth, then determines the tobacco industry's requirements for a search engine and the functions an enterprise vertical search engine should implement.

2) The focused web crawler architecture of the enterprise search engine system. The thesis analyzes the algorithms relevant to industry-specific web crawling and provides web page revisiting strategies for the tobacco industry. For storing website information, it introduces a cloud storage solution based on MongoDB to build the web page library, and it gives an extraction scheme for the enterprise's internal data.

3) The data processing and analysis system of the enterprise search engine. The thesis proposes a web page checking model for the tobacco industry and, based on the study and design of the data processing module, proposes building an enterprise search ontology library.

4) The query system of the enterprise search engine. Combining full-text search algorithms with the PageRank algorithm, the thesis proposes a "tobacco theme" improvement to the algorithm. Through analysis of the query log and the study and application of user query intent inference, it provides a theoretical basis for the enterprise search engine.

5) Based on the theory above and on the enterprise search engine research of the Zhejiang Sci-Tech University Business Intelligence Laboratory, the thesis designs the enterprise search engine system and proposes its system architecture. The system will be applied to the development of the Zhejiang tobacco industry company's enterprise search engine, in which the author participates. The thesis presents the search engine system diagram of Zhejiang Tobacco and tests the proposed system solution against the results achieved by Zhejiang Tobacco's search engine.
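The abstract does not spell out the "tobacco theme" improvement to PageRank. As a rough illustration of what a theme-aware ranking can look like, the following is a minimal sketch of topic-biased PageRank, a swapped-in, well-known variant rather than the author's algorithm. The function name `topic_pagerank`, the toy link graph, and the page labels are all hypothetical.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """Topic-biased PageRank sketch (an assumption, not the thesis's method).

    links: dict mapping each page to the list of pages it links to.
    topic_pages: set of pages judged on-topic; random jumps land only
    on these pages. If the set is empty, jumps are uniform over all
    pages, which gives ordinary PageRank.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    seeds = {p for p in pages if p in topic_pages} or set(pages)
    # Teleport vector: uniform over the seed pages, zero elsewhere.
    teleport = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    rank = {p: 1.0 / len(pages) for p in pages}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: hand its rank back via the teleport vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Hypothetical toy graph: two on-topic pages and two general pages.
example_links = {
    "news": ["tobacco_a", "sports"],
    "tobacco_a": ["tobacco_b"],
    "tobacco_b": ["tobacco_a"],
    "sports": ["news"],
}
example_topic = {"tobacco_a", "tobacco_b"}
biased = topic_pagerank(example_links, example_topic)
uniform = topic_pagerank(example_links, set())
```

In this sketch the only change from ordinary PageRank is the teleport vector, so off-topic pages earn rank only through links from pages that already hold rank. How on-topic pages would be identified in practice (for example, from the ontology library mentioned in item 3) is left open here.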
Keywords/Search Tags: Search Engine, Enterprise, Web crawler, Web library, PageRank, Word Analysis, Index, Query log