
Research and Application of a Vertical Search Engine Based on Enterprise Deep Mining

Posted on: 2016-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Zhou
Full Text: PDF
GTID: 2308330467473362
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of high technology and the spread of Internet-based information processing, more and more enterprises rely on Internet technology for data sharing and information processing. As a result, the volume of information is growing at an alarming rate, while internal business systems repeatedly generate large amounts of multi-structured data. As enterprises expand and data accumulates, traditional search engines show defects in how they process information: users can retrieve only open web information, recall and precision are too low, and internal web pages cannot be retrieved at all, so the retrieval needs of enterprise users are not met. For the operations analysis layer, how to deliver the right information at the right time to the decision-making layer, and for the decision-making layer, how to obtain decision-making information and other necessary information promptly and accurately, has become an urgent and complicated problem.

Driven by these requirements, a data-center vertical search engine based on enterprise deep mining has emerged. It builds an "integrated marketing platform" using metadata management technology and, building on business intelligence and on the management and rational use of information, brings a new solution to data services. This paper therefore takes as its topic the research and application of a vertical search engine based on enterprise deep mining in data services. It presents a search engine system architecture for enterprises, applies it to the Zhejiang tobacco industry enterprise search engine system, and proposes a framework for that system. The main research contents are as follows:

1) Summarizes enterprise users' demand for a search engine system, based on the marketing status and problems of tobacco enterprises. It determines that a metadata management system is used to build a search engine that crawls data, compresses large amounts of it, stores and computes on the compressed data together with frequently clicked pages that do not compress easily, and finally searches according to customer demand.

2) Proposes a framework for an enterprise vertical search engine system. The system is designed as four modules: the focused web crawler; the metadata management platform; data compression; and cloud storage and computation with data query. The emphasis is on the design of the data processing stages: metadata management, data compression, cloud storage and computation, and data query.

3) Proposes a focused web crawler architecture. The paper presents two models of focused web crawler, analyzes the crawler algorithms used in industry, extracts tobacco industry data, and proposes a web page search strategy suited to the characteristics of the crawled tobacco industry data.

4) The metadata management module. Metadata management is the module that monitors and maintains the whole system: through metadata integration, it monitors and manages the data concentration layer, the data warehouse layer, and the data display layer. It provides an integrated graphical environment with a single point of control. A metadata model is created to represent enterprise information and the relationships within it. Management tools and environments are integrated, including data acquisition, ETL, and OLAP data loading. To make the data warehouse more convenient to develop and use, and to integrate data better, a metadata management platform is proposed that lets the data play its role more fully and improves data quality.

5) Data compression. Data compression saves storage space and reduces the transmission time of text over the communication link, thereby reducing transmission cost. The analysis of compression technology is key here: if compression is not applied properly, the results can differ completely. The compressed data and frequently clicked pages are stored, and cloud storage and cloud computation based on HBase are introduced. At the same time, to solve the problem of organizing information and to make queries easier to process and locate, the relevant parts of the collected data are indexed; the index is the key.

6) Building on the theory above and on practice in the laboratory and at a search engine company, the paper proposes a vertical search engine for data service systems based on enterprise deep mining. It takes the Zhejiang ZhongYan marketing search engine system as an example to verify the proposed scheme.
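
As a rough illustration of how compression and indexing fit together in points 5) and 6), the following minimal Python sketch compresses page text before storage and builds an inverted index for queries. It is not the thesis's implementation: the function names, the sample pages, the use of zlib, and the in-memory dictionaries standing in for HBase-backed storage are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def compress_page(text):
    """Compress page text for storage (zlib is an illustrative choice)."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_page(blob):
    """Restore the original page text from its compressed form."""
    return zlib.decompress(blob).decode("utf-8")


def build_index(pages):
    """Build an inverted index mapping each term to the ids of pages containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def search(index, query):
    """Return ids of pages that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample pages; a plain dict stands in for the storage layer.
pages = {
    "p1": "tobacco sales report for Zhejiang",
    "p2": "marketing plan and sales targets",
}
store = {page_id: compress_page(text) for page_id, text in pages.items()}
index = build_index(pages)

for page_id in sorted(search(index, "sales")):
    print(page_id, decompress_page(store[page_id]))
```

The index is built from the uncompressed text so that queries never need to decompress every page; only the pages that match a query are decompressed.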
Keywords/Search Tags: search engine, deep mining, metadata management, compression, cloud storage