Key Technology Research And Implementation Of Vertical Search Engine

Posted on: 2015-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: F Y Yuan
Full Text: PDF
GTID: 2308330473950469
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, users’ demand for information search keeps rising, especially for vertical search engines. The core technologies in this field include multi-layer sorting of search results, intelligent search, keyword association, automatic information extraction, and keyword highlighting. To address these needs, this thesis takes a music search application as its main research object. Building on an in-depth analysis of current open source search engines and on the personalized demands placed on vertical search engines, it carries out detailed research and development on vertical search engine algorithms. The main contents are as follows.

A universal vertical search engine framework. Based on a study of the open source search engine Lucene, the core algorithms are redesigned and optimized to produce a general-purpose vertical search engine framework.

Multi-layer sorting of search results. A vertical search engine places higher demands on result ordering than an ordinary general-purpose search engine: results must be more accurate and must be ranked against several layers of sort criteria. A multi-layer sorting algorithm is developed in this thesis to overcome the single-layer sorting of current search engines.

Insufficient search depth. Current search engines generally use keyword matching to retrieve text that contains the search keywords, which amounts to simple character matching. Based on a two-dimensional scoring algorithm, intelligent attributes are established in this thesis to solve the deep search problem.

Insufficient intelligent data processing. At present, information is generally extracted from web pages by writing regular expressions or configuring page templates. Since a search engine has to process vast amounts of data from the whole network in real time, it is not practical to rely on people to write so many regular expressions and templates. Based on the rough set algorithm, a multi-dimensional constraint data extraction method is established in this thesis to achieve intelligent content extraction from news web pages.

Keyword association. Search engines usually provide a keyword association function so that users can enter search keywords more conveniently. This thesis designs and implements a method for generating and updating search engine keyword prompts, which offers high association efficiency and supports multiple strategies.

Keyword highlighting. This thesis designs and implements a method for displaying key information, aimed especially at highlighting keywords in very long text, and implements fast keyword highlighting based on hash lookup.
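The abstract does not describe how the hash-based highlighting works internally, so the following is only a minimal sketch of the general idea under stated assumptions: query keywords are stored in a hash set, the text is scanned once, and each word is checked against the set in average constant time. The function name `highlight_keywords`, the `<em>` tags, and the word-level tokenization are illustrative assumptions, not the thesis's method.

```python
import re


def highlight_keywords(text, keywords, open_tag="<em>", close_tag="</em>"):
    """Wrap every word of `text` that appears in `keywords` with the given tags.

    Hypothetical helper: the name, the tags and word-level matching are
    assumptions made for illustration only.
    """
    # A hash set gives average O(1) membership checks per word, so the cost
    # grows with the text length rather than with text length x keyword count.
    keyword_set = {k.lower() for k in keywords}

    def mark(match):
        word = match.group(0)
        if word.lower() in keyword_set:
            return f"{open_tag}{word}{close_tag}"
        return word

    # \w+ matches runs of word characters; punctuation and spaces are untouched.
    return re.sub(r"\w+", mark, text)


if __name__ == "__main__":
    print(highlight_keywords("Search engine for music search", ["search"]))
```

Because the text is scanned in a single pass, the same approach applies to very long documents. Languages without whitespace word boundaries, such as Chinese, would need a segmentation step before the lookup, which this sketch leaves out.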
Keywords/Search Tags: search engine, multi-dimensional sorting, intelligent attributes, keywords association, keywords highlight