
Research and Implementation of a Distributed Search Engine

Posted on: 2015-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: T A Zhou
Full Text: PDF
GTID: 2298330422477139
Subject: Software engineering
Abstract/Summary:
With advances in information technology, the internet has developed rapidly and become part of everyday study and life. It holds a vast amount of information and is an important source from which people obtain it. How to quickly retrieve the desired information from the internet has become an active research topic, and search engine technology arose in this context. This thesis designs and implements a distributed full-text search engine system using object-oriented programming techniques. The system is multi-threaded, multi-language, and adaptive, among other features.

The thesis first analyzes the current state of search engine research, describes the workflow and architecture of search engines, and analyzes and summarizes the mainstream search engine ranking algorithms. Drawing on the idea behind the PageRank algorithm, it analyzes the core web page properties that affect page ranking. It then implements a quality grade evaluation algorithm through qualitative analysis and quantitative calculation of those core properties, and a text similarity ranking algorithm that weights words by frequency and position. The two algorithms are combined to produce comprehensive keyword ranking and the ordering of information retrieval results.

Next, the thesis implements the collection system, the information analysis system, and the information retrieval system using object-oriented technology. It uses a factory-style production management model and a four-cache structure, together with multi-threading and a clustering analysis algorithm, to collect, extract, and store information. Keywords are extracted from text with a thesaurus-based keyword extraction algorithm built on a word database and with an extraction algorithm based on special characters, and a keyword inverted index is then built. Web information is stored using a split-table storage structure in the database, and a table-based search engine library mechanism is used to store large volumes of data. The information retrieval system is built on the MVC architecture and the retrieval ranking algorithm.

Finally, testing confirms that the search engine achieves the desired goals.
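
The abstract does not give formulas, so the following is only a minimal sketch of how a keyword inverted index and a ranking that combines a page quality grade with a frequency-and-position text score might fit together. Every name here (`build_inverted_index`, `text_score`, `search`), the whitespace tokenization, the specific scoring formula, and the mixing weight `alpha` are assumptions made for illustration, not the thesis's actual method.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each keyword to {doc_id: [positions where it occurs]}.

    Assumption: words are separated by whitespace. The thesis instead
    describes thesaurus-based and special-character-based extraction.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def text_score(positions, doc_len):
    """Hypothetical text score: normalized frequency plus a position bonus.

    Occurrences nearer the start of the document contribute more.
    """
    freq = len(positions) / doc_len
    position_bonus = sum(1.0 - p / doc_len for p in positions) / doc_len
    return freq + position_bonus


def search(index, doc_lens, quality, query, alpha=0.5):
    """Rank documents by alpha * quality grade + (1 - alpha) * text score.

    `quality` stands in for the page quality grade; here it is simply a
    precomputed number per document.
    """
    scores = defaultdict(float)
    for word in query.lower().split():
        for doc_id, positions in index.get(word, {}).items():
            scores[doc_id] += text_score(positions, doc_lens[doc_id])
    ranked = {
        d: alpha * quality.get(d, 0.0) + (1 - alpha) * s
        for d, s in scores.items()
    }
    return sorted(ranked, key=ranked.get, reverse=True)
```

In this sketch, `alpha` controls how much the precomputed quality grade outweighs the query-dependent text score; setting it to 0 ranks purely on word frequency and position.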
Keywords/Search Tags:Search engine, Page rank, Web crawler, Segmentation