
The Study And Prototype System Design Of Search Engine Based On Event Driven Model

Posted on: 2011-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: X B Bai
GTID: 2218330371463336
Subject: Software engineering
Abstract/Summary:
Since the birth of the Internet, the volume of online information has grown rapidly. Search technology and service concepts have developed considerably, from traditional general-purpose search engines to search engines offering personalized services, yet some areas still leave substantial room for improvement, particularly the accuracy of query results, personalized service without stored user logs, and the handling of topics of interest that a user visits at random. This thesis addresses four problems: first, personalized recommendation without user logs; second, topics of interest that the user accesses at random; third, the user's degree of interest in a document; and fourth, the accuracy of the results a search engine returns.

First, the thesis introduces traditional and personalized search engines, covering how search engines work, their architecture and key technologies, and the defects inherent in existing personalized filtering techniques. Second, it studies search engine theory and related technologies: how Web crawlers work; Chinese word segmentation, including mechanical matching segmentation, maximum probability segmentation, and Lucene's Chinese segmentation; the principle of building an inverted index; retrieval models (the Boolean logic model, the vector space model, and probabilistic models); and page ranking techniques (PageRank and the hub and authority page algorithm). These provide the theoretical basis for the design and implementation of the prototype system.

Third, it describes in detail the main algorithms of the search engine based on the event-driven model. It proposes an algorithm that estimates the user's degree of interest in an opened page by analyzing how long the user stays on it. For a topic the user visits at random, this page-level algorithm is then used to compute the user's degree of interest in the current topic and to decide whether to asynchronously update information related to that topic, thereby achieving personalized recommendation without storing user log records. In addition, the keyword weighting method is improved by taking into account HTML document structure and the writing conventions of Chinese articles, so that the weights express the thematic features of a document more accurately, which can improve retrieval precision.

Finally, the thesis designs a prototype search engine system based on the event-driven model and implements it with Java and Lucene 2.3. The system comprises an indexer, a searcher, a filter, a theme analyzer, and a timer. Network resources are collected with an adapted Heritrix web crawler, and the implemented prototype demonstrates the feasibility of the system, including random access to topics of interest and analysis of the user's degree of interest in opened documents. The improved keyword weighting algorithm is also verified, and the results show that it helps improve retrieval precision.
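The abstract does not give the formula that turns page residence time into a degree of interest, so the following Java sketch is only an illustration of the general idea under stated assumptions. The class name `TopicInterestTracker`, the linear scoring capped at 60 seconds, and the update threshold of 1.0 are all hypothetical and are not taken from the thesis. State is kept in memory only, in the spirit of recommending without stored user log records.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: accumulates per-topic interest from page dwell times. */
class TopicInterestTracker {
    // Assumed tuning values; the abstract does not specify any constants.
    private static final double SATURATION_SECONDS = 60.0;
    private static final double UPDATE_THRESHOLD = 1.0;

    // In-memory scores for the current session only; nothing is persisted.
    private final Map<String, Double> interestByTopic = new HashMap<>();

    /** Maps dwell time to a page-level score in [0, 1], linear up to the cap. */
    static double pageInterest(double dwellSeconds) {
        if (dwellSeconds <= 0) {
            return 0.0;
        }
        return Math.min(dwellSeconds / SATURATION_SECONDS, 1.0);
    }

    /**
     * Handles a "page closed" event for a page belonging to the given topic.
     * Returns true when the accumulated topic interest reaches the threshold,
     * which a caller could treat as the signal to start an asynchronous update.
     */
    boolean onPageClosed(String topic, double dwellSeconds) {
        double total = interestByTopic.merge(topic, pageInterest(dwellSeconds), Double::sum);
        return total >= UPDATE_THRESHOLD;
    }

    /** Returns the accumulated interest for a topic, or 0.0 if never seen. */
    double interestIn(String topic) {
        return interestByTopic.getOrDefault(topic, 0.0);
    }

    public static void main(String[] args) {
        TopicInterestTracker tracker = new TopicInterestTracker();
        System.out.println(tracker.onPageClosed("lucene", 30)); // accumulated 0.5
        System.out.println(tracker.onPageClosed("lucene", 45)); // accumulated 1.25
    }
}
```

In this sketch the decision is made per event rather than by scanning a stored history, which is one plausible reading of the event-driven approach; the actual thesis may weight dwell time differently or use the timer component to decay interest.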
Keywords/Search Tags:search engine, event driven model, theme analysis, calculation of similarity