The Key Technologies Of Search Engines And Implementation

Posted on:2009-11-17

Degree:Master

Type:Thesis

Country:China

Candidate:Z G Wang

Full Text:PDF

GTID:2208360272459190

Subject:Computer software and theory

Abstract/Summary:

With the development of the computer industry, an increasing amount of information is stored on computer storage devices. Basically, these data can be classified into two main categories: structured data and unstructured data. Examples of structured data are enterprise financial accounts, production data, student score data and so on. Unstructured data includes text, images, sound, etc. According to statistical analysis, unstructured data accounts for more than 80% of all the information in the world. For structured data, an RDBMS is the best way to maintain it. However, an RDBMS has some inherent defects when it is used to manage large amounts of unstructured data; in particular, the response time is unbearable when users query such data. The cause of this flaw lies in the underlying structure of the RDBMS. Full-text retrieval technology lets us manage unstructured data efficiently. After years of development, full-text retrieval technology has evolved into powerful software that manages unstructured data in an integrated way, ranging from primitive strings to newer kinds of unstructured data such as huge texts, voice, images, and video. Essentially, search engine technology is a major application of full-text retrieval technology. Currently, search engines are the second most popular application on the Internet after e-mail. Search engines derive from traditional full-text retrieval theory: a purpose-built program scans each word in each page, builds index information for the words, and stores it in an inverted file. The search program then checks the inverted file to find the pages that match the keyword, ranks the matching pages according to the frequency and probability with which the keyword appears in each of them, and outputs the sorted results.
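The scan, index, and lookup steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`tokenize`, `build_inverted_index`, `search`), the sample pages, and the use of raw term frequency as the only ranking signal are all assumptions made for illustration. The probability component mentioned in the abstract is not modelled here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages):
    """Scan every word of every page and record word -> {page_id: count}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word][page_id] = index[word].get(page_id, 0) + 1
    return index


def search(index, keyword):
    """Look up the keyword and rank matching pages by term frequency.

    Ties are broken by page id so the ordering is deterministic.
    """
    postings = index.get(keyword.lower(), {})
    return sorted(postings.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample pages, used only to exercise the sketch.
pages = {
    "p1": "search engine and full-text retrieval",
    "p2": "search search search ranking",
    "p3": "relational database",
}
index = build_inverted_index(pages)
results = search(index, "search")
print(results)
```

Storing per-page counts in the posting list means a query touches only the pages that contain the keyword, rather than rescanning every page.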
Full-text retrieval technology is the core supporting technology of search engines. Based on an excellent full-text retrieval model, IRST (Inter-Relevant Successive Tree), this thesis studies the combination of the Inter-Relevant Successive Tree model with search engine technology, and the implementation of key search engine technologies. We focus mainly on three topics: match-degree computation, associated queries between the search engine and an RDBMS, and ranking. We propose two unified formulas for computing the match degree, which not only simplify the calculation but also cover all possible matching cases. By introducing the concepts and techniques of in-memory databases, we successfully implement associated queries between the search engine and an RDBMS, which lets users find what they actually need more efficiently, conveniently, and quickly. Finally, we propose and implement a dynamic partition and multi-value sorting algorithm, which improves sorting efficiency by eliminating unnecessary sorting operations: it extracts only the page data that is needed and ranks that data. Combining the Inter-Relevant Successive Tree model with search engine technology makes the model a new method and theory in the search field.
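The abstract does not spell out the dynamic partition and multi-value sorting algorithm, so the sketch below is not that algorithm. It only illustrates the general idea of ranking just the pages that are needed instead of fully sorting every match, using Python's standard `heapq.nlargest` with a multi-valued sort key. The field names `match` and `freq` and the sample records are hypothetical.

```python
import heapq


def top_k_pages(pages, k):
    """Return the k best pages without fully sorting the whole candidate list.

    Pages are compared on several values at once: match degree first,
    then keyword frequency as a tie-breaker (both fields are assumed).
    """
    return heapq.nlargest(k, pages, key=lambda p: (p["match"], p["freq"]))


# Hypothetical candidate pages returned by an earlier matching step.
pages = [
    {"id": "a", "match": 0.9, "freq": 2},
    {"id": "b", "match": 0.9, "freq": 5},
    {"id": "c", "match": 0.4, "freq": 9},
    {"id": "d", "match": 0.7, "freq": 1},
]
top = [p["id"] for p in top_k_pages(pages, 2)]
print(top)
```

When only the first page of results is displayed, selecting the top k candidates in this way avoids ordering pages the user will never see.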

Keywords/Search Tags:

Search Engine, Full-Text Retrieval, IRST (Inter-Relevant Successive Tree)

Related items

1	Irst Index Improve The Research And Application
2	The Research And Application Of Full-Text Search System Based On Lucene
3	Distributed Based On The Search Engine Irst Improvements
4	Improvement Research On Inter-Relevant Successive Trees Model
5	Research On Full-Text Index Model Based On Full-Text Database
6	A Mathematical Expression Retrieval Model Based On Inter-relevant Successive Tree
7	The Research Of Word Index Method Based On Inter-Relevant Successive Trees Model
8	Research And Implementation Of Full-text Retrieval Combining Word Matching And Context Interaction
9	Research And Implementation Of A Chinese Full-Text Information Retrieval Technology Based-on Lucene Search Engine
10	Research On Indexing Model Of Retrieval System And Retrieval-related Technologies