Digital Organisms Database Information Retrieval And Implementation

Posted on: 2009-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhu
GTID: 2208360245461378
Subject: Computer system architecture
Abstract/Summary:
Although database systems and information retrieval systems both focus on searching data, they approach the problem very differently. Database systems search structured data with complex query languages. Their results are sound and complete, and all results are equally good. Information retrieval systems search unstructured data by keyword. Their results are usually imprecise and incomplete, and some results are more relevant than others. Keyword search over relational databases (KSORD) lets users issue keyword queries without any knowledge of the database schema or of SQL. For this reason, the 8010 Lab has developed a keyword retrieval system over the digital organism database system (DOSSQL).
All database retrieval systems have two modules: a preprocessing module and a query module. The query module is in charge of query processing. First, it parses the user's input to find the query keywords and the query semantics. Then it executes the search algorithm to generate results. The search algorithm is the key component of the retrieval system, and its results must satisfy the requirements of integrity and non-redundancy.
After a thorough analysis of existing keyword retrieval systems, MySQL, and DOSSQL, this thesis designs and implements the search algorithm. It consists of four steps: searching the index, generating the tuple set graph, enumerating result trees, and executing the SQL queries. The tuple graph is generated from the database schema and the keyword index, and it reflects the correlation and the position of the keywords in the database. By adopting a double-layer structure, exploiting the properties of the database structure, and refining the index information, most of the useless data is eliminated. Traversing the tuple graph generates all result trees with less redundancy. Finally, an SQL statement is produced for each result tree, and these statements are passed to the DBMS. The DBMS returns the joined tuples, which are the answers to the query. The software implementation uses a modular, layered design, which makes the algorithm more flexible and extensible. The retrieval system communicates with the database server through ODBC, which keeps the system independent of any particular server.
Tests show how the different parameters influence the system. The results show that the system meets its functional and performance requirements.
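The four steps named in the abstract (index lookup, graph construction, result-tree enumeration, SQL execution) can be illustrated with a small sketch. This is not the thesis's algorithm: the schema, the table and function names, and the breadth-first join-path search are all assumptions made for illustration, the double-layer tuple graph is reduced to a plain schema graph, and Python's built-in `sqlite3` module stands in for the ODBC connection to DOSSQL.

```python
import sqlite3
from collections import deque
from itertools import product

# Hypothetical toy schema; none of these names come from the thesis.
FK_EDGES = [("writes", "author", "aid"), ("writes", "paper", "pid")]
TEXT_COLUMNS = {"author": "name", "paper": "title"}  # searchable column per table


def build_db():
    """Create an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author(aid INTEGER, name TEXT);
        CREATE TABLE paper(pid INTEGER, title TEXT);
        CREATE TABLE writes(aid INTEGER, pid INTEGER);
        INSERT INTO author VALUES (1, 'Zhu'), (2, 'Li');
        INSERT INTO paper VALUES (10, 'Keyword search'), (11, 'Query planning');
        INSERT INTO writes VALUES (1, 10), (2, 11);
    """)
    return conn


def build_keyword_index(conn):
    """Preprocessing step: map each lowercase word to the tables containing it."""
    index = {}
    for table, column in TEXT_COLUMNS.items():
        for (text,) in conn.execute(f"SELECT {column} FROM {table}"):
            for word in text.lower().split():
                index.setdefault(word, set()).add(table)
    return index


def join_path(start, goal):
    """Breadth-first search over the schema graph; returns FK edges or None."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for a, b, col in FK_EDGES:
            if table in (a, b):
                nxt = b if table == a else a
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [(a, b, col)]))
    return None


def candidate_queries(index, keywords):
    """Build one parameterized SQL statement per candidate result tree."""
    if not keywords:
        return []
    hits = [sorted(index.get(k.lower(), ())) for k in keywords]
    queries = []
    for combo in product(*hits):  # choose one matching table per keyword
        paths = [join_path(combo[0], t) for t in combo[1:]]
        if any(p is None for p in paths):
            continue  # the chosen tables cannot be joined
        edges = []
        for path in paths:
            edges.extend(e for e in path if e not in edges)
        tables = sorted(set(combo) | {t for a, b, _ in edges for t in (a, b)})
        where = [f"{a}.{col} = {b}.{col}" for a, b, col in edges]
        where += [f"{t}.{TEXT_COLUMNS[t]} LIKE ?" for t in combo]
        sql = f"SELECT * FROM {', '.join(tables)} WHERE {' AND '.join(where)}"
        queries.append((sql, [f"%{k}%" for k in keywords]))
    return queries


def search(conn, index, keywords):
    """Query step: run every candidate statement and collect the joined tuples."""
    rows = []
    for sql, params in candidate_queries(index, keywords):
        rows.extend(conn.execute(sql, params).fetchall())
    return rows
```

Calling `search(conn, index, ["zhu", "keyword"])` after `conn = build_db()` and `index = build_keyword_index(conn)` joins `author`, `writes`, and `paper` without the caller knowing the schema. A keyword that is missing from the index produces no candidate statements, so nothing is sent to the database for it.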
Keywords/Search Tags: DOSSQL, Relational database, Keyword search, Search Algorithm