A metadata search approach to natural language database query

Posted on: 2003-01-16
Degree: Ph.D
Type: Thesis
University: Rensselaer Polytechnic Institute
Candidate: Boonjing, Veera
Full Text: PDF
GTID: 2468390011483669
Subject: Computer Science
Abstract/Summary:
The research develops a new solution approach that uses a metadata search to solve the problem of truly natural language database query. While previous results either restrict the syntax of the query (ineffectiveness) or require semantics (inefficiency), this new approach supports any style of articulation, including grammatically incorrect and incomplete ones. It also efficiently determines the answer to the query, with a feedback loop to handle any exceptions. The new approach features a new class of reference dictionary integrating four types of enterprise metadata: enterprise information models, database values, user-words, and cases. The reference dictionary accommodates any possible interpretation of a natural language query concerning enterprise databases and promises to reduce the growth of user-words through enterprise information models. The branch-and-bound search method makes it possible and efficient to search all possible (machine) interpretations of a natural language query and to determine the optimal solution. The approach also provides case-based learning and case-based reasoning to ensure successful closure to a query and to improve performance. The development of the metadata-search natural language database query (MS-NLDBQ) supports the approach. The results include (1) a new reference dictionary and its graphical representation of natural language queries based on the Metadatabase model, (2) the core method, consisting of the branch-and-bound search method and query generation, to translate natural language queries to simple SQL queries, and (3) the software implementing the core method. Testing results of the software with queries on the computer-integrated manufacturing (CIM) database show that the system is capable of processing truly natural language text inputs, even in the form of a short essay, under certain necessary and sufficient conditions.
The necessary condition is that the text input contains at least one recognized keyword (an entry found in the reference dictionary). The sufficient condition is that the text input contains a complete set of keywords from which, and only from which, a single SQL statement can be constructed to answer the query correctly. The response time and the growth of user-words recorded during the test are both efficient relative to usual SQL query processing. The testing results confirm the practicality of the metadata search approach and provide a good basis for extensions toward developing the capability to answer PL-SQL class queries and queries against non-relational databases. Text-based natural language query capability also promises to be amenable to verbal queries when coupled with voice recognition and synthesis techniques. Future research will include exploring the new approach to solve some non-database, traditional natural language interface problems in particular application domains.
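The search described in the abstract can be illustrated with a minimal Python sketch. It is not the MS-NLDBQ implementation: the dictionary entries, table and column names, costs, and the join penalty are all hypothetical, and the objective function is an assumption chosen only to make the example concrete. The sketch first checks the necessary condition (at least one recognized keyword), then uses branch-and-bound to pick one candidate metadata item per keyword at the lowest total cost.

```python
from typing import Dict, List, Optional, Tuple

# (table, metadata item, cost). All entries are hypothetical illustrations.
Candidate = Tuple[str, str, float]

# Hypothetical reference dictionary: recognized keyword -> candidate items.
REFERENCE_DICTIONARY: Dict[str, List[Candidate]] = {
    "customer": [("CUSTOMER", "CUSTOMER.NAME", 1.0),
                 ("ORDER", "ORDER.CUST_NAME", 1.5)],
    "order": [("ORDER", "ORDER", 1.0)],
    "date": [("ORDER", "ORDER.DATE_SCHED", 1.0),
             ("PART", "PART.DATE_MADE", 3.0)],
}

# Assumed extra cost for each additional table an interpretation touches.
JOIN_PENALTY = 1.0


def recognize_keywords(text: str) -> List[str]:
    """Return dictionary keywords found in the text, in order, without repeats."""
    found: List[str] = []
    for word in text.split():
        word = word.strip(".,?!").lower()
        if word in REFERENCE_DICTIONARY and word not in found:
            found.append(word)
    return found


def total_cost(chosen: List[Candidate]) -> float:
    """Item costs plus a penalty per extra table; never decreases as items are added."""
    tables = {table for table, _, _ in chosen}
    return sum(cost for _, _, cost in chosen) + JOIN_PENALTY * max(0, len(tables) - 1)


def interpret(text: str) -> Optional[Tuple[List[str], float]]:
    """Return the cheapest interpretation, or None if no keyword is recognized."""
    keywords = recognize_keywords(text)
    if not keywords:  # necessary condition fails
        return None

    best: Optional[List[Candidate]] = None
    best_cost = float("inf")

    def search(i: int, chosen: List[Candidate]) -> None:
        nonlocal best, best_cost
        cost = total_cost(chosen)
        if cost >= best_cost:  # bound: this branch cannot beat the incumbent
            return
        if i == len(keywords):  # complete interpretation
            best, best_cost = chosen, cost
            return
        for candidate in REFERENCE_DICTIONARY[keywords[i]]:  # branch
            search(i + 1, chosen + [candidate])

    search(0, [])
    assert best is not None
    return [item for _, item, _ in best], best_cost


if __name__ == "__main__":
    print(interpret("Which customer order has the latest date?"))
    print(interpret("hello there"))
```

Because the assumed cost never decreases when another item is added, the cost of a partial interpretation is a valid lower bound, which is what allows branches to be pruned safely. In this toy dictionary the cheapest per-keyword choice for "customer" is not part of the best overall interpretation, since it would require a second table.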
Keywords/Search Tags: Natural language, Approach, Metadata search, Query, New, Reference dictionary