Application And Research On Data Mining In Intelligent Question Answering System

Posted on: 2008-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: X F Liu
GTID: 2178360242960396
Subject: Management Science and Engineering
Abstract/Summary:
Data warehousing and data mining are among the most active areas in database research, development, and application. Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in a database or data warehouse. It provides technical support for data-warehouse-based decision support systems, and data mining tools can mine a data warehouse directly to discover latent knowledge. This thesis applies data mining algorithms to a question answering (QA) system: it proposes a scheme for a QA system based on data mining algorithms and implements it. The aim of the scheme is to overcome some defects of current QA systems and obtain an efficient QA system.

First, the emergence of data warehouse and data mining techniques is briefly reviewed, and the architecture and running process of data mining and data warehouses are studied. These form the basis of the thesis.

Second, the association rule algorithm and the text clustering algorithm are described, improved, and used to design the data warehouse of the QA system. The general idea is as follows. An improved keyword-based association rule algorithm calculates the correlation value between words, from which the similarity between questions is derived. The best answer is the one with the maximum similarity value, which yields one-to-one QA pairs. Text clustering is then performed on the QA pairs, and the questions are stored by class. Applying the association rule algorithm within each class after clustering yields a more accurate association table for extracting better answers from the database, and improves the similarity measure. In this way a comprehensive and accurate QA database is formed that can be used for data mining.

Third, similarity based on word association values is used to answer users' questions, giving an intelligent QA system. Experiments show that the method based on data mining algorithms improves answering precision, and that the questions in the database are classified by text clustering. The system therefore has merits such as intelligence, continuous self-renewal, savings in storage space, and improved execution efficiency.
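
The abstract does not give the improved algorithm itself, so the following is only a minimal sketch of the general idea of keyword association and question similarity, under stated assumptions. The association value between two words is assumed here to be the mean of the two rule confidences (a -> b and b -> a) computed from keyword co-occurrence, and question similarity is assumed to be a symmetric average of best word matches. All function names and the sample data are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def build_association(questions):
    """Map (a, b) to the confidence of the rule a -> b over keyword sets."""
    word_count = Counter()
    pair_count = Counter()
    for words in questions:
        unique = set(words)
        word_count.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_count[(a, b)] += 1
    assoc = {}
    for (a, b), n in pair_count.items():
        assoc[(a, b)] = n / word_count[a]  # confidence(a -> b)
        assoc[(b, a)] = n / word_count[b]  # confidence(b -> a)
    return assoc


def word_assoc(assoc, a, b):
    """Assumed symmetric association value: mean of the two confidences."""
    if a == b:
        return 1.0
    return (assoc.get((a, b), 0.0) + assoc.get((b, a), 0.0)) / 2


def similarity(assoc, q1, q2):
    """Assumed question similarity: symmetric average of best word matches."""
    if not q1 or not q2:
        return 0.0

    def directed(src, dst):
        return sum(max(word_assoc(assoc, w, v) for v in dst) for w in src) / len(src)

    return (directed(q1, q2) + directed(q2, q1)) / 2


def best_answer(assoc, qa_pairs, query):
    """Return the answer of the stored question most similar to the query."""
    return max(qa_pairs, key=lambda qa: similarity(assoc, query, qa[0]))[1]


# Hypothetical QA pairs: (question keywords, answer).
qa_pairs = [
    (["install", "python", "windows"], "Use the official installer."),
    (["install", "python", "linux"], "Use the package manager."),
    (["sort", "list", "python"], "Call sorted()."),
]
assoc = build_association([q for q, _ in qa_pairs])
print(best_answer(assoc, qa_pairs, ["sort", "list"]))
```

In the scheme described above, the association table would be rebuilt within each cluster of questions after text clustering; this sketch builds a single table over all questions for brevity.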
Keywords/Search Tags: data warehouse, data mining, association rules, correlation analysis, text clustering