
Case Base Maintenance Based On Data Mining

Posted on: 2004-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: X F Li
Full Text: PDF
GTID: 2168360095956165
Subject: Computer software and theory
Abstract/Summary:
Data Mining is an effective technique for the nontrivial extraction of previously unknown and potentially useful information from very large amounts of data. Data Mining and its related algorithms have been widely researched in recent years and have found a wide range of applications. With the proliferation of Case-Based Reasoning (CBR) systems in organizational knowledge management, many case bases are becoming unwieldy legacy systems, which has drawn attention to case base maintenance. Case base maintenance focuses on adopting proper techniques and policies to improve case quality, boost access performance, and increase the efficiency and competence of CBR systems.

Against this background of real-world requirements, the thesis details how to maintain case bases from the viewpoints of case quality and access performance by applying Data Mining to case bases and case access logs.

To support Data Mining in case bases, the thesis proposes a case representation based on the traditional object-oriented representation, in which a case is represented by a weighted feature vector. With this representation, an improved algorithm for case similarity measurement is proposed; it is used in the Data Mining algorithms that model the case features. Furthermore, the content and representation of case access logs are addressed, and the implicit dynamic case access models are studied from the viewpoints of access transactions and access sequences. Together, these form the basis for the case base maintenance techniques.

The research on case base maintenance techniques is carried out from two perspectives: content maintenance and performance maintenance. The underlying technique in both is Data Mining in case bases and case access logs. The purpose of content maintenance is to improve the quality of cases. The techniques proposed for it are: finding inconsistent cases with an outlier detection algorithm, completing incomplete cases with a classification algorithm, detecting redundant cases with a clustering algorithm, and discovering spam cases by performing trend movement analysis on case base access logs. The purpose of performance maintenance is to improve access speed. The techniques specified for it are: caching frequently accessed cases, partitioning the case base to limit the number of cases accessible in each CBR cycle, and pre-fetching cases that are usually accessed together in CBR cycles. For every technique, its application, realization, algorithm, and effectiveness are analyzed, and the maintenance policy for the solutions based on these algorithms is also presented.

The thesis solves case base maintenance problems in CBR systems with Data Mining techniques, but the methods and policies discussed are not confined to CBR systems. It is hoped that the discussion could be helpful for the maintenance of knowledge bases in knowledge management systems.
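
The abstract does not spell out the thesis's improved similarity algorithm, so the following is only a minimal sketch of the general idea of comparing cases represented as weighted feature vectors. It uses a generic weighted-average similarity, not the author's method. The function names, the example features, and the per-feature `ranges` normalisation are all assumptions made for illustration.

```python
from typing import Dict, Optional, Union

Value = Union[float, str]


def local_similarity(a: Value, b: Value, value_range: float = 1.0) -> float:
    """Similarity of two feature values, in [0, 1] (hypothetical helper)."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        # Numeric features: 1 minus the normalised distance, clamped at 0.
        return max(0.0, 1.0 - abs(a - b) / value_range)
    # Symbolic features: exact match scores 1, anything else scores 0.
    return 1.0 if a == b else 0.0


def case_similarity(
    x: Dict[str, Value],
    y: Dict[str, Value],
    weights: Dict[str, float],
    ranges: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted average of per-feature similarities between two cases."""
    ranges = ranges or {}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for name, weight in weights.items():
        # Assumption: a feature missing from either case contributes 0.
        if name in x and name in y:
            score += weight * local_similarity(
                x[name], y[name], ranges.get(name, 1.0)
            )
    return score / total_weight


# Illustrative cases; the feature names and values are made up.
case_a = {"symptom": "no_power", "age_years": 2.0}
case_b = {"symptom": "no_power", "age_years": 4.0}
weights = {"symptom": 3.0, "age_years": 1.0}
ranges = {"age_years": 10.0}

similarity = case_similarity(case_a, case_b, weights, ranges)
```

A measure of this kind could serve as the distance function for the clustering and outlier detection steps mentioned above, for example by flagging two cases as candidates for redundancy when their similarity exceeds a chosen threshold.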
Keywords/Search Tags:Data Mining, Case-Based Reasoning, Case base maintenance