Research On Case-based Reasoning & Text Information Processing On Internet

Posted on: 2007-04-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H T Geng
Full Text: PDF
GTID: 1118360185951354
Subject: Computer application technology
Abstract/Summary:
Basic research on and development of case-based systems have been attracting increasing attention. Case-Based Reasoning (CBR) is a general paradigm for problem solving based on the recall and reuse of specific experiences. CBR is not only a psychological theory of human knowledge but also a potential cornerstone of intelligent computer system technology. Case-based systems have been adopted in a growing number of application fields, especially those where expert knowledge is ill-defined and incomplete. However, case-based systems face the same knowledge acquisition bottleneck as other expert systems. Data Mining (DM), the process of discovering knowledge from databases, is one of the most effective means of addressing this problem. Applying Data Mining to case-based reasoning systems is therefore an important research field.

With the rapid development and spread of the Internet, the volume of text resources is growing geometrically. Effectively extracting the large amount of previously unknown, useful information in web text resources has become a research hot spot. Web text processing and understanding are also fundamental tasks in web text information processing, so research on text processing should rapidly improve computers' ability to handle large amounts of text.

With the continuing development of artificial intelligence, expert system technologies are being introduced into Internet text information processing in order to enhance computers' ability to understand Internet text information.

The research in this dissertation addressing the above problems is summarized as follows:

Firstly, to address knowledge acquisition and maintenance, Data Mining is introduced into the case-based reasoning system.
The main contributions are as follows: 1) A new competence-based method for automatically building the case base is proposed, which combines the merits of the NCL_CLARA clustering algorithm and footprint data. 2) A new unsupervised clustering algorithm is proposed. It combines the merits of the selection-based CLARA clustering algorithm and the new NCL clustering algorithm, and it overcomes the low efficiency of traditional case retrieval algorithms when the case base (CB) becomes very large. 3) A new case-deletion strategy based on clustering algorithms is proposed. These contributions can enhance the efficiency and practicality of CBR systems. Moreover, a new data spot-checking method based on outlier mining is presented, which overcomes the limited validity of traditional data spot-checking methods and ensures the correctness of data spot-checking.

Secondly, this dissertation presents fundamental research on automatic keyphrase extraction and automatic document summarization in web text information processing. The detailed work is summarized as follows: 1) A novel automatic keyphrase extraction method based on word co-occurrence is put forward, building on a study of existing keyphrase extraction methods. The method based on word frequency statistics utilizes text...
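
The abstract's point that clustering can ease case retrieval when the case base becomes very large can be illustrated with a generic two-stage lookup: first pick the nearest cluster representative, then search only within that cluster. The following is a minimal sketch under stated assumptions, not the dissertation's NCL_CLARA algorithm. The names `build_clusters` and `retrieve`, the case layout (`features`, `solution`), the Euclidean distance, and the hand-picked medoids are all hypothetical choices made for illustration.

```python
import math


def nearest_index(point, medoids):
    """Return the index of the medoid closest to `point` (Euclidean distance)."""
    return min(range(len(medoids)), key=lambda i: math.dist(point, medoids[i]))


def build_clusters(cases, medoids):
    """Assign every case to its nearest medoid (assumed to be given in advance)."""
    clusters = {i: [] for i in range(len(medoids))}
    for case in cases:
        clusters[nearest_index(case["features"], medoids)].append(case)
    return clusters


def retrieve(query, cases, medoids, clusters):
    """Two-stage retrieval: choose a cluster, then search only inside it."""
    candidates = clusters[nearest_index(query, medoids)]
    if not candidates:  # fall back to a full scan if the chosen cluster is empty
        candidates = cases
    return min(candidates, key=lambda c: math.dist(query, c["features"]))


# Toy case base (illustrative data only).
cases = [
    {"features": (1.0, 1.0), "solution": "A"},
    {"features": (1.2, 0.9), "solution": "B"},
    {"features": (8.0, 8.0), "solution": "C"},
    {"features": (8.5, 7.5), "solution": "D"},
]
medoids = [(1.0, 1.0), (8.0, 8.0)]  # hypothetical cluster representatives

clusters = build_clusters(cases, medoids)
best = retrieve((7.9, 8.2), cases, medoids, clusters)
print(best["solution"])
```

With k clusters of roughly equal size over n cases, a query compares against about k + n/k cases instead of n. The trade-off is that the true nearest case may lie in a neighbouring cluster, so the result is approximate.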
Keywords/Search Tags:Case-based Reasoning, Data Mining, Clustering, WWW, Text Information Processing, Data Spot-Checking