
A usefulness metric and its application to decision tree-based classification

Posted on: 2001-04-19
Degree: Ph.D.
Type: Dissertation
University: DePaul University, School of Computer Science, Telecommunications, and Information Systems
Candidate: St.Clair, Caroline Maria
Full Text: PDF
GTID: 1468390014453560
Subject: Computer Science
Abstract/Summary:
Data mining, a new area of research in computer science, combines algorithms and techniques from the established fields of database systems, statistics, and artificial intelligence. It is concerned with extracting information from generally large volumes of data, and it has grown out of industry's need to make better use of the vast amounts of stored data accumulated over the years. Users are demanding more sophisticated software that can answer queries that more traditional query methods cannot. To meet this demand, researchers are focusing on improving data mining extraction methods. Traditional methods such as SQL or statistical analysis are not powerful enough to answer a general query such as, "What determines whether a graduate student will complete the program?" The goal of data mining extraction methods is to answer these general types of queries.

In the past, the primary focus has been on improving the correctness of extraction algorithms: the more correct the results, the more successful the algorithm. Although correctness is important for industry, it is no longer a sufficient measure of success. Industry has placed an additional demand on the information: it must be useful. The intent of data mining is to provide users with information that will help them do their jobs better, and the results of a data mining algorithm must be analyzed to determine whether this goal has been achieved. This analysis cannot be limited to measuring correctness; it must also measure the information's usefulness. This is not an easy task, because what is useful to one user may not be useful to another.

A usefulness metric is presented that incorporates both objective and subjective measures of usefulness. It is shown that this metric can be tailored to an individual's needs. It is also shown how a decision tree-based classifier, one type of data mining extraction algorithm, can be adapted to use the usefulness metric in order to be more successful. By using the metric, the algorithm looks for and extracts useful information. The metric is also applied to the rule sets produced by decision tree-based classifiers in order to determine their usefulness.
Keywords/Search Tags:Usefulness, Data mining, Decision, Order, Algorithm