
An end-user development system for Bayesian decision support systems that estimate probabilities from databases

Posted on: 1997-08-14
Degree: Ph.D.
Type: Dissertation
University: Illinois Institute of Technology
Candidate: Chang, Li-Jen
Full Text: PDF
GTID: 1468390014483566
Subject: Artificial Intelligence
Abstract/Summary:
This research has three purposes. The first is to build a software system that allows the end user to develop a Bayesian decision support system that estimates probabilities from actual data. The second is to use this system and a data set to build a knowledge base, and then to test the knowledge base and the system's Bayesian inference engine on the same data. The third is to compare the classification performance of the proper Bayesian approach with that of the simple Bayesian approach.

The major problem in applying Bayes' Theorem to real-world problems is that a substantial number of probabilities are needed, and these probabilities are hard to obtain. Although subjective probabilities have been used as an alternative, they are almost always biased and give unsatisfactory results. Research has shown that estimating probabilities objectively from actual data is not only feasible but also helps people make better decisions. Recent computer technologies such as Online Analytical Processing, Data Warehousing, Data Mining, and Knowledge Discovery in Databases make probability estimation from actual data easier and better. However, technical issues remain, including data storage, database accessibility, and the availability of a probability acquisition system, a Bayesian inference engine, and an explanation facility. These issues make the process of converting raw data into knowledge for Bayesian inference difficult, inefficient, and often unsuccessful. Existing software systems also do not provide an environment in which end users can implement Bayesian inference for their own classification problems.

We have developed methods for data modeling, user interfaces, a new knowledge representation scheme called the E-F pattern, knowledge acquisition, knowledge retrieval, mathematical and textual explanation approaches, proper and simple Bayesian approaches, and data visualization. Two data sets are used to build knowledge bases. One data set is used to analyze and compare the error rates of the two Bayesian approaches. These methods have been used to build and test a working data mining system.

We have also used our system to explore the difference between simple and proper Bayesian classification. Our proper Bayesian system showed an error rate of only 10%, compared to 50% for the simple Bayesian classification.
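
The abstract does not define the two approaches, so the following Python sketch is only an illustration under stated assumptions: "simple" is taken to mean multiplying per-feature conditional probabilities as if features were independent given the class, and "proper" is taken to mean estimating the likelihood of the whole evidence pattern jointly. Both estimate probabilities by counting database records. The toy records, function names, and class labels are hypothetical and are not drawn from the dissertation's data or its E-F pattern scheme.

```python
from collections import Counter

# Hypothetical toy records: (evidence pattern, class label).
# Each feature alone is uninformative; only the joint pattern separates classes.
records = [
    ((0, 0), "A"), ((0, 0), "A"),
    ((1, 1), "A"), ((1, 1), "A"),
    ((0, 1), "B"), ((0, 1), "B"),
    ((1, 0), "B"), ((1, 0), "B"),
]


def normalize(scores):
    """Scale unnormalized scores so they sum to 1 (all zeros if unseen)."""
    total = sum(scores.values())
    if total == 0:
        return {c: 0.0 for c in scores}  # evidence pattern never observed
    return {c: s / total for c, s in scores.items()}


def simple_bayes(records, evidence):
    """Posterior assuming features are conditionally independent given class."""
    class_counts = Counter(label for _, label in records)
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / len(records)  # prior P(c), estimated by counting
        for i, value in enumerate(evidence):
            match = sum(1 for e, k in records if k == c and e[i] == value)
            score *= match / n_c  # P(feature_i = value | c)
        scores[c] = score
    return normalize(scores)


def proper_bayes(records, evidence):
    """Posterior using the joint likelihood P(evidence | c), counted directly."""
    class_counts = Counter(label for _, label in records)
    scores = {}
    for c, n_c in class_counts.items():
        joint = sum(1 for e, k in records if k == c and e == evidence)
        scores[c] = (n_c / len(records)) * (joint / n_c)
    return normalize(scores)


if __name__ == "__main__":
    print("simple:", simple_bayes(records, (0, 1)))
    print("proper:", proper_bayes(records, (0, 1)))
```

On this contrived data the independence assumption cannot separate the classes, while the joint estimate can. The example is not meant to reproduce the error rates reported above; it only shows why the two approaches can disagree, and why the joint approach needs many more probabilities estimated from data.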
Keywords/Search Tags:System, Bayesian, Data, User, Probabilities, Classification, Build