Equation discovery in databases from engineering

Posted on: 2000-01-23
Degree: Ph.D
Type: Dissertation
University: University of Kansas
Candidate: Zhang, Liye
Full Text: PDF
GTID: 1468390014960814
Subject: Engineering
Abstract/Summary:
As the quantity of electronically generated engineering data grows rapidly, building computer systems that analyze data automatically and intelligently becomes increasingly important to engineers. The overall process of extracting usable knowledge from electronically stored data is called knowledge discovery in databases. The part of the process where patterns are extracted or models are built is referred to as data mining.

This dissertation proposes a data mining method that combines machine learning and regression to help engineers acquire knowledge that is preferably expressed as equations. A learning algorithm based on the method has been implemented in the computer system EDDE (Equation Discovery in Databases from Engineering). In addition, to obtain useful models that are understandable to engineers, knowledge specific to the particular problem area is incorporated into EDDE to guide the discovery process. The role of this domain knowledge is investigated.

EDDE is extensively tested on both synthetic and actual engineering data sets. The tests on synthetic data show that EDDE has some important features, such as insensitivity to the number of variables in a data set. When compared to other methods (the regression tree CART, instance-based learning IBL, multivariate linear regression, the model tree M5, neural nets, and combinations of these methods), EDDE generates a smaller model with lower prediction error. EDDE thus summarizes the data more concisely and describes the data better.

EDDE has been used to analyze actual data sets from civil engineering (duration of construction activities, development/splice length of reinforcing bars, and effect of constraint on fracture toughness), chemical engineering (dissolution of ionizable drugs), and mechanical engineering (automobile fuel consumption). These applications show an important feature of EDDE: only general engineering domain knowledge is encoded in the algorithm, while specific domain knowledge is supplied when the system is applied, so EDDE can be used in a variety of engineering domains. They also demonstrate the importance of interaction between the system and its users in finding usable and understandable knowledge that is consistent with existing domain knowledge.
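
The abstract does not describe EDDE's algorithm in detail, so the following is only a generic illustration of what "discovering an equation by combining search with regression" can look like, not a reconstruction of EDDE. It assumes a small, hypothetical set of candidate terms (`CANDIDATES`), fits `y = a * f(x) + b` for each by ordinary least squares, and keeps the candidate with the smallest squared error. All names and the synthetic data are assumptions made for this sketch.

```python
import math

# Hypothetical candidate transformations of x; not taken from EDDE.
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sqrt(x)": math.sqrt,
    "log(x)": math.log,
    "1/x": lambda x: 1.0 / x,
}


def fit_line(ts, ys):
    """Ordinary least squares for y = a*t + b; returns (a, b, sse)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    s_tt = sum((t - mean_t) ** 2 for t in ts)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    a = s_ty / s_tt
    b = mean_y - a * mean_t
    sse = sum((y - (a * t + b)) ** 2 for t, y in zip(ts, ys))
    return a, b, sse


def discover(xs, ys):
    """Return (name, a, b, sse) for the candidate term with the smallest SSE."""
    best = None
    for name, f in CANDIDATES.items():
        ts = [f(x) for x in xs]  # transform the input with the candidate term
        a, b, sse = fit_line(ts, ys)
        if best is None or sse < best[3]:
            best = (name, a, b, sse)
    return best


# Synthetic data generated from y = 3*x^2 + 2 (positive x so log and sqrt are defined).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.0 * x * x + 2.0 for x in xs]

name, a, b, sse = discover(xs, ys)
print(f"y = {a:.3f} * {name} + {b:.3f}")
```

In this toy setting, domain knowledge of the kind the abstract mentions could be expressed by restricting or extending the candidate set before the search runs; how EDDE actually incorporates such knowledge is not stated here.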
Keywords/Search Tags:Data, Engineering, System, Domain knowledge, EDDE, Discovery