
A Study On Data Mining And Its Application In Process Optimization

Posted on: 2007-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: B Jin
Full Text: PDF
GTID: 2178360182470784
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Data mining is one of the promising information techniques for the process industry.

On one hand, data mining is developing very quickly. Advances in data acquisition and database technology have led to huge amounts of data being accumulated from human activities, so a powerful analysis tool is needed to discover the underlying, interesting rules in these databases and to help us work more effectively. Data mining addresses this need. After ten years of development, it has found applications in areas ranging from business to entertainment and from science to technology.

On the other hand, more and more information systems are entering plants. With the adoption of DCS (distributed control systems) and real-time databases, plant operation data can be recorded in databases. This accumulation of process data makes it possible to employ data mining to improve productivity. However, process data is very complex, which causes many problems in application.

This thesis attempts to overcome the problems encountered in industrial applications of data mining.

First, a new sequence analysis algorithm, BTS, is proposed; it is flexible and effective in similarity search. The first characteristic of industrial databases is their sheer size, which is a challenge for traditional approaches to sequence representation. BTS handles this with a new compression approach that has strong compression power and an easily adjustable compression rate, which enables us to view the whole sequence, or part of it, at different compression rates. Beyond this, BTS introduces BTR, a fast similarity model that also serves as an effective index. Together these significantly reduce search time and save the time otherwise spent building an index.

Analysis of operation data showed that control variables changed frequently, so data mining has a chance of finding useful rules in these changes.
Traditional algorithms were first applied to the database. They revealed relationships so strong and complex that no interesting rules could be extracted.

This problem was solved with a new data mining framework, Decision Forest. To deal with time delay, Decision Forest uses a relational database as a tool to regroup all the delayed variables. The second problem is the relationship between variables. Based on a process model, Decision Forest builds a series of decision trees to simulate the dependences between variables, and these trees reach a final decision through voting.

This solution has proved effective in a triazophos plant.
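
The abstract gives no implementation details for Decision Forest, so the following is only a rough sketch of the two ideas it names: regrouping delayed variables with a relational database, and letting several trees vote. It assumes Python's built-in `sqlite3` module as the database. The column names (`feed`, `temp`, `quality`), the delays, the sample data, and the three hand-written rules standing in for trained decision trees are all hypothetical and do not come from the thesis.

```python
import sqlite3
from collections import Counter


def regroup_delayed(rows, delays):
    """Pair each output sample with the inputs recorded `delay` steps earlier.

    rows   : list of (t, feed, temp, quality) tuples (hypothetical columns)
    delays : e.g. {"feed": 2, "temp": 1}, in sampling steps (assumed known)
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE log (t INTEGER PRIMARY KEY, feed REAL, temp REAL, quality TEXT)"
    )
    con.executemany("INSERT INTO log VALUES (?, ?, ?, ?)", rows)
    # Self-join: the output at time t is matched with feed at t - d_feed
    # and temp at t - d_temp. Rows without a delayed partner are dropped.
    query = """
        SELECT y.t, f.feed, p.temp, y.quality
        FROM log AS y
        JOIN log AS f ON f.t = y.t - ?
        JOIN log AS p ON p.t = y.t - ?
        ORDER BY y.t
    """
    aligned_rows = con.execute(query, (delays["feed"], delays["temp"])).fetchall()
    con.close()
    return aligned_rows


def majority_vote(trees, sample):
    """Return the label predicted by most of the trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees, one per assumed dependence.
def tree_feed(s):
    return "bad" if s["feed"] > 1.5 else "ok"


def tree_temp(s):
    return "bad" if s["temp"] > 90.0 else "ok"


def tree_both(s):
    return "bad" if s["feed"] * s["temp"] > 150.0 else "ok"


trees = [tree_feed, tree_temp, tree_both]

# Made-up operation log: (t, feed, temp, quality)
rows = [
    (0, 1.0, 80.0, "ok"),
    (1, 1.0, 80.0, "ok"),
    (2, 2.0, 80.0, "ok"),
    (3, 1.0, 95.0, "ok"),
    (4, 1.0, 80.0, "bad"),
    (5, 1.0, 80.0, "ok"),
]

aligned = regroup_delayed(rows, {"feed": 2, "temp": 1})
predictions = [
    majority_vote(trees, {"feed": feed, "temp": temp})
    for _, feed, temp, _ in aligned
]
```

In this toy log, the feed change at t=2 and the temperature change at t=3 both line up with the output at t=4 once the delays are applied, so every stand-in tree sees them in the same row. An odd number of trees is used so that a two-class vote cannot tie.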
Keywords/Search Tags: Data Mining, Similarity Search, Decision Tree, Time Delay, Variables Dependence