Research Of Rough Set Theory Prototype System Based On ORDBMS

Posted on: 2009-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: K Li
Full Text: PDF
GTID: 2178360242481598
Subject: Software engineering
Abstract/Summary:
With the spread and development of global informatization, businesses face an ever larger volume of information. As this information grows, people increasingly need powerful data analysis tools to understand and make use of it. This need has given rise to a new field of study: data mining. Data mining aims to uncover hidden patterns, trends, and relationships in data. Although the concept of data mining was formally defined only recently, its core technologies have been applied in many areas of enterprise information processing for decades.

Real-world information is often fuzzy, and traditional mathematical methods handle such fuzzy problems unsatisfactorily, and sometimes blindly. In 1982 the Polish scientist Z. Pawlak published a new mathematical theory, rough set theory. Rough set theory is a mathematical tool for studying imprecise and uncertain knowledge, and it has strong advantages in processing fuzzy information.
In recent years rough set theory has been widely applied, and its relationship with data mining has become ever closer. The main work of this paper is to integrate rough set theory into an ORDBMS prototype system, using the extended equivalence matrix to find the upper approximation sets, lower approximation sets, and relative core of a decision table. These operations were implemented on the basis of existing algorithms, which were analyzed and then further improved and refined. Finally, UCI Machine Learning data sets were used for experimental verification in the database, completing the analysis and comparison of the algorithms' performance. The contents of the paper are as follows:

⑴ Research on data mining and rough sets. This part summarizes the concept and historical background of data mining, introduces the advantages of rough sets in processing fuzzy information, and looks ahead to the prospects of rough set theory in the development of data mining.

⑵ Outline of the original rough set theory. This part introduces the background and development history of the original rough set theory and gives its basic principles and the fields it can deal with. It also defines the basic notions involved in rough set theory, such as equivalence relation, indiscernibility relation, knowledge, upper and lower approximation, rough set, information system, decision table, reduction, and core.

⑶ Research on integrating rough sets with an ORDBMS. This part studies the concept of the binary equivalence matrix and gives an algorithm for computing the binary equivalence matrix based on that concept.
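The binary equivalence matrix and the two approximations can be illustrated with a minimal Python sketch. It is not the thesis's PL/SQL implementation: the toy table, the attribute names, and the function names are all hypothetical, and the sketch uses the textbook definitions (entry 1 when two objects agree on every chosen attribute).

```python
from typing import Dict, List, Sequence, Set

# Hypothetical toy decision table; each row is one object.
TOY_TABLE: List[Dict[str, str]] = [
    {"color": "yellow", "size": "small", "inflated": "T"},
    {"color": "yellow", "size": "small", "inflated": "T"},
    {"color": "yellow", "size": "large", "inflated": "F"},
    {"color": "purple", "size": "large", "inflated": "F"},
    {"color": "purple", "size": "large", "inflated": "T"},
]


def equivalence_matrix(table: List[Dict[str, str]],
                       attrs: Sequence[str]) -> List[List[int]]:
    """M[i][j] is 1 when objects i and j agree on every attribute in attrs."""
    n = len(table)
    return [
        [int(all(table[i][a] == table[j][a] for a in attrs)) for j in range(n)]
        for i in range(n)
    ]


def lower_approximation(matrix: List[List[int]], target: Set[int]) -> Set[int]:
    """Objects whose whole equivalence class lies inside target."""
    n = len(matrix)
    return {
        i for i in range(n)
        if all(j in target for j in range(n) if matrix[i][j])
    }


def upper_approximation(matrix: List[List[int]], target: Set[int]) -> Set[int]:
    """Objects whose equivalence class shares at least one member with target."""
    n = len(matrix)
    return {
        i for i in range(n)
        if any(j in target for j in range(n) if matrix[i][j])
    }
```

With the condition attributes `color` and `size` and the target set of objects whose `inflated` value is `T`, the lower approximation keeps only the two yellow small objects, while the upper approximation also includes both purple large objects, because they are indiscernible from each other but disagree on the decision.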
The extended equivalence matrix algebra and its algorithms are described in detail, such as the algorithms for the upper and lower approximations, the negative region and boundary region of a set, the domain of the system, and the core. The core algorithm based on the SQL language is also described.

⑷ Implementation of the prototype system. This is the main part of the paper, which uses the Oracle database and the PL/SQL language to implement the prototype system. Although it is based on existing research on the upper approximation algorithm, the paper improves on that algorithm further. Large UCI Machine Learning data sets were used to run simulation experiments and analyze the results.

Technical background of the prototype system
An ORDBMS is a database management system based on the object-relational model, which Oracle supports. Oracle 10g provides support for ADTs and UDFs. With PL/SQL, the matrix algebra and algorithms described in the paper can be implemented easily. Composite data types are defined in the PL/SQL block: the PL/SQL record and the nested table. The record is used to implement comparison operations between the data in the table, and the nested table is used to store the binary equivalence matrix. In addition, PL/SQL offers all kinds of conditional branch statements and loop statements, which make it possible to implement the operations on equivalence matrices and arrays.

Research on the algorithm
While studying the algorithm for finding the rough upper approximation set, certain limitations were found. Based on the original definition and the existing algorithm, the theoretical basis of the algorithm is discussed in detail with respect to these limitations. Starting from the original definition of the upper approximation set, the existing algorithm is improved and a new upper approximation algorithm is proposed; the two algorithms are verified on several large data sets.

Analysis of the experimental results
The large balloons and credit data sets from UCI Machine Learning are loaded into the database, and simulation experiments are run on the relative core with the two algorithms.
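The negative and boundary regions mentioned under part ⑶ follow directly from the equivalence classes, as the following minimal Python sketch suggests. It uses the standard rough set definitions rather than the thesis's matrix algorithms, and the toy table and all names in it are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical toy decision table; each row is one object.
TOY_TABLE: List[Dict[str, str]] = [
    {"color": "yellow", "size": "small", "inflated": "T"},
    {"color": "yellow", "size": "small", "inflated": "T"},
    {"color": "yellow", "size": "large", "inflated": "F"},
    {"color": "purple", "size": "large", "inflated": "F"},
    {"color": "purple", "size": "large", "inflated": "T"},
]


def equivalence_classes(table: List[Dict[str, str]],
                        attrs: Sequence[str]) -> List[Set[int]]:
    """Group object indices that share the same values on attrs."""
    groups: Dict[Tuple[str, ...], Set[int]] = defaultdict(set)
    for index, row in enumerate(table):
        groups[tuple(row[a] for a in attrs)].add(index)
    return list(groups.values())


def regions(table: List[Dict[str, str]], attrs: Sequence[str],
            target: Set[int]) -> Tuple[Set[int], Set[int], Set[int]]:
    """Return (positive, boundary, negative) regions of target."""
    positive: Set[int] = set()
    boundary: Set[int] = set()
    negative: Set[int] = set()
    for block in equivalence_classes(table, attrs):
        if block <= target:        # class entirely inside the target set
            positive |= block
        elif block & target:       # class only partly inside the target set
            boundary |= block
        else:                      # class entirely outside the target set
            negative |= block
    return positive, boundary, negative
```

The positive region equals the lower approximation, the boundary region is the upper approximation minus the lower one, and the negative region is everything outside the upper approximation, so the three sets always partition the table.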
The experimental results are studied and the performance of the two algorithms is compared.

Building on existing algorithms, this paper implements a prototype system that uses the equivalence matrix to compute the upper and lower approximation sets of a decision table and to perform rule extraction, relative reduction, and relative core operations. In the course of the study the algorithms were analyzed and the implementation was further refined, making the upper approximation algorithm more accurate. Large data sets from machine learning databases were used to verify the conclusions, and a simple analysis of system performance was made.

In the prototype system, the PL/SQL language is combined with the equivalence matrix, which makes the algorithms easy to implement. The SQL language is also combined with the core computation, making full use of SQL's features to make the core computation fast; the larger the volume of data, the more obvious the effect.
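One way SQL grouping can carry the core computation is sketched below. It is an illustrative assumption, not the thesis's Oracle PL/SQL code: it uses Python's built-in `sqlite3` module, a hypothetical table `decision_table` with condition attributes `a`, `b`, `c` and decision attribute `d`, and the standard definition that an attribute belongs to the relative core when dropping it shrinks the positive region.

```python
import sqlite3
from typing import List, Sequence

# Hypothetical rows (a, b, c, d); d depends on a and b only.
ROWS = [
    (0, 0, 0, "N"), (0, 1, 0, "Y"), (1, 0, 1, "Y"),
    (1, 1, 1, "Y"), (0, 0, 1, "N"), (0, 1, 1, "Y"),
]


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database holding the toy decision table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE decision_table (a INTEGER, b INTEGER, c INTEGER, d TEXT)"
    )
    conn.executemany("INSERT INTO decision_table VALUES (?, ?, ?, ?)", ROWS)
    return conn


def positive_region_size(conn: sqlite3.Connection, table: str,
                         cond_attrs: Sequence[str], decision: str) -> int:
    """Count rows whose group on cond_attrs has a single decision value."""
    # Identifiers cannot be bound as parameters; the names here are trusted.
    if cond_attrs:
        cols = ", ".join(f'"{c}"' for c in cond_attrs)
        sql = (
            "SELECT COALESCE(SUM(cnt), 0) FROM ("
            f'SELECT COUNT(*) AS cnt FROM "{table}" '
            f'GROUP BY {cols} HAVING COUNT(DISTINCT "{decision}") = 1)'
        )
    else:
        # No condition attributes left: the whole table is one group.
        sql = (
            f'SELECT CASE WHEN COUNT(DISTINCT "{decision}") = 1 '
            f'THEN COUNT(*) ELSE 0 END FROM "{table}"'
        )
    return conn.execute(sql).fetchone()[0]


def relative_core(conn: sqlite3.Connection, table: str,
                  cond_attrs: Sequence[str], decision: str) -> List[str]:
    """Attributes whose removal makes the positive region smaller."""
    full = positive_region_size(conn, table, cond_attrs, decision)
    core = []
    for attr in cond_attrs:
        rest = [c for c in cond_attrs if c != attr]
        if positive_region_size(conn, table, rest, decision) < full:
            core.append(attr)
    return core
```

Each candidate attribute costs one `GROUP BY` query, so the database engine performs the grouping and the calling code only compares counts.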
Keywords/Search Tags:Prototype