The Deep Web data integration framework has two main modules: the creation of the integrated search interface and the processing based on the integrated search interface. Each module can be further divided into several sub-modules. The creation of the integrated search interface contains four sub-modules: detection of Web databases, extraction of the search interface schema, classification of Web databases, and integration of search interfaces.

The main purpose of interface schema extraction is to support the classification of Web databases and the integration of search interfaces in the following steps. Its task is to list all the attributes of an interface systematically, according to a given requirement, and to output the results for the next step. At the same time, it reveals the query capability of the search interface. Extracting the search interface schema is therefore very important. Schema extraction can proceed smoothly only if the right Web page has been obtained, that is, a file that contains a search interface. It is thus important to determine whether a file contains a search interface or not.

Classification is an important task in data mining that aims to build a classification function or a classification model. It has three phases: first, training the model; second, evaluating the model; third, classification. Building a decision tree follows the steps below:
1. Choose the important attributes that can represent the samples, and determine the values of every attribute.
2. Choose the attribute with the strongest classification ability as the decision node for the current set.
3. Divide the current set into subsets according to the different values of the decision node.
4. Repeat steps 2 and 3 until the set satisfies one of the following three conditions: first, all samples belong to the same class; second, all attributes have been used and none is left to choose; third, all samples have the same attribute values.

Schema extraction targets files that contain a search interface. Most search interface code is enclosed in |
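
The four decision-tree construction steps listed above can be illustrated with a minimal Python sketch. The text does not say how "classification ability" is measured, so information gain (as in ID3) is used here as an assumption. The attribute names `has_text_box` and `has_password`, the class labels, and the six sample rows are hypothetical and exist only to make the example runnable; they are not taken from the source.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on attr (assumed measure)."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, attrs):
    # Stop condition 1: all samples belong to the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Stop condition 2: no attributes are left to choose from.
    if not attrs:
        return majority(labels)
    # Stop condition 3: all samples share the same attribute values.
    if all(row[a] == rows[0][a] for row in rows for a in attrs):
        return majority(labels)
    # Step 2: pick the attribute with the strongest classification ability.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    # Step 3: divide the current set by the values of the decision node.
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        # Step 4: repeat steps 2 and 3 on each subset.
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], remaining)
    return {"attr": best, "branches": branches, "default": majority(labels)}


def classify(tree, row):
    """Walk the tree; fall back to the node default for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attr"]], tree["default"])
    return tree


# Step 1: hypothetical attributes and made-up training samples.
ROWS = [
    {"has_text_box": "yes", "has_password": "no"},
    {"has_text_box": "yes", "has_password": "yes"},
    {"has_text_box": "no", "has_password": "no"},
    {"has_text_box": "yes", "has_password": "no"},
    {"has_text_box": "no", "has_password": "yes"},
    {"has_text_box": "yes", "has_password": "yes"},
]
LABELS = ["search", "other", "other", "search", "other", "other"]

tree = build_tree(ROWS, LABELS, ["has_text_box", "has_password"])
print(classify(tree, {"has_text_box": "yes", "has_password": "no"}))
```

The three early returns in `build_tree` correspond one to one to the three stopping conditions in step 4. When a branch stops without a pure class, the sketch falls back to the majority class, which is also an assumption.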