In data mining, cluster analysis brings order to unstructured information when the data fall into logical clusters, and forecasting lets people predict how things will develop from the information already available. As data mining research has deepened, many problems have proved impossible to solve with a single method, so building prediction models that make comprehensive use of the relationships within a specific data set has become a research focus.

Statistical data from the China Statistical Yearbook, for example, can be analyzed for relationships among variables and used for prediction. In drug research, taking cyclodextrin as an example, prediction can be based on the relationship between cyclodextrin closure constants and various attributes extracted from the cyclodextrins. Such data share three characteristics: the sample attributes are correlated, the data are suited to linear analysis and principal component analysis, and the samples can be classified.

This paper constructs a new prediction model, DRKM-RBF (Dynamic Rough Set K-means–Radial Basis Function), based on the RBF neural network and an optimization approach that draws on linear forecasting ideas and rough set clustering. It demonstrates the effectiveness of the DRKM-RBF model by forecasting road freight quantity and cyclodextrin closure constants. The paper makes three contributions:

(1) It presents a new dynamic clustering algorithm: a k-means clustering algorithm based on dynamic rough sets. The algorithm combines rough sets with k-means and uses the upper and lower approximation sets of rough set theory to determine the number of classes K dynamically; an example demonstrates its superiority.

(2) It constructs a new comprehensive continuation matrix as the input/output structure of the neural network.
Because the original data exhibit linear relationships, it uses linear analysis and principal component analysis during data preprocessing to construct the comprehensive continuation matrix.

(3) It proposes improving the hidden layer structure of the neural network using the results of dynamic clustering based on rough sets. In the RBF network, the cluster centers obtained by rough set clustering of the training samples serve as the hidden node centers. This exploits the ability of rough sets to handle fuzzy boundaries between sets, and the corresponding cluster center is activated when the hidden layer produces its output. In addition, the comprehensive continuation matrix is used as the input/output structure of the RBF neural network, which effectively combines linear forecasting with the neural network. Building on these two ideas, the paper constructs the new DRKM-RBF prediction model, which greatly improves prediction precision.

The DRKM-RBF model is applied to road freight quantity forecasting, where the data include correlated attributes and time-domain factors. The comprehensive continuation matrix is constructed from the results of principal component analysis of the associated factors and multivariate linear regression analysis of the time factor. The road freight quantity samples were processed by rough set dynamic clustering, and the resulting cluster centers were used as the hidden node centers of the new model. When the new model was used to predict road freight quantity, its accuracy was greatly improved compared with the combination forecast method, the direct forecast method, the KM-RBF forecast method, the PCA-KM-RBF forecast method, and others.

The DRKM-RBF model is also applied to forecasting cyclodextrin closure constants. The data sets comprise the closure constants and related attributes extracted from the cyclodextrins.
The purpose of the new model is to explore cyclodextrin properties through the cyclodextrin closure constant. First, principal component analysis and linear forecasting are applied to the attributes related to the cyclodextrin closure constant, and the result serves as the input/output of the neural network. Second, after dynamic clustering based on rough sets, the cluster centers become the hidden node centers of the new model, and the number of classes equals the number of hidden layer nodes. The new model is then used to predict cyclodextrin closure constants, with the predictions obtained on the test set. The prediction results were satisfactory compared with methods such as principal component regression and RBF neural network prediction.
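
The general pipeline described above (principal component scores as network input, cluster centers as RBF hidden node centers, and a linear output layer) can be illustrated with a minimal sketch. It is not the DRKM-RBF model itself: plain k-means stands in for the dynamic rough set clustering, so the number of clusters k is fixed by hand rather than determined dynamically, and the comprehensive continuation matrix is reduced to PCA scores only. The function names, the Gaussian width, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project centered data onto its leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; a stand-in for the rough set variant."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def rbf_design(X, centers, width):
    """Gaussian hidden layer activations plus a constant bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    hidden = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([hidden, np.ones((len(X), 1))])

def fit_rbf(X, y, k, width):
    """Cluster centers become hidden nodes; output weights by least squares."""
    centers = kmeans(X, k)
    w, *_ = np.linalg.lstsq(rbf_design(X, centers, width), y, rcond=None)
    return centers, w

def predict_rbf(X, centers, w, width):
    return rbf_design(X, centers, width) @ w

# Synthetic data for illustration only (not the paper's data sets).
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(60, 5))
y = X_raw[:, 0] - 0.5 * X_raw[:, 1] + 0.1 * rng.normal(size=60)

Z = pca_scores(X_raw, n_components=3)        # reduced network input
centers, w = fit_rbf(Z, y, k=6, width=2.0)   # 6 hidden nodes, chosen by hand
y_hat = predict_rbf(Z, centers, w, width=2.0)
train_mse = float(np.mean((y - y_hat) ** 2))
```

In this sketch the number of hidden nodes equals the number of clusters, mirroring the statement that the class number equals the hidden layer node number. Replacing `kmeans` with a clustering routine that also chooses k would be the natural place to plug in a rough set based method.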