| With the rapid development of China's photovoltaic manufacturing industry, polycrystalline silicon cells have held a dominant position in the photovoltaic market because of their high cost performance. The minority carrier lifetime of polysilicon is a key factor in the performance of polysilicon cells, and it depends mainly on the polysilicon production process and on the types of batching materials. China's polysilicon production process is now relatively mature, so the types of batching materials used in production have the greater impact on the final minority carrier lifetime. A well-chosen set of batching materials can both reduce cost and raise the minority carrier lifetime. However, as the amount of data grows, processing polysilicon batching data involves many redundant attributes and long running times, so extracting the latent information in the data quickly and effectively has become a research focus. Rough set theory is an effective data processing tool that can efficiently extract the required information from massive data, and it has been widely used in industry. However, traditional rough sets can only process discrete data and cannot directly handle continuous polysilicon batching data. The neighborhood rough set was proposed to solve this problem; it can be applied directly to continuous data without discretization. When traditional neighborhood rough sets are used to analyze industrial data, problems such as inaccurate results and high algorithmic complexity remain. To improve the quality of attribute reduction and reduce the running time of the algorithm, this paper improves the method in the following four respects: (1) The traditional neighborhood rough set applies neighborhood granulation only to the conditional attributes. For the decision attributes of 
continuous data, they are still treated as equivalence classes, which causes over-fitting. This paper introduces neighborhood granulation into the decision attributes: a neighborhood is built around each decision point, and all points inside it are treated as one class in the calculation. This not only solves the over-fitting problem but also improves the reliability of the results. (2) Because the traditional neighborhood rough set relies on strict inclusion to determine the lower approximation, the attribute reduction result depends too heavily on the neighborhoods induced by the neighborhood radius, so not all correctly classified neighborhoods contribute to the decision. To solve this problem, a new lower approximation calculation based on the intersection (cross) relationship is proposed and defined as the lower approximation contribution degree, which improves the fault tolerance of the lower approximation calculation. Experiments are conducted on 4 UCI data sets and 2 polysilicon batching data sets. The results show that the new lower approximation calculation yields higher neighborhood approximation quality and improves the reliability of attribute reduction. (3) The importance weight threshold of the traditional neighborhood rough set is generally a fixed value, so attributes of low importance whose weight nevertheless exceeds the fixed threshold are classified as core attributes, even though these low-importance core attributes have almost no influence on the decision partition. To solve this problem, an adaptive-β importance weight threshold method is proposed. Based on the attribute importance results, the attribute weights are sorted from largest to smallest; the method first takes the smallest attribute weight as the importance weight threshold and computes its evaluation value with an evaluation function, then does the same for the next weight, and so on, and 
finally selects the threshold with the highest evaluation value as the final result. Experiments on 4 UCI data sets and 2 polysilicon batching data sets show that this method achieves a higher reduction rate and classification accuracy and improves the accuracy of industrial data analysis. (4) The neighborhood radius of the traditional neighborhood rough set is generally set from experience or by repeated experiments, which reduces the degree of automation of industrial data analysis. This paper uses a neighborhood rough sets-support vector machines (NRS-SVM) model for attribute reduction and prediction on polysilicon data. To set the neighborhood radius δ and the SVM parameters when this model processes continuous polysilicon ingot batching data, a two-stage genetic algorithm that combines the NRS-SVM model with the genetic algorithm (GA) is proposed. In the first stage, a better reduction set is obtained by searching for a new neighborhood radius. In the second stage, the attribute reduction result of the first stage is used to train a more accurate classification model by searching for new SVM parameters. Fitness functions and termination conditions are designed for the purpose of each stage. The salient feature of this method is that it automates feature extraction and classification prediction in NRS-SVM and runs the two stages separately, which avoids the time cost of using the classifier to evaluate reduction performance. Experiments on the polysilicon ingot batching data set show that, compared with the one-stage genetic algorithm, the method has a shorter running time, more stable output, fewer features, and higher classification accuracy. |
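
Points (1) and (2) of the abstract contrast a strict-inclusion lower approximation with a more tolerant one and extend neighborhood granulation to continuous decision values. The abstract does not give the formulas, so the following is only a minimal sketch under stated assumptions: Euclidean distance, a hypothetical decision radius `eps`, and an overlap ratio |N_B(x) ∩ N_D(x)| / |N_B(x)| standing in for the "lower approximation contribution degree". All function names are illustrative.

```python
from math import dist


def neighborhood(i, X, delta):
    """Indices of samples within Euclidean distance delta of sample i."""
    return {j for j in range(len(X)) if dist(X[i], X[j]) <= delta}


def decision_neighborhood(i, y, eps):
    """Samples whose continuous decision value lies within eps of y[i]."""
    return {j for j in range(len(y)) if abs(y[i] - y[j]) <= eps}


def strict_quality(X, y, delta, eps):
    """Share of samples whose condition neighborhood is fully contained
    in their decision neighborhood (strict inclusion)."""
    hits = sum(
        neighborhood(i, X, delta) <= decision_neighborhood(i, y, eps)
        for i in range(len(X))
    )
    return hits / len(X)


def tolerant_quality(X, y, delta, eps):
    """Mean overlap ratio between condition and decision neighborhoods.
    Assumed form; the source does not define its contribution degree."""
    total = 0.0
    for i in range(len(X)):
        nb = neighborhood(i, X, delta)  # always contains i, so never empty
        total += len(nb & decision_neighborhood(i, y, eps)) / len(nb)
    return total / len(X)


# Toy data: one condition attribute, one continuous decision value.
X = [(0.0,), (0.1,), (0.2,), (0.9,), (1.0,)]
y = [0.10, 0.12, 0.50, 0.90, 0.95]
print(strict_quality(X, y, delta=0.15, eps=0.1))    # 0.6
print(tolerant_quality(X, y, delta=0.15, eps=0.1))  # about 0.833
```

In this toy case two samples fail strict inclusion because one neighbor has a different decision value, yet most of their neighborhood is still classified consistently; the overlap ratio credits that partial agreement instead of discarding it.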
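
The adaptive-β threshold search in point (3) can be sketched as a loop over candidate thresholds. This is an illustration only: the abstract does not specify the evaluation function, so `toy_evaluate` below is a hypothetical stand-in (summed weight minus a per-attribute penalty), and keeping attributes whose weight is greater than or equal to the candidate threshold is also an assumption.

```python
def adaptive_threshold(weights, evaluate):
    """Try every attribute weight as the threshold, smallest first,
    and keep the candidate whose selected subset scores highest."""
    best = None
    for beta in sorted(set(weights.values())):
        # Assumption: an attribute is kept when its weight is >= beta.
        selected = [a for a, w in weights.items() if w >= beta]
        score = evaluate(selected)
        if best is None or score > best[1]:
            best = (beta, score, selected)
    return best


# Hypothetical importance weights for four attributes.
weights = {"a": 0.40, "b": 0.25, "c": 0.02, "d": 0.01}


def toy_evaluate(selected):
    """Stand-in evaluation: reward total importance, penalize subset size."""
    return sum(weights[a] for a in selected) - 0.05 * len(selected)


beta, score, selected = adaptive_threshold(weights, toy_evaluate)
print(beta, selected)  # 0.25 ['a', 'b']
```

With a fixed threshold of, say, 0.015, attribute `c` would be kept despite contributing almost nothing; the search instead settles on the threshold that drops both low-weight attributes.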
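
The two-stage structure in point (4) can be outlined as two independent searches. The sketch below is not the paper's algorithm: `ga_search` is a deliberately tiny evolutionary search (truncation selection, averaging crossover, Gaussian mutation, elitism), it tunes a single SVM parameter rather than a full parameter set, and the callables `reduct_fitness`, `reduce_attributes`, and `classifier_fitness` are hypothetical placeholders for the paper's fitness functions and reduction step.

```python
import random


def ga_search(fitness, low, high, pop_size=12, generations=20, seed=0):
    """Minimal real-coded evolutionary search over one parameter in [low, high]."""
    rng = random.Random(seed)
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0]]              # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2 + rng.gauss(0, 0.05 * (high - low))
            children.append(min(high, max(low, child)))  # clip to bounds
        pop = children
    return max(pop, key=fitness)


def two_stage(reduct_fitness, reduce_attributes, classifier_fitness):
    """Stage 1 tunes the neighborhood radius without training a classifier;
    stage 2 tunes one classifier parameter on the fixed reduct."""
    delta = ga_search(reduct_fitness, 0.0, 1.0)
    reduct = reduce_attributes(delta)
    c = ga_search(lambda v: classifier_fitness(reduct, v), 0.01, 100.0, seed=1)
    return delta, reduct, c
```

Because stage 1 never calls the classifier, each fitness evaluation there costs only an attribute reduction, which is the source of the running-time saving the abstract reports relative to a one-stage search.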