In the age of big data, as data sets continue to grow, it is critical to select the best subset of features from the original data set in order to obtain the best performance in machine learning tasks. Feature selection is used as a data pre-processing method. It filters redundant information out of the original data set and retains the efficient and useful features, thereby improving the efficiency of data processing. Feature selection based on rough sets, a prevalent dimension reduction method, can effectively extract features whose discrimination ability is close to or even higher than that of the original feature set, and it has been widely used in machine learning and data mining. The traditional feature selection method based on the neighborhood rough set can process numerical attributes directly. However, when calculating the neighborhood, it ignores the fact that features carry different weights and contribute differently to the decision, and the computation is inefficient, which reduces classification accuracy. In addition, existing rough set-based online streaming feature selection algorithms ignore the correlation between features and lack an effective dynamic update mechanism. To solve these problems, this paper studies feature selection based on the neighborhood rough set in depth. The main contributions are as follows:

(1) To address the facts that traditional methods ignore the influence of differently weighted features on the decision and that they are inefficient, feature weights are introduced. The weight of each feature is calculated from its information content, and a weighted similar-neighborhood rough set model is then built. This model can describe both qualitative and quantitative relationships between ordered objects. By combining this model with the PSO algorithm, a new fitness function is defined, and a feature selection optimization algorithm based on the weighted similar-neighborhood rough set is proposed.

(2) To address the problem that traditional neighborhood-based feature selection can lose strongly relevant features, the correlation between features and labels and the correlation between features are introduced. Using these newly defined feature-label and feature-feature correlations, a new neighborhood set is obtained and a correlation neighborhood rough set model is built. Based on this model, an online streaming feature selection algorithm is proposed that consists of online importance analysis and online redundancy analysis.

(3) A disease prediction system is built on the neighborhood rough set feature selection algorithms. The system uses the feature selection algorithms proposed above to assist doctors in diagnosis and treatment and to further enhance the capacity of medical services.
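
To make the idea behind contribution (1) concrete, the following is a minimal sketch, not the algorithm proposed in this paper. It assumes that feature weights are already given, that the neighborhood of a sample is the set of samples within a threshold `delta` under a weighted Euclidean distance, and that a candidate subset is scored by a commonly used fitness form that trades the dependency degree against subset size. The names `weighted_distance`, `dependency`, `fitness`, `delta`, and `alpha`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
import math


def weighted_distance(x, y, features, weights):
    """Weighted Euclidean distance restricted to the chosen feature indices."""
    return math.sqrt(sum(weights[f] * (x[f] - y[f]) ** 2 for f in features))


def dependency(X, y, features, weights, delta):
    """Fraction of samples whose delta-neighborhood contains a single label."""
    if not features:
        return 0.0
    consistent = 0
    for i, xi in enumerate(X):
        # Neighborhood of sample i: all samples within weighted distance delta.
        neighbors = [j for j, xj in enumerate(X)
                     if weighted_distance(xi, xj, features, weights) <= delta]
        # Sample i is in the positive region if every neighbor shares its label.
        if all(y[j] == y[i] for j in neighbors):
            consistent += 1
    return consistent / len(X)


def fitness(X, y, features, weights, delta, alpha=0.9):
    """Assumed fitness: reward high dependency and small feature subsets."""
    n_total = len(X[0])
    gamma = dependency(X, y, features, weights, delta)
    return alpha * gamma + (1 - alpha) * (1 - len(features) / n_total)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[0.10, 0.90], [0.15, 0.20], [0.80, 0.85], [0.90, 0.10]]
y = [0, 0, 1, 1]
weights = [0.7, 0.3]  # assumed to be precomputed from each feature's information content

for subset in ([0], [1], [0, 1]):
    print(subset, dependency(X, y, subset, weights, 0.2),
          fitness(X, y, subset, weights, 0.2))
```

In a PSO-based search, each particle would encode a candidate feature subset and `fitness` would be the quantity to maximize. On the toy data, the single informative feature scores higher than the full feature set because both reach the same dependency degree while the smaller subset is rewarded for its size.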