Pseudo-Label Strategy For Generating Granular Balls And Its Application In Searching Granular Ball Rough Set Based Reduction Quickly

Posted on: 2024-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Chen
GTID: 2568307157952779
Subject: Computer Science and Technology

Abstract/Summary:
As a classic data mining method, attribute reduction obtains a minimal subset of attributes that satisfies given conditions by eliminating redundant or irrelevant attributes and preserving necessary or relevant ones. This helps reduce data size, decrease data noise, and improve the efficiency of data analysis. As a classical approach to attribute reduction, the granular ball rough set overcomes a defect of the neighborhood rough set model, namely the huge time consumption caused by using a grid search strategy to find the optimal neighborhood radius. However, when the data set contains many inconsistent samples, it becomes difficult to generate granular balls with high purity during granular ball computing, which decreases the efficiency of granular ball computing learning models; solving a reduction based on the granular ball rough set can also be extremely time-consuming. To overcome these limitations, and based on the definition of the granular ball model and the steps involved in computing granular balls, this thesis examines the iterative generation of granular balls in depth and puts forward strategies to improve the efficiency of generating them, thereby improving the performance of granular ball computing learning models. In addition, this thesis optimizes the reduction solution based on the granular ball rough set and proposes a fast reduction method based on the granular ball rough set model, effectively reducing the time required to solve a reduction. Specifically, the research content and innovations of this thesis mainly include:

1. An acceleration method for generating pseudo-label granular balls is proposed. In granular ball computing, the generation of granular balls can be regarded as an unsupervised learning process whose termination condition is that the generated granular balls reach a given purity threshold. Purity is calculated from the label information of the samples in a granular ball and is determined by the sample label with the largest proportion in that ball. Therefore, when the data contain a large number of inconsistencies, the label information of the samples themselves can greatly hinder the generation of high-purity granular balls. As a result, more iterations are required for the granular balls to reach the given purity threshold, leading to huge time consumption. To overcome this defect, a pseudo-label strategy is introduced into the process of computing granular balls. Since pseudo-labels can also be generated in an unsupervised way, they better fit the aggregation of samples within a granular ball, which reduces the occurrence of inconsistencies and ultimately speeds up granular ball generation. Comparison experiments on eight benchmark data sets show that, compared with generating granular balls from the original labels, the proposed pseudo-label acceleration method greatly reduces time consumption and improves the time efficiency of generating granular balls.

2. A fast reduction method based on the granular ball rough set is proposed. Attribute reduction based on the granular ball rough set can be divided into two sub-processes. First, the positive region needs to be obtained from the granular balls generated during granular ball computing. Second, the importance of attributes needs to be calculated from the positive region, so that redundant or irrelevant attributes can be eliminated and necessary or relevant attributes retained, thereby achieving attribute reduction. The time consumption of calculating the positive region is mainly affected by the process of generating granular balls; if granular balls can be generated more efficiently, the time needed to calculate the positive region can be reduced. Therefore, this thesis introduces the acceleration method for generating pseudo-label granular balls into the original granular ball rough set model, which improves the time efficiency of calculating the positive region. Since granular balls generated from the pseudo-label information of samples are called pseudo-label granular balls, the concept of a pseudo-label granular ball rough set model is proposed. A forward greedy search method is designed for attribute reduction based on this model, in which attribute importance serves as the criterion for selecting attributes. Comparison experiments on twelve benchmark data sets show that the proposed fast reduction method based on the granular ball rough set not only effectively improves the time efficiency of solving a reduction, but also ensures that the attributes in the reduction set retain considerable classification ability.
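The purity-driven generation described in contribution 1 can be illustrated with a minimal sketch. It is not the thesis's implementation: the 2-means splitting rule, the use of cluster ids from one unsupervised pass as pseudo-labels, and all names (`purity`, `two_means`, `generate_balls`, `threshold`, `min_size`) are assumptions made for illustration. The sketch splits a ball until the most frequent label inside it reaches the purity threshold, so that the number of splits under original and pseudo-labels can be compared.

```python
import numpy as np

def purity(labels):
    """Share of the most frequent label among the samples of one ball."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / counts.sum()

def two_means(X, n_iter=10, seed=0):
    """Plain 2-means (assumed splitting rule); True marks the first cluster."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(axis=0)
    return assign == 0

def generate_balls(X, labels, threshold=1.0, min_size=2):
    """Split balls (index sets) until each one reaches the purity threshold.

    Returns the list of balls and the number of splits performed.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    pending = [np.arange(len(X))]
    balls, n_splits = [], 0
    while pending:
        idx = pending.pop()
        if len(idx) <= min_size or purity(labels[idx]) >= threshold:
            balls.append(idx)
            continue
        mask = two_means(X[idx])
        if mask.all() or not mask.any():
            balls.append(idx)  # split failed to separate anything; keep as is
            continue
        n_splits += 1
        pending.extend([idx[mask], idx[~mask]])
    return balls, n_splits

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two well-separated groups whose original labels are partly inconsistent.
    X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    y[rng.choice(100, size=10, replace=False)] ^= 1  # flip 10 labels
    # Assumption: pseudo-labels are cluster ids from one unsupervised pass.
    pseudo = two_means(X).astype(int)
    _, splits_original = generate_balls(X, y)
    _, splits_pseudo = generate_balls(X, pseudo)
    print("splits with original labels:", splits_original)
    print("splits with pseudo-labels:  ", splits_pseudo)
```

In this toy setting the pseudo-labels agree with the geometry of the samples, so the purity threshold is met after at most one split, while the inconsistent original labels force further splitting. How the thesis actually derives its pseudo-labels is not stated in the abstract.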
Keywords/Search Tags: Granular ball, Pseudo label, Rough set, Attribute reduction