Research On Attribute Reduction And Classification Methods Based On Local Tags

Posted on: 2024-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
GTID: 2568307154496064
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, global data volume has grown explosively, and we have entered the era of big data. Storing and analyzing massive data poses many challenges, yet the value contained in that data is considerable, which has made the topic a focus of academic research. Processing these data efficiently and extracting as much value as possible is therefore particularly important, and feature selection offers one way forward. Feature selection, also known as feature subset selection or attribute selection, is a data preprocessing technique that reduces data dimensionality and can improve the performance of learning algorithms: higher accuracy, lower learning cost, and better model interpretability. It is widely used in machine learning, pattern recognition, and many other fields.

Attribute reduction selects attributes from a dataset according to specific rules. Redundant and irrelevant condition attributes are eliminated under given constraints, and a minimal attribute subset (a reduct) that satisfies those constraints is obtained to serve the subsequent learning task. Although attribute reduction algorithms can achieve dimensionality reduction effectively, they still have shortcomings in practice. For example, when data sources and data composition are complex and constantly changing, traditional attribute reduction algorithms and learners cannot handle such data well, and the final result falls far short of the intended goal.

To address the shortcomings of traditional attribute reduction algorithms and traditional classifiers, this thesis draws on the hierarchical cognitive pattern that humans use when thinking about and solving complex problems. First, label information is used to partition the dataset, and local and global perspectives are combined to further improve the performance of subsequent learners. Second, starting from the learner itself, the strengths and weaknesses of traditional learners under different data conditions are analyzed, and a learner suited to complex and diverse data is constructed that improves performance while producing relatively stable results. Specifically, the research content and methods of this thesis focus on the following two points:

1. A reduction solution based on local label information
Current attribute reduction strategies give little consideration to labels. Label information is crucial for learning tasks, and attribute reduction is a preprocessing step in data mining whose result must ultimately serve the relevant learning task. Dividing the entire dataset into local datasets by label provides a local perspective from which to examine the relationship between attribute reduction and local labels. Constraints based on local labels yield corresponding local reductions. Because these reductions are obtained on differently labeled datasets, they make fuller use of the label information and may assist subsequent learning tasks.

2. Classification method based on sample distribution
Previous research has mostly used processed, idealized datasets. Learning models trained on such datasets give satisfactory results, but data of this kind almost never exists in real life; it represents an idealized state that is difficult to achieve. When a model trained on artificially processed, idealized data encounters real-world data, its performance drops significantly, which reduces its generalization ability and its practical value. Therefore, by studying traditional classification algorithms, this thesis combines their respective advantages in handling different sample distributions to better cope with the more complex and diverse data found in real life.
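The abstract does not state the constraint used for local reductions, so the following is only a minimal sketch of the general idea in point 1, under stated assumptions: a rough-set positive region is used as the constraint, a forward greedy search finds a reduct, and each "local" view is built by relabeling the data one label versus the rest. The function names (`positive_region_size`, `greedy_reduct`, `local_reducts`) and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

def positive_region_size(rows, labels, attrs):
    """Count samples whose equivalence class under `attrs` has a single label."""
    blocks = defaultdict(list)
    for row, y in zip(rows, labels):
        blocks[tuple(row[a] for a in attrs)].append(y)
    return sum(len(ys) for ys in blocks.values() if len(set(ys)) == 1)

def greedy_reduct(rows, labels, n_attrs):
    """Forward greedy search (an assumed strategy): keep adding the attribute
    that enlarges the positive region most, until it matches the positive
    region of the full attribute set."""
    target = positive_region_size(rows, labels, range(n_attrs))
    chosen = []
    while positive_region_size(rows, labels, chosen) < target:
        best = max(
            (a for a in range(n_attrs) if a not in chosen),
            key=lambda a: positive_region_size(rows, labels, chosen + [a]),
        )
        chosen.append(best)
    return chosen

def local_reducts(rows, labels, n_attrs):
    """One reduct per label, using a one-vs-rest relabeling (an assumption,
    not necessarily the thesis's definition of a local label)."""
    return {
        c: greedy_reduct(rows, [y == c for y in labels], n_attrs)
        for c in sorted(set(labels))
    }

# Toy decision table: three condition attributes, three labels.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 0)]
labels = ["a", "a", "b", "c", "b", "c"]

global_reduct = greedy_reduct(rows, labels, 3)
per_label = local_reducts(rows, labels, 3)
```

On this toy table the global reduct needs attributes 0 and 1, while the local reduct for label "a" needs only attribute 0, which illustrates how a label-specific view can yield a smaller attribute subset than the global one.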
Keywords/Search Tags:Attribute Reduction, Classification, Local Label, Rough Set