
The Research On Multi-label Feature Selection Algorithm Based On Rough Set

Posted on: 2022-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Li
GTID: 2518306509970219
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, data volumes keep growing, which gives rise to various problems. Data mining is the process of extracting information from large amounts of data. High-dimensional data often contain redundant or irrelevant information and therefore require preprocessing, and feature selection is currently a common method for this. Multi-label data is widespread in the real world, and more and more studies address multi-label problems. Because the presence of multiple labels makes multi-label data more prone to high dimensionality, multi-label feature selection is an important preprocessing step in multi-label learning. This thesis explores multi-label feature selection in depth using rough sets as a tool. The main research is as follows.

(1) Based on the fuzzy rough set model, the co-occurrence and mutual-exclusion relationships between sample labels are used to evaluate the co-occurrence of samples under the label set. From this, the importance of labels is obtained and the similarity of samples is calculated. The fuzzy mutual information between features and labels is defined using this relationship, and a multi-label feature selection algorithm based on the co-occurrence relationship of labels is designed by combining the principles of maximum relevance and minimum redundancy. Experiments on five publicly available datasets show the effectiveness of the proposed algorithm.

(2) For the problem of feature groups arriving incrementally, a new positive-region measure is proposed that combines the neighborhood rough set with the supervised information of the label set. Using this measure of importance, which considers both the relevance between features and labels and the redundancy among features, a new streaming feature selection algorithm for incremental feature groups is designed. The method can reduce data dimensionality and speed up learning.

The results of this thesis further enrich research on multi-label feature selection, with a focus on the relationships between labels.
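The maximum-relevance, minimum-redundancy principle mentioned in (1) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it substitutes ordinary discrete mutual information for the fuzzy mutual information defined in the thesis, ignores label co-occurrence weighting, and the names `mutual_information` and `mrmr_select` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Discrete mutual information I(X;Y), in bits, of two equal-length sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy max-relevance, min-redundancy selection (illustrative only).

    features: dict mapping feature name -> list of discrete values per sample
    labels:   list of label columns, each a list of 0/1 values per sample
    k:        number of features to select
    """
    # Relevance of a feature: summed mutual information with every label column.
    relevance = {
        name: sum(mutual_information(col, lab) for lab in labels)
        for name, col in features.items()
    }
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            # Redundancy: mean mutual information with already selected features.
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    toy_features = {
        "f1": [0, 0, 1, 1],
        "f2": [0, 0, 1, 1],  # exact duplicate of f1, hence fully redundant
        "f3": [0, 1, 0, 1],
    }
    toy_labels = [[0, 0, 1, 1], [0, 1, 0, 1]]
    print(mrmr_select(toy_features, toy_labels, 2))
```

In this toy data all three features are equally relevant, but once `f1` is chosen the duplicate `f2` is penalized for redundancy, so `f3` is selected second.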
Keywords/Search Tags:Multi-label, rough set, feature selection, streaming feature selection