
Research And Improvement Of Attribute Weighted Naive Bayes Classification Algorithm

Posted on: 2024-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhou
GTID: 2568307109471044
Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology, big data, and artificial intelligence, data has become a core production factor of the new era, laying a foundation for the digital economy and Digital China and becoming an important resource at the national strategic level. Data mining, as a decision-support technology, analyzes patterns and rules within data to extract potentially valuable information from massive data sets. Classification, a core branch of data mining, performs prediction and classification tasks by analyzing real-world information and extracting its valuable elements. The Naive Bayes classification algorithm has attracted widespread attention and application because it is simple, easy to implement, and computationally efficient.

Attribute weighting algorithms can effectively alleviate the impact of the attribute independence assumption of the Naive Bayes classification algorithm. Most filter-based attribute weighting algorithms use a single indicator to represent data features. However, real data distributions are complex, so a single indicator often cannot accurately capture the classification information in the data. To address this issue, this paper proposes an Adaptive Two-index Fusion Attribute Weighted Naive Bayes Algorithm (ATFNB). For a given dataset, the algorithm constructs two sets of data association feature indicators, CA and AA, and selects one indicator from each set to form the fusion indicator. To ensure optimal performance of the fusion indicator, a tuning factor β is introduced, and a Range Query Region Filtering (RQRF) method is proposed to determine the value of β quickly. The attribute weights are obtained by adjusting the ratio of the two indicators according to the tuning factor, and a weighted Bayesian classifier is then generated from these weights. Experimental results on UCI datasets and the Flavia leaf dataset show that the proposed algorithm achieves higher classification accuracy without additional time cost.

ATFNB considers the classification performance of the Bayesian classifier in a single-view space, but real-world data takes multiple forms, and extracting different views of the data yields more classification information. Based on this idea, this paper proposes a MultiView Gradient Blending Weighted Naive Bayes Algorithm (MGBWNB). KNN and SPODEs label views are constructed from the predicted values of instances, and the original view is combined with the two label views to construct an augmented view. To avoid model overfitting, a new loss function and a gradient-based view fusion strategy are used; view performance is adjusted in a cyclic self-supervised manner and reflected in the weights. Experimental results on UCI datasets show that the proposed algorithm significantly improves classification performance compared with existing weighted algorithms, and ablation experiments further demonstrate the feasibility of the algorithm.

In summary, attribute-weighted Naive Bayes algorithms assign a different weight to each attribute in a dataset and thereby reduce the impact of the attribute independence assumption in Naive Bayes. They retain the simplicity, ease of implementation, and efficiency of the Naive Bayes classifier while improving its accuracy and reliability.
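
To make the attribute-weighting idea concrete, the following is a minimal Python sketch and not the thesis implementation. It assumes categorical attributes, Laplace smoothing, and a convex combination `beta * ca + (1 - beta) * aa` of two per-attribute indicator scores; the abstract does not give the actual CA and AA indicators, the exact fusion rule, or the RQRF search, so `fuse_weights`, `WeightedNaiveBayes`, and the toy scores below are hypothetical names and values chosen only for illustration. Each attribute's log-likelihood is scaled by its weight, which is the usual form of attribute-weighted Naive Bayes.

```python
import math
from collections import Counter, defaultdict


def fuse_weights(ca, aa, beta):
    """Blend two per-attribute indicator scores with a tuning factor beta.

    ASSUMPTION: a convex combination is used here for illustration; the
    thesis abstract does not state the exact fusion formula.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * c + (1.0 - beta) * a for c, a in zip(ca, aa)]


class WeightedNaiveBayes:
    """Categorical Naive Bayes; attribute i's log-likelihood is scaled by weights[i]."""

    def __init__(self, weights, alpha=1.0):
        self.weights = list(weights)
        self.alpha = alpha  # Laplace smoothing constant (an assumption)

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n = len(y)
        # counts[i][c][v]: training rows of class c whose attribute i equals v
        self.counts = [defaultdict(Counter) for _ in self.weights]
        self.values = [set() for _ in self.weights]
        for row, label in zip(X, y):
            for i, v in enumerate(row):
                self.counts[i][label][v] += 1
                self.values[i].add(v)
        return self

    def log_scores(self, row):
        """Return unnormalised log-posterior scores for each class."""
        scores = {}
        k = len(self.class_counts)
        for c, n_c in self.class_counts.items():
            s = math.log((n_c + self.alpha) / (self.n + self.alpha * k))
            for i, v in enumerate(row):
                num = self.counts[i][c][v] + self.alpha
                den = n_c + self.alpha * len(self.values[i])
                s += self.weights[i] * math.log(num / den)
            scores[c] = s
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Toy data: attribute 0 determines the class, attribute 1 is noise.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
y = ["no", "no", "yes", "yes"]

# Hypothetical indicator scores, made up for this example.
weights = fuse_weights(ca=[1.0, 0.0], aa=[0.5, 0.5], beta=0.8)
model = WeightedNaiveBayes(weights).fit(X, y)
print(weights, model.predict(("sunny", "hot")))
```

In this sketch, setting every weight to 1 recovers ordinary Naive Bayes, and a weight of 0 removes an attribute from the decision. Choosing β would need a search over candidate values scored on held-out data; the RQRF method named in the abstract is not reproduced here.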
Keywords/Search Tags:Data mining, Naive Bayes classification algorithm, Attribute weighting, View fusion