
Link Prediction Methods Based On Machine Learning

Posted on: 2023-12-23
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Jiao
Full Text: PDF
GTID: 2530307091987239
Subject: Mathematics
Abstract/Summary:
Predicting missing links in the current network and possible new links in the future network is an important task in network analysis and modeling. Link prediction is therefore one of the important research topics in complex networks, with significant theoretical and practical value. Link prediction methods are mainly based on either network structure or machine learning. Compared with node information, the network structure is easier to obtain and more reliable, while machine learning methods achieve higher prediction accuracy. In this thesis, the network-structure-based and machine-learning-based methods are each improved, and corresponding link prediction methods are proposed. The main contributions are as follows:

(1) A fusion link prediction algorithm based on a non-equilibrium resource allocation index is proposed. Because large and small nodes influence the network to different degrees, the relationships between nodes in the network are unbalanced. To account for this, the network heterogeneity index H is introduced into the RA index and a factor for paths of length three is constructed, yielding the NRA index. In addition, because network structure differs from network to network, the performance of similarity indices varies greatly across networks; the NRA index and the LP index are therefore fused to construct the ANL index. Comparative experiments are carried out on six real networks. The AUC and precision results demonstrate the stability and effectiveness of the NRA and ANL indices.

(2) A link prediction algorithm based on a Stacking ensemble learning model is proposed. Twelve indices are selected from the global, local and quasi-local similarity indices to form the feature set. Because the network is sparse, the feature set is down-sampled, and the Stacking ensemble learning model is trained on the training set. The lower level consists of three base classifiers, namely logistic regression, Gradient Boosting Decision Tree (GBDT) and XGBoost, and their outputs are integrated by an XGBoost model in the upper level to obtain the SELLP model. Extensive experiments are carried out on six real networks. By comparing evaluation metrics, the effectiveness of the model and its base classifiers is analyzed, and the ensemble learning algorithm is shown to improve overall prediction accuracy.

(3) A link prediction algorithm using a Stacking ensemble learning model based on RF-RFE feature selection is proposed. Recursive feature elimination with random forest (RF-RFE) is used to select latent structural features of the network, and a two-level Stacking ensemble model is constructed with logistic regression, XGBoost and GBDT as the base classifiers and XGBoost as the second-level classifier, giving the RF-RFE-SELLP model. Extensive experiments are carried out on six networks, in which the RF-RFE-SELLP model is compared with commonly used machine learning algorithms and similarity index algorithms on the evaluation metrics. The results show that the RF-RFE-SELLP model achieves higher prediction accuracy, applicability and robustness.
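
For readers unfamiliar with the similarity indices named in contribution (1), the following is a minimal sketch of the standard RA and LP indices and of the comparison-based AUC metric. It does not reproduce the thesis's NRA or ANL indices, whose definitions are not given in this abstract. The toy graph `GRAPH`, the function names, and the default `eps=0.01` are illustrative assumptions only.

```python
from itertools import combinations

# Hypothetical toy undirected graph as an adjacency dict (not thesis data).
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def ra_index(graph, x, y):
    """Resource Allocation: sum of 1/degree(z) over common neighbours z."""
    return sum(1.0 / len(graph[z]) for z in graph[x] & graph[y])


def count_walks(graph, x, y, length):
    """Number of walks with exactly `length` steps from x to y, i.e. (A^length)[x][y]."""
    counts = {x: 1}
    for _ in range(length):
        nxt = {}
        for node, c in counts.items():
            for nb in graph[node]:
                nxt[nb] = nxt.get(nb, 0) + c
        counts = nxt
    return counts.get(y, 0)


def lp_index(graph, x, y, eps=0.01):
    """Local Path: (A^2)[x][y] + eps * (A^3)[x][y]; eps is an assumed weight."""
    return count_walks(graph, x, y, 2) + eps * count_walks(graph, x, y, 3)


def auc(scores_missing, scores_nonexistent):
    """Probability that a held-out (missing) link outscores a nonexistent one.

    Ties count as half a win; all pairs are compared exhaustively.
    """
    wins = ties = 0
    for m in scores_missing:
        for n in scores_nonexistent:
            if m > n:
                wins += 1
            elif m == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(scores_missing) * len(scores_nonexistent))


# Rank all currently unconnected node pairs by RA score, highest first.
candidates = [(x, y) for x, y in combinations(sorted(GRAPH), 2) if y not in GRAPH[x]]
ranked = sorted(candidates, key=lambda p: ra_index(GRAPH, *p), reverse=True)
```

In an experiment of the kind described above, a fraction of the observed links would be held out as the "missing" set, each index would score the held-out and nonexistent pairs, and `auc` would summarize how well the index separates them.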
Keywords/Search Tags:Complex network, link prediction, network structure, machine learning, Stacking